hitvoice / DrQA

A PyTorch implementation of "Reading Wikipedia to Answer Open-Domain Questions".

Does it require GPU acceleration? #11

Closed augmen closed 6 years ago

augmen commented 6 years ago

Hi, does it require GPU acceleration, like the GPU build of PyTorch? Can we run it on CPUs only? How many cores and how much RAM are required to run it?

hitvoice commented 6 years ago

This project is meant to be CPU-compatible, but training a new model on CPUs only will be very slow. The model is small, so 4 GB of RAM will be sufficient.
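To illustrate the CPU-compatibility point: the standard PyTorch pattern is to select the GPU when one is available and fall back to the CPU otherwise. This is a generic sketch, not DrQA's actual training script (the repo's own scripts expose their own flags for this):

```python
import torch

# Generic PyTorch device-selection pattern (not DrQA's actual code):
# use the GPU if one is visible, otherwise run on CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in for the real reader model, just to show the .to(device) idiom.
model = torch.nn.Linear(128, 2).to(device)

# Inputs must live on the same device as the model.
x = torch.randn(4, 128, device=device)
logits = model(x)
print(logits.shape)
```

The same code runs unchanged on both CPU and GPU machines; only the speed differs, which is why CPU-only training works but takes days.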

augmen commented 6 years ago

GPUs are a bit costly to run. How about 16 GB of RAM and 8 CPU cores?

hitvoice commented 6 years ago

It is possible to train a model that way, but it may take several days. If your budget is limited, I personally recommend renting GPU servers from Paperspace or Amazon AWS.

augmen commented 6 years ago

OK, done with Paperspace. How about developing a production-level application? Is it production ready?

augmen commented 6 years ago

I guess the model is not trained on Wikipedia? How about adding new QA data for training and a production-grade release? What formats are to be used?

hitvoice commented 6 years ago

The same format as SQuAD will be fine. It is not production ready, but it would be very easy to turn "interact.py" into a web service, embed the model in a larger system, and so on.
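For reference, a minimal SQuAD-v1.1-style training record looks like the sketch below. The titles, questions, and IDs here are made-up illustrations; only the field layout matters. `answer_start` is the character offset of the answer inside `context`:

```python
import json

# Minimal SQuAD-v1.1-style record (illustrative values only).
squad_example = {
    "version": "1.1",
    "data": [
        {
            "title": "Example article",
            "paragraphs": [
                {
                    "context": "DrQA reads Wikipedia to answer open-domain questions.",
                    "qas": [
                        {
                            "id": "q1",
                            "question": "What does DrQA read?",
                            "answers": [
                                # "answer_start" is the character offset
                                # of the answer text within "context".
                                {"text": "Wikipedia", "answer_start": 11}
                            ],
                        }
                    ],
                }
            ],
        }
    ],
}

# Sanity check: the offset must point exactly at the answer text.
ctx = squad_example["data"][0]["paragraphs"][0]["context"]
ans = squad_example["data"][0]["paragraphs"][0]["qas"][0]["answers"][0]
span = ctx[ans["answer_start"]:ans["answer_start"] + len(ans["text"])]
print(span)  # Wikipedia

# The file on disk is just this structure serialized as JSON.
print(json.dumps(squad_example)[:30])
```

Custom data in this shape can be dropped in wherever the SQuAD training file is expected.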

augmen commented 6 years ago

What changes could be made to DrQA for production purposes? I mean, how do we turn "interact.py" into a web service, or embed the model in a mobile app? Can we add the paragraph-retriever functionality as well? Can it handle a large number of queries, like 1000 per second?

hitvoice commented 6 years ago

That's beyond the scope of this project. If the work is beyond what you can handle, please hire someone who can do it for you.

augmen commented 6 years ago

Any suggestions on whom to hire, or where to find such people?

hitvoice commented 6 years ago

This largely depends on where you live. My experience only applies to hiring in mainland China and may not be helpful in other countries.

augmen commented 6 years ago

OK.

augmen commented 6 years ago

So you have used the single.mdl model? Thanks for the guidance.

hitvoice commented 6 years ago

What is single.mdl?

augmen commented 6 years ago

Would you like to contribute to the project? We can pay you if you want. single.mdl / multitask.mdl are the models used for prediction, I guess.

hitvoice commented 6 years ago

I have a full-time job and I'm afraid I won't have time for this. I'll close this issue now.

niimi1996 commented 5 years ago

Can you please tell me what to do in order to generate longer answers?

hitvoice commented 5 years ago

Hi @niimi1996, there's no easy way to do this. You can add constraints to the decoding process, for example, filtering out answers of three words or fewer and choosing the highest-ranked remaining one, but that will affect answer quality. Training on datasets with long answers would help, but custom datasets are costly to build.
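That post-filtering idea can be sketched as follows. This is a standalone illustration, not DrQA's actual decoding code; the `(answer_text, score)` pairs stand in for the top-k spans a reader model would return:

```python
def pick_long_answer(candidates, min_words=4):
    """Drop candidate answers shorter than `min_words` words and
    return the highest-scoring remaining answer (or None if all
    candidates were too short).

    `candidates`: list of (answer_text, score) pairs, e.g. the
    top-k spans from a reading-comprehension model's decoder.
    """
    long_enough = [(text, score) for text, score in candidates
                   if len(text.split()) >= min_words]
    if not long_enough:
        return None
    return max(long_enough, key=lambda pair: pair[1])[0]


candidates = [
    ("Paris", 0.91),                  # top-ranked but only one word
    ("the capital of France", 0.62),  # four words: passes the filter
    ("France", 0.40),
]
print(pick_long_answer(candidates))  # the capital of France
```

Note the trade-off the answer above mentions: the single highest-scoring span ("Paris") is discarded purely for being short, so the surviving answer may be lower quality.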