Maluuba / newsqa

Tools for using Maluuba's NewsQA Dataset (public version)
https://www.microsoft.com/en-us/research/project/newsqa-dataset/

What's Next??? #33

Open mady143 opened 4 years ago

mady143 commented 4 years ago

Hi @temporaer @ksuleman @tsendeemts @MarcCote @kracwarlock, I ran your code as per the instructions in the README.md file, i.e., python maluuba/newsqa/data_generator.py and then python -m unittest discover.

Now I want to know the next step, i.e., how to give an input text file and get questions and answers back, or any other way of getting questions with answers.

Thanks and Regards, Manikantha Sekhar

Happy Coding...

juharris commented 4 years ago

Thanks for your interest. The purpose of this repository is mainly to build and use the NewsQA dataset. I'm not sure of the best place to find our question answering/generation models. Maybe @xingdi-eric-yuan, @wangtong106, or @trischler can also help or at least link to relevant papers.

mady143 commented 4 years ago

Hi @juharris, thanks for the reply. Actually, I am also looking for a way to automatically generate questions and answers from a given document, so could you help with this?

Thank you...

xingdi-eric-yuan commented 4 years ago

Hi @mady143, Usually people either generate answers given (document, question), or questions given (document, answer). If you want to generate both, maybe first look into some keyphrase generation/extraction work. Conditioning on a document and its keyphrases, questions can be generated.

Both keyphrase generation and question generation have quite large communities. I believe Google will return quite a few papers on these topics.
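To make the two-stage idea concrete (first pick keyphrases, then generate a question conditioned on the document and each keyphrase), here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the function names are hypothetical, the "keyphrase extractor" is just a frequency count over non-stopword tokens, and the "question generator" only blanks the keyphrase out of a sentence. It is not the method behind any published model and not part of this repository; real systems would replace both stages with learned models.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real systems use larger lists).
STOPWORDS = {
    "the", "a", "an", "of", "in", "on", "and", "to", "is", "was", "by", "for",
    "with", "that", "it", "as", "at", "its", "this", "from", "were", "are",
    "has", "have", "had", "be", "been", "or",
}


def split_sentences(document):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]


def extract_keyphrases(document, top_k=3):
    """Hypothetical stand-in for a keyphrase extractor.

    Returns the top_k most frequent single-word tokens that are not stopwords.
    """
    tokens = re.findall(r"[A-Za-z][A-Za-z'-]*", document.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS and len(t) > 2)
    return [word for word, _ in counts.most_common(top_k)]


def generate_cloze_questions(document, top_k=3):
    """Stage 2: for each keyphrase, blank it out in the first sentence containing it.

    Returns a list of (question, answer) pairs.
    """
    questions = []
    sentences = split_sentences(document)
    for phrase in extract_keyphrases(document, top_k):
        pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
        for sentence in sentences:
            if pattern.search(sentence):
                # Replace only the first occurrence so the rest of the sentence stays intact.
                questions.append((pattern.sub("_____", sentence, count=1), phrase))
                break
    return questions


if __name__ == "__main__":
    text = (
        "The dataset contains news articles. "
        "Crowdworkers wrote questions about the articles. "
        "Each question has an answer span in the articles."
    )
    for question, answer in generate_cloze_questions(text, top_k=2):
        print(question, "->", answer)
```

A sketch like this only produces fill-in-the-blank items, which is about the simplest form of question generation. The split into an answer-selection step and a question-writing step is the part that carries over to stronger models.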

mady143 commented 4 years ago

Hi @xingdi-eric-yuan, Actually, I first tried the TextBlob package in Python. It was really helpful for generating questions and answers, as well as multiple-choice and fill-in-the-blank items. However, I am facing a challenge: I have been unable to increase the complexity of the questions I generate. It produces very simple questions with simple answers that the end user can easily identify. Could you help with this?

Thanks and Regards, Manikantha Sekhar...

xingdi-eric-yuan commented 4 years ago

To me, complex question generation is a way of evaluating reading comprehension (for both humans and machines). Therefore I don't think automatic question generation is a solved problem. As you said, most work generates simple questions (as simple as paraphrasing a sentence in the document as a question). The main reason is that machines only comprehend language in a shallow manner.

There are lines of work trying to increase the complexity. For instance, some work combines multiple questions into a "multi-hop" question (e.g., HotpotQA); other work asks follow-up questions about a dialog-style document (e.g., ShARC). Furthermore, thinking broadly, one can turn a QA task into other NLP tasks like summarization by asking questions like "what does this passage mainly say?"; this will definitely increase the difficulty level.