parth126 / IT550

Project Proposals for the IT-550 Course (Autumn 2024)

Automated Question Generation for Enhanced Learning #3

Open SiddharthKadam opened 2 months ago

SiddharthKadam commented 2 months ago

Title

Automated Question Generation for Enhanced Learning

Team Name

Chaotic Noobs

Email

202318015@daiict.ac.in

Team Member 1 Name

Siddharth Kadam

Team Member 1 Id

202318015

Team Member 2 Name

Taruna Mati

Team Member 2 Id

202318045

Team Member 3 Name

Ananya Adarsh

Team Member 3 Id

202318027

Team Member 4 Name

Asma Narmawala

Team Member 4 Id

202318025

Category

Reproducibility

Problem Statement

Develop a system for Automatic Question Generation (AQG) using Natural Language Processing (NLP) techniques to generate questions from a given text. The system will aim to generate a variety of question types (e.g., who, what, where, when, why, how), focusing on the semantic and syntactic analysis of sentences to ensure relevant and grammatically correct questions. The key challenge will be to improve the system's ability to handle complex sentence structures and enhance the quality of generated questions in terms of relevance and fluency.
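
As a rough illustration of the intended pipeline, here is a minimal sketch of answer-aware question generation with a pretrained seq2seq model. The checkpoint name and its `<hl>` answer-highlighting convention are assumptions based on a community T5 model fine-tuned for QG on SQuAD, not a committed design choice:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Assumed checkpoint: a community T5 model fine-tuned for question
# generation on SQuAD. Any equivalent seq2seq QG checkpoint would be
# used the same way.
model_name = "valhalla/t5-base-qg-hl"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# This checkpoint's convention: wrap the answer span in <hl> tokens and
# prefix the passage with "generate question: ".
text = ("generate question: <hl> SQuAD <hl> is a reading comprehension "
        "dataset built from Wikipedia articles.")
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```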

Evaluation Strategy

METEOR, BERTScore

Dataset

Dataset Name: SQuAD 1.0
Dataset Link: https://rajpurkar.github.io/SQuAD-explorer/
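
For convenience, a minimal sketch of loading this dataset via the Hugging Face `datasets` library (one of several ways to obtain SQuAD; the hosted `squad` dataset is v1.1, and this assumes the `datasets` package is installed):

```python
from datasets import load_dataset

squad = load_dataset("squad")        # SQuAD v1.1: 'train' and 'validation' splits
example = squad["train"][0]
print(example["context"][:100])      # source passage (input for question generation)
print(example["question"])           # gold question (reference for evaluation)
print(example["answers"])            # answer spans with character offsets
```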

Resources

Paper Title: ParaQG: A System for Generating Questions and Answers from Paragraphs
Paper Link: https://arxiv.org/abs/1909.01642

parth126 commented 2 months ago

Suggested Changes:

  1. Narrow down to one paper instead of three. Choose the one that is realistic and easiest to implement given the computational constraints.
  2. Make sure the selected paper uses the dataset linked above (or has a dataset of its own that is publicly available).
  3. There needs to be one objective evaluation strategy. All suggested ones are subjective and require human involvement.

Please try to address these points by tomorrow.

SiddharthKadam commented 2 months ago

I have updated the proposal to address points 1 and 2. Regarding the evaluation, I have a doubt and have proposed a strategy in the attached file. Let me know if you'd like further adjustments!

Eval_doubt

parth126 commented 2 months ago

Evaluation needs some more thought. Two questions can have very similar meanings without any lexical overlap, in which case your approach would penalize them unfairly. At the other extreme, "what is an iPhone?" and "is iPhone a what?" have near-perfect lexical similarity, but the latter is not a useful question. You should look for a more robust evaluation metric. Also, the proposed paper, although good, is a demo paper, so it reports no evaluation. Did either of the other two papers conduct an evaluation and report results? If yes, you can get some idea from them about how to perform the evaluation.
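
(A toy illustration of the failure modes described above, using plain unigram overlap; the helper below is hypothetical, written only to make the point concrete:)

```python
# Bag-of-words F1 between a reference question and a candidate.
def unigram_f1(reference: str, candidate: str) -> float:
    ref = set(reference.lower().rstrip("?").split())
    cand = set(candidate.lower().rstrip("?").split())
    overlap = len(ref & cand)
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

print(unigram_f1("what is an iPhone?", "is iPhone a what?"))   # ~0.75: high score, useless question
print(unigram_f1("what is an iPhone?", "define the iPhone"))   # ~0.29: low score, same intent
```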

SiddharthKadam commented 2 months ago

One possible approach is to use METEOR (Metric for Evaluation of Translation with Explicit Ordering).

Reference paper: https://link.springer.com/article/10.1007/s13748-023-00295-9

For further clarification, you can refer to the explanation video here: https://www.youtube.com/watch?v=FqQbrlEh_b0&t=421s
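
For reference, a minimal sketch of computing METEOR with NLTK (assuming `nltk` and its WordNet data are installed; NLTK's `meteor_score` expects pre-tokenized input, and the example sentences are placeholders):

```python
import nltk
from nltk.translate.meteor_score import meteor_score

# METEOR matches stems and WordNet synonyms, so it needs these corpora.
nltk.download("wordnet", quiet=True)
nltk.download("omw-1.4", quiet=True)

reference = "what is an iPhone ?".split()   # gold question (placeholder)
candidate = "what is the iPhone ?".split()  # generated question (placeholder)

# meteor_score takes a list of tokenized references and one tokenized hypothesis.
print(meteor_score([reference], candidate))
```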

parth126 commented 2 months ago

METEOR is a good metric. You might also want to look at BERTScore.
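
A minimal sketch of computing it with the `bert-score` package (installable via `pip install bert-score`; the candidate/reference questions below are placeholders):

```python
from bert_score import score

candidates = ["What is the iPhone?"]   # generated questions (placeholders)
references = ["What is an iPhone?"]    # gold questions (placeholders)

# Returns per-sentence precision/recall/F1 computed from contextual embeddings,
# so paraphrases score well even without lexical overlap.
P, R, F1 = score(candidates, references, lang="en", verbose=False)
print(f"BERTScore F1: {F1.mean().item():.4f}")
```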

I think the proposal is in a good shape now. I am accepting it.