Open NeXuSuss opened 2 months ago
New Problem Statement
Title Enhance Information Retrieval and Summarization Accuracy in Biomedical Literature Systems
Category Reproducibility
Problem Statement We are facing challenges in efficiently retrieving relevant information, summarizing content, and providing accurate answers from large biomedical datasets like CORD-19, which contains over 400,000 articles. Because the data is unstructured, the current system fails to understand complex biomedical terms, which results in poor search quality.
To address this issue, we need to develop a more efficient system that improves information retrieval, summarization, and question-answering capabilities on the dataset.
Evaluation Strategy
Metrics:
Precision: Measures whether the top-ranked sentence contains the relevant answer.
Recall: Evaluates the presence of relevant sentences within the top three ranked results.
Mean Reciprocal Rank (MRR): Assesses the average reciprocal rank of the first relevant sentence across all queries.
Summarization F1 Score: Balances precision and recall for the generated summaries against reference summaries.
QA Accuracy: Determines the correctness of the answers provided by the QA system compared to ground truth answers.
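The three retrieval metrics listed above (Precision, Recall, MRR) could be computed roughly as follows. This is only a minimal sketch, not the team's implementation. It assumes each query's result is given as a ranked list of booleans (`True` where a ranked sentence is relevant), and it reads "Recall" as the fraction of queries with at least one relevant sentence in the top three. The function names and the `example_runs` data are hypothetical.

```python
from typing import Sequence

# Assumed input format: one ranked list per query, best-ranked sentence first,
# where each entry is True if that sentence is relevant to the query.
Runs = Sequence[Sequence[bool]]


def precision_at_1(runs: Runs) -> float:
    """Fraction of queries whose top-ranked sentence is relevant."""
    if not runs:
        raise ValueError("at least one query is required")
    return sum(1 for run in runs if run and run[0]) / len(runs)


def recall_at_k(runs: Runs, k: int = 3) -> float:
    """Fraction of queries with a relevant sentence in the top k (assumed reading)."""
    if not runs:
        raise ValueError("at least one query is required")
    return sum(1 for run in runs if any(run[:k])) / len(runs)


def mean_reciprocal_rank(runs: Runs) -> float:
    """Average of 1/rank of the first relevant sentence; 0 for a query with none."""
    if not runs:
        raise ValueError("at least one query is required")
    total = 0.0
    for run in runs:
        for rank, is_relevant in enumerate(run, start=1):
            if is_relevant:
                total += 1.0 / rank
                break
    return total / len(runs)


# Hypothetical relevance judgments for three queries.
example_runs = [
    [True, False, False],          # first relevant sentence at rank 1
    [False, False, True],          # first relevant sentence at rank 3
    [False, False, False, True],   # first relevant sentence at rank 4
]

print(precision_at_1(example_runs))
print(recall_at_k(example_runs, k=3))
print(mean_reciprocal_rank(example_runs))
```

Summarization F1 and QA Accuracy need reference summaries and ground-truth answers, so they are left out of this sketch.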
Dataset:
CORD-19 Dataset: https://www.kaggle.com/datasets/allen-institute-for-ai/CORD-19-research-challenge
CovidQA Dataset: https://www.kaggle.com/datasets/xhlulu/covidqa
Biomedical Question Answering Dataset: https://arxiv.org/abs/1804.07409
Resources: [1] https://link.springer.com/chapter/10.1007/978-3-031-35320-8_29
Looks good. The team is responsible for ensuring they don't hit a blocker due to lack of compute, especially with the T5 Large model.
Title
IntelliHelp: RAG-Powered Customer Assistance Using LLM
Team Name
Run Time Errorists
Email
202311048@daiict.ac.in
Team Member 1 Name
Shyam Saktawat
Team Member 1 Id
202311048
Team Member 2 Name
Abhishek Choudhary
Team Member 2 Id
202311067
Team Member 3 Name
Ayush Kumar Sahu
Team Member 3 Id
202311066
Team Member 4 Name
NIL
Team Member 4 Id
NIL
Category
New Research Problem
Problem Statement
Current organizational chatbots struggle to autonomously resolve user queries, as they fail to effectively integrate RAG and LLMs with proprietary data. IntelliHelp aims to offer personalized, real-time customer support by retrieving relevant information and generating dynamic, accurate responses without human intervention.
Evaluation Strategy
Dataset
https://github.com/unicamp-dl/retailGPT/tree/main/retailGPT/datasets
Resources
[1] LEWIS, Patrick et al. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. arXiv, 2020.
[2] XU, Ziwei; JAIN, Sanjay; KANKANHALLI, Mohan. Hallucination is Inevitable: An Innate Limitation of Large Language Models. arXiv, 2024.
[3] WEI, Jason; WANG, Xuezhi; SCHUURMANS, Dale et al. Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. arXiv, 2022.