parth126 / IT550

Project Proposals for the IT-550 Course (Autumn 2024)

IntelliHelp: RAG-Powered Customer Assistance Using LLM #13

Open Ayush060201 opened 3 days ago

Ayush060201 commented 3 days ago

Title

IntelliHelp: RAG-Powered Customer Assistance Using LLM

Team Name

Run Time Errorists

Email

202311048@daiict.ac.in

Team Member 1 Name

Shyam Saktawat

Team Member 1 Id

202311048

Team Member 2 Name

Abhishek Choudhary

Team Member 2 Id

202311067

Team Member 3 Name

Ayush Kumar Sahu

Team Member 3 Id

202311066

Team Member 4 Name

NIL

Team Member 4 Id

NIL

Category

New Research Problem

Problem Statement

Current organizational chatbots struggle to resolve user queries autonomously because they fail to effectively combine retrieval-augmented generation (RAG) and large language models (LLMs) with proprietary data. IntelliHelp aims to offer personalized, real-time customer support by retrieving relevant information and generating dynamic, accurate responses without human intervention.
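The retrieve-then-generate flow this proposal describes could look roughly like the sketch below. It is a minimal illustration rather than the final design: the support snippets are invented placeholders, `all-MiniLM-L6-v2` is just one possible embedding model, and `call_llm()` stands in for whichever LLM endpoint the system ends up using.

```python
# Minimal retrieve-then-generate sketch. The knowledge-base snippets and the
# call_llm() stub are placeholders; any embedding model / LLM endpoint could be used.
from sentence_transformers import SentenceTransformer, util

# Proprietary support documents (illustrative placeholders).
documents = [
    "Refunds are processed within 5-7 business days after approval.",
    "Premium users can reach live support 24/7 via the in-app chat.",
    "Orders can be cancelled free of charge within 30 minutes of placement.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")
doc_embeddings = embedder.encode(documents, convert_to_tensor=True)

def retrieve(query: str, top_k: int = 2) -> list[str]:
    """Return the top_k documents most similar to the query."""
    query_embedding = embedder.encode(query, convert_to_tensor=True)
    hits = util.semantic_search(query_embedding, doc_embeddings, top_k=top_k)[0]
    return [documents[hit["corpus_id"]] for hit in hits]

def call_llm(prompt: str) -> str:
    """Placeholder for the actual LLM call (e.g., a hosted or local model API)."""
    raise NotImplementedError

def answer(query: str) -> str:
    # Ground the generation step in the retrieved context only.
    context = "\n".join(retrieve(query))
    prompt = (
        "Answer the customer question using only the context below.\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )
    return call_llm(prompt)
```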

Evaluation Strategy

  1. Response Time Metrics: Average response time per query.
  2. Engagement and Retention Metrics: User retention rate, average session duration, and number of returning users.
  3. User Satisfaction Metrics: Post-interaction surveys, Net Promoter Score (NPS), and Customer Satisfaction Score (CSAT).

Dataset

https://github.com/unicamp-dl/retailGPT/tree/main/retailGPT/datasets

Resources

[1] Lewis, Patrick et al. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. arXiv, 2020.
[2] Xu, Ziwei; Jain, Sanjay; Kankanhalli, Mohan. Hallucination is Inevitable: An Innate Limitation of Large Language Models. arXiv, 2024.
[3] Wei, Jason; Wang, Xuezhi; Schuurmans, Dale et al. Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. arXiv, 2022.

parth126 commented 2 days ago
Ayush060201 commented 4 hours ago

New Problem Statement

Title

Optimize InterrogateLLM for Enhanced Hallucination Detection in LLMs

Category

Optimization

Problem Statement

Large Language Models (LLMs) have shown remarkable capabilities in generating human-like text, but they often produce hallucinations: plausible-sounding but factually incorrect information. This poses significant challenges for their reliable use in real-world applications. While the InterrogateLLM method presents a novel approach to zero-resource hallucination detection, it still faces several limitations that impact its accuracy and efficiency.
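Very roughly, and as we understand [1], InterrogateLLM flags an answer by asking a model to reconstruct the original question from that answer several times and measuring how similar the reconstructions are to the real question. The sketch below is only our reading of that idea: the reconstruction prompt, the number of samples, the 0.8 threshold, the embedding model, and `call_llm()` are all assumptions rather than the paper's exact settings.

```python
# Rough sketch of the reconstruct-and-compare idea behind InterrogateLLM [1];
# prompts, threshold, sample count, and model calls are placeholders.
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")

def call_llm(prompt: str, temperature: float = 1.0) -> str:
    """Placeholder for an LLM completion call."""
    raise NotImplementedError

def is_hallucination(question: str, answer: str, k: int = 5, threshold: float = 0.8) -> bool:
    # Ask the model to reconstruct the question from the answer k times.
    reconstructions = [
        call_llm(f"Write the question that this answer responds to:\n{answer}",
                 temperature=1.0)
        for _ in range(k)
    ]
    # Compare reconstructed questions with the original question in embedding space.
    q_emb = embedder.encode(question, convert_to_tensor=True)
    r_embs = embedder.encode(reconstructions, convert_to_tensor=True)
    mean_similarity = util.cos_sim(q_emb, r_embs).mean().item()
    # Low similarity suggests the answer does not ground the original question.
    return mean_similarity < threshold
```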

Evaluation Strategy

Metrics:

  1. AUC (Area Under the Curve): Measures the overall performance of the binary classification.
  2. B-ACC (Balanced Accuracy): Accounts for imbalanced datasets by averaging the recall for each class (a computation sketch follows this list).
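Both metrics above can be computed with scikit-learn; the labels and scores in the sketch below are toy values purely for illustration, and the 0.5 decision threshold for B-ACC is an arbitrary placeholder.

```python
# Illustrative computation of AUC and balanced accuracy with scikit-learn;
# the labels/scores are toy values, not results.
from sklearn.metrics import roc_auc_score, balanced_accuracy_score

y_true = [1, 0, 1, 1, 0, 0]               # 1 = hallucination, 0 = faithful answer
scores = [0.9, 0.2, 0.7, 0.4, 0.1, 0.3]   # detector scores (higher = more likely hallucination)

auc = roc_auc_score(y_true, scores)

# B-ACC needs hard predictions, so threshold the scores (0.5 is an arbitrary choice here).
y_pred = [int(s >= 0.5) for s in scores]
b_acc = balanced_accuracy_score(y_true, y_pred)

print(f"AUC = {auc:.3f}, B-ACC = {b_acc:.3f}")
```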

IOU (Intersection over Union) Score:

  1. Used specifically for the Movies dataset.
  2. Answers with IOU scores below 80% are considered hallucinations (see the sketch below).
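For list-valued answers (e.g., a movie's cast), the IOU criterion above could be applied as in the following sketch; the names are only an illustrative example, and normalization (casing, whitespace) is omitted for brevity.

```python
# Set-based IOU between a generated list answer and the ground truth
# (e.g., predicted vs. actual cast members). The 0.8 cut-off mirrors the
# 80% threshold above.
def iou(predicted: set[str], ground_truth: set[str]) -> float:
    """Intersection-over-union of two answer sets."""
    if not predicted and not ground_truth:
        return 1.0
    return len(predicted & ground_truth) / len(predicted | ground_truth)

predicted = {"Tom Hanks", "Robin Wright", "Gary Sinise"}
ground_truth = {"Tom Hanks", "Robin Wright", "Sally Field", "Gary Sinise"}

score = iou(predicted, ground_truth)
flagged = score < 0.8
print(f"IOU = {score:.2f}, flagged as hallucination = {flagged}")
```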

Comparison with Baselines:

  1. SBERT-cosine: Using a pre-trained SBERT model for embeddings (a similarity-scoring sketch follows this list).
  2. ADA-cosine: Utilizing OpenAI's text-embedding-ada-002 model.
  3. SelfCheckGPT: A method that generates multiple samples and compares them.
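As a concrete reference point, the SBERT-cosine baseline above can be approximated as below: embed the query and the answer with a pre-trained SBERT model and use their cosine similarity as the detection score. The model name and the query-vs-answer pairing are our assumptions, not necessarily the exact setup of [1].

```python
# Minimal sketch of an SBERT-cosine style baseline: cosine similarity between
# query and answer embeddings serves as the hallucination-detection score.
# Model choice and pairing are assumptions, not the paper's exact settings.
from sentence_transformers import SentenceTransformer, util

sbert = SentenceTransformer("all-mpnet-base-v2")

def sbert_cosine_score(question: str, answer: str) -> float:
    q_emb = sbert.encode(question, convert_to_tensor=True)
    a_emb = sbert.encode(answer, convert_to_tensor=True)
    return util.cos_sim(q_emb, a_emb).item()

score = sbert_cosine_score(
    "Who directed the movie Inception?",
    "Inception was directed by Christopher Nolan.",
)
print(f"cosine similarity = {score:.3f}")
```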

Dataset

Movies Dataset - https://www.kaggle.com/datasets/rounakbanik/the-movies-dataset
Books Dataset - https://www.kaggle.com/datasets/saurabhbagchi/books-dataset

Resources

[1] https://arxiv.org/abs/2403.02889