Aidenzich / road-to-master

A repo to store our research footprint on AI
MIT License

2024-03 Latest Health LLM #41

Open Aidenzich opened 3 months ago

Aidenzich commented 3 months ago

LLaVA-Med

| Why (Motivation) | What (Action/Method) | How (Process/Implementation) |
| --- | --- | --- |
| Multimodal conversational AI has progressed rapidly, but general-domain vision-language models lack sophistication in biomedical contexts. | Proposed LLaVA-Med, a vision-language conversational assistant for biomedical images. | Leveraged a large-scale biomedical figure-caption dataset from PubMed Central and used GPT-4 to self-instruct open-ended instruction-following data from captions. |
| Biomedical image-text pairs differ significantly from general web content, leading to inefficiencies in general-domain visual assistants for biomedical applications. | Created a novel data generation pipeline for biomedical multimodal instruction-following data. | Generated diverse (image, instruction, output) instances by sampling image-text pairs and using GPT-4 to create instructions from the text, requiring zero manual annotations. |
| The need for a cost-efficient approach to empower biomedical practitioners with AI that can answer open-ended research questions about biomedical images. | Developed LLaVA-Med using a novel curriculum learning method for adapting to the biomedical domain. | Initially fine-tuned to align biomedical vocabulary using image-text pairs, then further trained on self-generated instruction-following data to master open-ended conversational semantics. |
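
The data generation step described above (sample image-text pairs, have a language model write an instruction and answer from the caption alone, keep the result as an (image, instruction, output) instance) can be sketched roughly as follows. This is only an illustration, not the LLaVA-Med authors' code: the prompt wording, the `FigureCaption` and `InstructionExample` classes, and the `generate` callable (a stand-in for whatever GPT-4 client is used) are all assumptions made for this sketch.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class FigureCaption:
    """One image-text pair, e.g. a figure and its caption (hypothetical type)."""
    image_path: str
    caption: str


@dataclass
class InstructionExample:
    """One (image, instruction, output) instance (hypothetical type)."""
    image_path: str
    instruction: str
    output: str


# Illustrative prompt only; the paper's actual prompts are not reproduced here.
PROMPT_TEMPLATE = (
    "You are given the caption of a biomedical figure. "
    "Write one question a researcher might ask about the image, "
    "then answer it using only the caption.\n"
    "Caption: {caption}\n"
    "Reply as 'Q: ...' on one line and 'A: ...' on the next."
)


def parse_qa(reply: str) -> Tuple[str, str]:
    """Pull the question and answer out of a 'Q: ... / A: ...' reply."""
    question, answer = "", ""
    for line in reply.splitlines():
        line = line.strip()
        if line.startswith("Q:"):
            question = line[2:].strip()
        elif line.startswith("A:"):
            answer = line[2:].strip()
    if not question or not answer:
        raise ValueError("reply is missing a 'Q:' or 'A:' line")
    return question, answer


def build_instruction_data(
    pairs: Sequence[FigureCaption],
    generate: Callable[[str], str],  # assumed wrapper around the language model
    num_samples: int,
    seed: int = 0,
) -> List[InstructionExample]:
    """Sample pairs and turn each caption into an instruction-following example."""
    rng = random.Random(seed)
    sampled = rng.sample(list(pairs), k=min(num_samples, len(pairs)))
    examples = []
    for pair in sampled:
        # The model only sees the caption text, never the image itself.
        reply = generate(PROMPT_TEMPLATE.format(caption=pair.caption))
        instruction, output = parse_qa(reply)
        examples.append(InstructionExample(pair.image_path, instruction, output))
    return examples
```

Passing the model call in as `generate` keeps the sketch independent of any particular API client, and lets the pipeline be exercised offline with a stub function that returns a canned `Q:`/`A:` reply.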
Aidenzich commented 3 months ago

MEDITRON

| Why (Motivation) | What (Action/Method) | How (Process/Implementation) |
| --- | --- | --- |
| Access to medical knowledge is uneven, and LLMs have the potential to democratize this by improving the quality of medical decision-making. | Developed MEDITRON, a suite of LLMs with 7B and 70B parameters specifically adapted to the medical domain. | Used the Llama-2 model as a basis and extended pretraining on a curated medical corpus, including PubMed articles, abstracts, and medical guidelines, utilizing Nvidia’s Megatron-LM distributed trainer. |
| Existing medical LLMs are either closed-source or limited in scale, hindering their ability to perform complex reasoning tasks in the medical field. | Aimed to create an open-source, large-scale medical LLM that can perform high-level medical reasoning and match or outperform state-of-the-art baselines. | Released MEDITRON models with and without fine-tuning, along with the code for curating the medical pretraining corpus and a distributed training library, to facilitate further research and development in the medical LLM space. |
| There's a gap in the availability of high-quality, domain-specific LLMs that can tackle medical reasoning tasks effectively. | Evaluated MEDITRON's performance on major medical benchmarks to validate its efficacy in medical reasoning and knowledge recall. | Employed advanced prompting strategies like chain-of-thought and self-consistency, and tested the models on a set of medical reasoning benchmarks, demonstrating significant performance gains over several baselines. |
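
For readers unfamiliar with self-consistency: the idea is to sample several chain-of-thought completions for the same question, extract the final answer from each, and report the majority answer. The sketch below illustrates that voting step for multiple-choice questions. It is not MEDITRON's evaluation code; the `sample_chain` callable (assumed to return one sampled reasoning chain per call), the answer regex, and the A to E option range are assumptions made for illustration.

```python
import re
from collections import Counter
from typing import Callable, Optional

# Assumed answer format: "... the answer is (B)" or "Answer: B".
# The phrase is matched case-insensitively; the option letter must be uppercase.
ANSWER_PATTERN = re.compile(r"(?i:answer\s*(?:is|:)\s*)\(?([A-E])\b")


def extract_answer(chain: str) -> Optional[str]:
    """Return the last option letter stated in a reasoning chain, if any."""
    matches = ANSWER_PATTERN.findall(chain)
    return matches[-1] if matches else None


def self_consistent_answer(
    question: str,
    sample_chain: Callable[[str], str],  # hypothetical: one sampled chain per call
    num_samples: int = 5,
) -> Optional[str]:
    """Sample several chains and return the most frequent final answer."""
    votes: Counter = Counter()
    for _ in range(num_samples):
        answer = extract_answer(sample_chain(question))
        if answer is not None:  # chains without a parseable answer do not vote
            votes[answer] += 1
    if not votes:
        return None
    # On a tie, Counter.most_common keeps the answer that was seen first.
    return votes.most_common(1)[0][0]
```

In practice `sample_chain` would call the model with a nonzero sampling temperature so that the chains differ; with greedy decoding every sample would be identical and the vote would add nothing.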