dionis / SpanishMedicaLLM

An Open Source Medical Context Large Language Model (LLM) for Q&A and Prompting in Spanish Using Fine-Tuning Techniques with QLoRA and EPFL with Low Compute Resources. Inspired by Meditron, a suite of open-source medical Large Language Models (LLMs).
https://huggingface.co/epfl-llm
Apache License 2.0

Build a corpus for LLM training in Spanish #16

Open dionis opened 6 months ago

dionis commented 6 months ago

Using the corpora that Meditron was built on as a reference, create a corpus with the same characteristics for training an LLM.

Sources to consult: https://github.com/PlanTL-GOB-ES/lm-biomedical-clinical-es?tab=readme-ov-file

Expected results:

A Spanish-language medical corpus that can be used as input for fine-tuning or training an LLM.
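
As a starting point, here is a minimal sketch of how collected Spanish medical documents might be assembled into a training corpus. It is not taken from Meditron or from the PlanTL repository. The function names (`normalize`, `build_corpus`, `write_jsonl`), the record fields (`id`, `source`, `text`), the `min_chars` threshold, and the choice of JSON Lines as the output format are all assumptions made for illustration. The sketch normalizes whitespace and Unicode, drops very short documents, and removes exact duplicates by hash.

```python
import hashlib
import json
import unicodedata
from pathlib import Path


def normalize(text: str) -> str:
    """Apply Unicode NFC normalization and collapse whitespace runs."""
    text = unicodedata.normalize("NFC", text)
    return " ".join(text.split())


def build_corpus(docs, min_chars=200):
    """Yield one record per unique document of at least min_chars characters.

    docs is an iterable of (source, text) pairs. The min_chars default is an
    arbitrary illustrative value, not a recommendation from any source.
    """
    seen = set()
    for source, text in docs:
        clean = normalize(text)
        if len(clean) < min_chars:
            continue  # skip fragments too short to be useful
        digest = hashlib.sha256(clean.encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # skip exact duplicates (after normalization)
        seen.add(digest)
        yield {"id": digest[:16], "source": source, "text": clean}


def write_jsonl(records, path):
    """Write records as JSON Lines, keeping accented characters readable."""
    count = 0
    with Path(path).open("w", encoding="utf-8") as fh:
        for rec in records:
            fh.write(json.dumps(rec, ensure_ascii=False) + "\n")
            count += 1
    return count
```

A call such as `write_jsonl(build_corpus(docs), "corpus_es.jsonl")` would produce one JSON object per line (the file name is hypothetical). Exact-hash deduplication only catches identical texts; near-duplicate detection and quality filtering would need to be added separately.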