clinicalml / TabLLM

MIT License
265 stars 42 forks

How can we use Llama2 here? #10

Open shivprasad94 opened 1 year ago

shivprasad94 commented 1 year ago

I see from the code repo that OpenAI APIs are used. How can we make this work for open-source models like Llama 2? Can someone give me details on the steps I need to follow?

stefanhgm commented 11 months ago

Hello @shivprasad94,

sorry for the late reply and thanks for reaching out!

TabLLM is LLM agnostic, so you can use whatever LLM you want. For instance, to use another HuggingFace model you could create a new JSON config in TabLLM/t-few/configs (e.g., llama.json) and use the HuggingFace model identifier for the "origin_model" parameter (e.g., "origin_model": "meta-llama/Llama-2-7b").
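A minimal sketch of such a config file (TabLLM/t-few/configs/llama.json) might look like the following; the file name and any fields other than "origin_model" are assumptions here, and in practice the remaining fields should be copied from an existing config in that directory, such as the T0 one:

```json
{
    "origin_model": "meta-llama/Llama-2-7b"
}
```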

You can then reference this model configuration in the run script few-shot-pretrained-100k.sh by changing the model loop in line 18 to `for model in 'llama'`.
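Concretely, the edited loop in few-shot-pretrained-100k.sh would look roughly like this (assuming a config named llama.json exists in t-few/configs; the echo stands in for the script's actual per-model commands):

```shell
# Sketch of the edited model loop (line 18 of few-shot-pretrained-100k.sh),
# assuming a config file llama.json was added to t-few/configs:
for model in 'llama'
do
    # the real script launches a training run here for each model config
    echo "Selected model config: ${model}"
done
```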

Let us know if you need any further help!

RyanJJP commented 4 months ago

There seems to be something wrong with t-few when fine-tuning, since LLaMA is not an encoder-decoder model.

stefanhgm commented 4 months ago

Hello @RyanJJP,

thanks for this additional comment. You are right, t-few might not work with LLaMA since it targets encoder-decoder models. However, other fine-tuning methods for LLaMA (e.g., QLoRA) should provide similar functionality. This would require larger changes to the codebase, but conceptually it should be similar.
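For intuition, the adapter idea underlying LoRA/QLoRA can be sketched in a few lines of numpy; this is a conceptual illustration, not the PEFT/QLoRA library API, and all names and dimensions below are made up for the example:

```python
import numpy as np

# Minimal sketch of the low-rank-adapter idea behind LoRA/QLoRA:
# the pretrained weight W stays frozen and a low-rank update B @ A is
# learned instead, so only r * (d_in + d_out) parameters are trained.
rng = np.random.default_rng(0)

d_in, d_out, r = 16, 8, 2               # rank r << min(d_in, d_out)
W = rng.normal(size=(d_out, d_in))      # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                # trainable up-projection, init 0
alpha = 16                              # LoRA scaling factor

def lora_forward(x):
    # Effective weight is W + (alpha / r) * B @ A; because B is zero at
    # initialization, the adapted layer exactly matches the frozen one.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=(d_in,))
assert np.allclose(lora_forward(x), W @ x)  # identity at initialization
```

QLoRA additionally keeps W in a 4-bit quantized form to cut memory, while the small A and B matrices are trained in higher precision.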