Open gian-g3dai opened 1 year ago
Please try the upgraded version of the llama-adapter-v2 repo, LLaMA2-Accessory, which supports full finetuning and PEFT of LLaMA2 and InternLM.
https://llama2-accessory.readthedocs.io/en/latest/finetune/index.html
Thank you very much @gaopengpjlab, I am working on the Accessory now. Just out of curiosity: is it possible in principle to pretrain llama2? I am wondering whether differences such as the tokenizer and params might lead to errors.
With the X-Accessory repo, you can pretrain llama2 from scratch on the RefinedWeb dataset. We use the original LLaMA tokenizer and model configuration.
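For readers wondering what "the original model configuration" amounts to, here is a minimal sketch of the publicly documented LLaMA-2-7B hyperparameters as a config dataclass (values from the public release; the field names are illustrative, not X-Accessory's actual code):

```python
from dataclasses import dataclass

@dataclass
class ModelArgs:
    # LLaMA-2-7B hyperparameters from the public release
    dim: int = 4096
    n_layers: int = 32
    n_heads: int = 32
    vocab_size: int = 32000      # SentencePiece tokenizer, shared with LLaMA-1
    norm_eps: float = 1e-5
    rope_theta: float = 10000.0  # base frequency for rotary embeddings

args = ModelArgs()
assert args.dim % args.n_heads == 0
head_dim = args.dim // args.n_heads  # 128
```

Because llama2 reuses the LLaMA-1 tokenizer and the same architecture family, pretraining it in a repo that already supports LLaMA-1 mostly comes down to matching these hyperparameters.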
Ok, thank you @gaopengpjlab. So in case I would like to pretrain/finetune llama2 on multimodal tasks, you would suggest using X-Accessory, right? Is it possible to do so with CodeLlama too? I know that there are no scripts for that in the repo yet, but in principle it should be possible, right?
Hi!
Thank you very much @ChrisLiu6, great explanation!!
What would you suggest for implementing a multimodal CodeLlama then? Should I start from X-Accessory and take it from there? I could make some scripts to launch the pretraining and so on (I have compute, so that shouldn't be a problem).
As for starting from the checkpoint, I have tried it, but it looks like the model keeps a certain "bias" and prefers to produce short outputs. Together with the fact that I would like a more custom pretrained model, I think it's not the best option for me.
@ChrisLiu6 @gaopengpjlab, any tips? Otherwise thank you for the time so far, will close the issue soon.
Sorry, I just missed it.
To run stage-one fine-tuning with CodeLlama, one way is to use this repo (LLaMA-Adapter) but modify the llama implementation to support llama2 and CodeLlama features (especially rope_theta). You may refer to our implementation in X-Accessory.
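To make the rope_theta point concrete: CodeLlama raises the rotary-embedding base from LLaMA-2's 10000 to 1000000 to support longer contexts, so loading its weights with the llama2 default silently corrupts positional information. A minimal sketch of how theta shapes the per-dimension rotation frequencies (pure-Python illustration, not the repo's actual code):

```python
def rope_frequencies(head_dim: int, theta: float) -> list[float]:
    """Base frequencies for each rotary-embedding dimension pair.

    freq_i = theta^(-2i / head_dim).  A larger theta makes the
    high-order dimensions rotate more slowly, stretching the
    usable context window.
    """
    return [theta ** (-2 * i / head_dim) for i in range(head_dim // 2)]

llama2_freqs = rope_frequencies(128, 10_000.0)       # LLaMA-2 default
codellama_freqs = rope_frequencies(128, 1_000_000.0)  # CodeLlama default

# Same first frequency, but CodeLlama's slowest frequency is far lower:
assert llama2_freqs[0] == codellama_freqs[0] == 1.0
assert codellama_freqs[-1] < llama2_freqs[-1]
```

This is why a codebase hard-wired to theta = 10000 "works" mechanically with CodeLlama weights but produces degraded outputs: every attention layer sees wrong relative positions.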
Alternatively, if you want to use X-Accessory, you may follow the fine-tuning pipeline, with the following data config:
```yaml
META:
  -
    path: path/to/your/data.csv
    type: 'image_text'
    preprocess: 'caption'
    prompt_type: 'caption'
```
The `preprocess` parameter works here:
and the `prompt_type` parameter works here:
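For intuition, the `image_text` meta entry points at a CSV of image/caption pairs. A hedged sketch of what the loader consumes (the exact column names depend on the X-Accessory version; `path` and `caption` below are assumptions, so check the repo's data-loading code):

```python
import csv
import io

# Stand-in for path/to/your/data.csv; column names are hypothetical.
sample_csv = io.StringIO(
    "path,caption\n"
    "imgs/cat.jpg,a cat sitting on a mat\n"
    "imgs/dog.jpg,a dog playing fetch\n"
)

rows = list(csv.DictReader(sample_csv))
pairs = [(r["path"], r["caption"]) for r in rows]
# Each pair becomes one (image, text) training example for captioning.
```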
Other configurations should be similar to this experiment. Note that you also need to re-write which model parameters are trainable for this stage: https://github.com/Alpha-VLLM/LLaMA2-Accessory/blob/c7fd8f83d3564e0982c63e8e0a1c8930b30c6cfe/accessory/model/LLM/llama.py#L332
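On re-writing which parameters are trainable: the usual stage-one recipe freezes the language backbone and trains only the newly added multimodal parts. A minimal sketch of that selection logic (the prefix names below are illustrative, not the actual parameter names in the linked `llama.py`):

```python
def select_trainable(param_names, trainable_prefixes=("visual_proj.", "adapter.")):
    """Map each parameter name to whether it should be trained.

    Sketch of stage-one freezing: everything is frozen except
    parameters whose names start with a multimodal prefix.
    """
    return {name: name.startswith(trainable_prefixes) for name in param_names}

names = [
    "tok_embeddings.weight",         # backbone: frozen
    "layers.0.attention.wq.weight",  # backbone: frozen
    "visual_proj.weight",            # new multimodal projection: train
    "adapter.gate",                  # adapter parameter: train
]
flags = select_trainable(names)
```

In PyTorch this would translate to setting `requires_grad` per parameter according to such a predicate; the authoritative list for this stage is the method at the linked line in the X-Accessory source.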
Thank you very much @ChrisLiu6!! Couldn't have wished for a better explanation!!
Hello, thank you for the work you are doing.
Does llama-adapter-v2 support llama2, or does it only work with llama? I am able to pretrain with the llama2 weights, but the inference results do not make much sense, and fine-tuning fails.