Possible to continue autoregressive pre-training on custom dataset

meta-llama / codellama

Inference code for CodeLlama models

Possible to continue autoregressive pre-training on custom dataset #94

Open zachschillaci27 opened 9 months ago

zachschillaci27 commented 9 months ago

Is it possible to continue the initial autoregressive pre-training on a custom dataset, as was done for Code Llama - Python? In principle, this would allow Code Llama models to be fine-tuned for other programming languages. If so, could you please provide an example training script? Any information or help would be much appreciated!

bank010 commented 9 months ago

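Is it possible to continue the initial autoregressive pre-training on a custom dataset, as was done for Code Llama - Python? In principle, this would allow Code Llama models to be fine-tuned for other programming languages. If so, could you please provide an example training script? Any information or help would be much appreciated!

Hello, have you found a way to do this?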
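
For anyone landing here with the same question: the repository itself only ships inference code, so what follows is not Meta's training recipe. It is a minimal sketch of how continued autoregressive (causal LM) pre-training is commonly set up, assuming PyTorch and Hugging Face `transformers` are installed and that the `codellama/CodeLlama-7b-hf` checkpoint on the Hugging Face Hub is the starting point. The helper names (`pack_into_blocks`, `continue_pretraining`), the block size, the learning rate, and the output directory are all illustrative choices, not values from the Code Llama paper or repo.

```python
from itertools import chain


def pack_into_blocks(tokenized_docs, block_size, eos_id):
    """Join token-id lists (EOS after each document) and cut into equal blocks."""
    stream = list(chain.from_iterable(doc + [eos_id] for doc in tokenized_docs))
    usable = (len(stream) // block_size) * block_size  # drop the ragged tail
    blocks = [stream[i:i + block_size] for i in range(0, usable, block_size)]
    # Causal LM objective: labels are a copy of the inputs; Hugging Face
    # causal LM models shift them by one position internally.
    return [{"input_ids": b, "labels": list(b)} for b in blocks]


def continue_pretraining(texts,
                         model_name="codellama/CodeLlama-7b-hf",  # assumed checkpoint
                         block_size=2048,                         # illustrative
                         lr=1e-5,                                 # illustrative
                         output_dir="codellama-continued"):       # hypothetical path
    # Heavy dependencies are imported here so the packing helper stays stdlib-only.
    import torch
    from torch.utils.data import DataLoader
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device)

    # Tokenize raw source files without special tokens; EOS is added when packing.
    docs = [tokenizer(t, add_special_tokens=False)["input_ids"] for t in texts]
    examples = pack_into_blocks(docs, block_size, tokenizer.eos_token_id)
    if not examples:
        raise ValueError("Not enough tokens to fill a single block.")

    def collate(batch):
        ids = torch.tensor([ex["input_ids"] for ex in batch])
        return {"input_ids": ids, "labels": ids.clone()}

    loader = DataLoader(examples, batch_size=1, shuffle=True, collate_fn=collate)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)

    model.train()
    for batch in loader:  # a single pass over the data
        batch = {k: v.to(device) for k, v in batch.items()}
        loss = model(**batch).loss  # next-token cross-entropy
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

    model.save_pretrained(output_dir)
    tokenizer.save_pretrained(output_dir)
```

Calling `continue_pretraining(list_of_source_files_as_strings)` would run one pass over the corpus. Full-parameter training of a 7B model with AdamW needs far more memory than a single consumer GPU provides, so in practice this loop would likely be combined with something like mixed precision, gradient accumulation, sharded training, or a parameter-efficient method; those are left out to keep the sketch short.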