
🐙 OctoPack: Instruction Tuning Code Large Language Models
https://arxiv.org/abs/2308.07124

Fine-tuning a CodeLlama model on CommitPackFT #22

Open wonhyeongseo opened 1 year ago

wonhyeongseo commented 1 year ago

Hello, I am a student in Korea working on a 6-week project.

I would like to fine-tune a CodeLlama model for the Code Repair task using your paper's methodology. How would you estimate the GPU resources and time required for this project?

I also have two new ideas:

1. Can a static code analyzer's output improve the dataset?
2. Can an RLHF-based approach using DPO help the model generate better code?

Thank you for your time and guidance. Best regards, Won

Muennighoff commented 1 year ago

Sounds exciting!

> How would you estimate the GPU resources and time required for this project?

If you go with the 7B model and use LoRA like we did for OctoCoder, then a single A100 with 80GB (or even 40GB) for a few hours may easily suffice. Even for the 13B that may be enough, though you may have to use a few memory-reduction techniques such as gradient checkpointing. You might even be able to fine-tune the 34B model on a single GPU using something like QLoRA.
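
For concreteness, here is a minimal sketch of what a QLoRA-style setup with Hugging Face transformers and peft could look like. The model id, LoRA hyperparameters, and target modules are illustrative assumptions, not the exact OctoCoder configuration:

```python
# Minimal QLoRA-style sketch: 4-bit base model + LoRA adapters + gradient checkpointing.
# Hyperparameters and target modules are illustrative, not the paper's settings.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "codellama/CodeLlama-7b-hf"  # assumed Hugging Face model id

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                   # QLoRA: quantize base weights to 4-bit
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)  # needed later to tokenize CommitPackFT
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

model.gradient_checkpointing_enable()    # trade extra compute for lower memory
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=16,                                # illustrative rank
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"], # attention projections; adjust as needed
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()       # only the adapter weights are trainable
```

From there you can train on CommitPackFT with a standard Trainer or trl's SFTTrainer loop.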

> Can a static code analyzer's output improve the dataset?

Yes, I think it can. Check out this work where they do that: https://arxiv.org/pdf/2305.18584.pdf
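
As a rough illustration of the idea, you could attach analyzer signals to each sample and use them to filter or re-weight the data. In the sketch below, the field names are assumed from CommitPackFT's schema, and Python's built-in compile() stands in for a real analyzer such as pyflakes or ruff:

```python
# Sketch: annotate CommitPackFT-style samples with a crude static check.
# "old_contents"/"new_contents" are assumed field names; swap compile()
# for a proper analyzer to get richer diagnostics.
def syntax_ok(code: str) -> bool:
    """Return True if the Python snippet at least parses."""
    try:
        compile(code, "<sample>", "exec")
        return True
    except SyntaxError:
        return False

def annotate(sample: dict) -> dict:
    """Attach signals that could be used to filter or re-weight samples."""
    sample["old_parses"] = syntax_ok(sample["old_contents"])
    sample["new_parses"] = syntax_ok(sample["new_contents"])
    return sample

# Hypothetical sample (instruction = commit message, as in CommitPackFT).
sample = {
    "message": "Fix off-by-one error in slicing",
    "old_contents": "def head(xs, n):\n    return xs[:n-1]\n",
    "new_contents": "def head(xs, n):\n    return xs[:n]\n",
}
print(annotate(sample))
```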

> Can an RLHF-based approach using DPO help the model generate better code?

Yes, I think so too. Check out this work doing something similar: https://arxiv.org/abs/2307.14936. How best to incorporate RLHF / code feedback is still an open & interesting research question!
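
To give a sense of the data side of DPO for code repair, here is a minimal, hypothetical sketch of a preference pair. The "prompt" / "chosen" / "rejected" field layout follows what trl's DPOTrainer expects; the snippets themselves are toy examples, not taken from CommitPackFT:

```python
# Hypothetical DPO preference data for code repair.
# Field names match the format expected by trl's DPOTrainer.
from datasets import Dataset

pairs = [
    {
        "prompt": "Fix the bug in this function:\n\ndef add(a, b):\n    return a - b\n",
        "chosen": "def add(a, b):\n    return a + b\n",           # repair that passes the tests
        "rejected": "def add(a, b):\n    return a - b  # noop\n",  # sampled repair that is still wrong
    },
]

preference_dataset = Dataset.from_list(pairs)
print(preference_dataset)
```

The open part is how to produce the rejected side automatically, e.g. by sampling from the fine-tuned model and labelling completions with unit tests or a static analyzer.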