Open wonhyeongseo opened 1 year ago
Sounds exciting!
How do you estimate the GPU resources and time required for this project?
If you go with the 7B model and also use LoRA like we did for OctoCoder, then I think 1x A100 with 80GB, or even 40GB, for a few hours may easily suffice. Even for the 13B that may be enough, but you may have to use a few memory-reduction techniques like gradient checkpointing. Maybe you can even fine-tune the 34B one on a single GPU using something like QLoRA.
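To make the sizing above concrete, here is a back-of-the-envelope VRAM estimate for LoRA fine-tuning. This is not from the thread; the bytes-per-parameter numbers and the trainable-adapter fraction are rough assumptions, and activation memory is ignored (gradient checkpointing keeps it small):

```python
def lora_memory_gb(n_params_b, lora_frac=0.001, bytes_weights=2,
                   bytes_optim_per_trainable=12):
    """Rough VRAM (GB) for LoRA fine-tuning a model with
    n_params_b billion parameters.

    Assumptions (not measurements):
    - Frozen base weights held in bf16 (2 bytes/param).
    - Only ~lora_frac of params are trainable adapters; Adam keeps
      an fp32 copy plus two moments (~12 bytes) per trainable param,
      and gradients add another 4 bytes.
    """
    n = n_params_b * 1e9
    weights = n * bytes_weights
    trainable = n * lora_frac
    optim = trainable * (bytes_optim_per_trainable + 4)  # + fp32 grads
    return (weights + optim) / 1e9

for size in (7, 13, 34):
    print(f"{size}B: ~{lora_memory_gb(size):.1f} GB (+ activations)")
```

Under these assumptions the 7B base weights alone dominate (~14 GB), which is why 40GB can work; the 34B in bf16 (~68 GB) barely fits an 80GB card, which is where QLoRA's 4-bit base weights (~0.5 bytes/param) come in.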
Can a static code analyzer's output improve the dataset?
Yes, I think it can. Check out this work where they do that: https://arxiv.org/pdf/2305.18584.pdf
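As a minimal sketch of that idea: run a static check over each dataset sample, keep the clean ones, and attach the analyzer's feedback to the flagged ones as extra training signal for code repair. Python's built-in `compile()` stands in here for a real analyzer (e.g. pyflakes or a linter), and the two-sample dataset is made up:

```python
def static_check(code):
    """Return an error message if the snippet fails the check, else None.

    compile() only catches syntax errors; a real static analyzer would
    flag much more (unused names, type errors, ...). This is a stand-in.
    """
    try:
        compile(code, "<sample>", "exec")
        return None
    except SyntaxError as e:
        return f"SyntaxError: {e.msg}"

# Hypothetical two-sample dataset, just for illustration.
dataset = [
    {"code": "def add(a, b):\n    return a + b\n"},
    {"code": "def broken(:\n    pass\n"},
]

# Split into clean samples and flagged ones carrying analyzer feedback.
clean, flagged = [], []
for sample in dataset:
    err = static_check(sample["code"])
    (clean if err is None else flagged).append({**sample, "analysis": err})

print(len(clean), len(flagged))  # → 1 1
```

The flagged half could either be dropped (dataset cleaning) or kept as (broken code, analyzer message) pairs, which is closer to what the linked paper does.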
Can an RLHF-based approach using DPO help the model generate better code?
Yes, I think so too; check out this work doing something similar: https://arxiv.org/abs/2307.14936. How best to incorporate RLHF / code feedback is still an open and interesting research question!
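For reference, the DPO objective itself is small enough to write out. Below is the per-pair loss from Rafailov et al. (2023) in plain Python; the idea of defining "chosen" as a patch that passes tests or a static analyzer and "rejected" as one that does not is an assumption for the code-repair setting, not something the thread specifies:

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for one preference pair:
        -log sigmoid(beta * ((logp_w - ref_w) - (logp_l - ref_l)))
    logp_* are sequence log-probs under the policy being trained,
    ref_logp_* under the frozen reference model.
    """
    margin = ((logp_chosen - ref_logp_chosen)
              - (logp_rejected - ref_logp_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# With zero margin the loss is ln(2); widening the margin in favor of
# the chosen completion drives the loss toward zero.
print(dpo_loss(1.0, 1.0, 1.0, 1.0))  # → 0.693...
```

In practice you would not hand-roll this (e.g. TRL's `DPOTrainer` implements it), but it shows what signal the preference pairs provide.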
Hello, I am a student in Korea working on a 6-week project.
I want to fine-tune a CodeLlama model using your paper's methodology for the Code Repair task. How do you estimate the GPU resources and time required for this project?
I also have two new ideas:
Thank you for your time and guidance. Best regards, Won