Open wonhyeongseo opened 1 year ago
Sounds exciting!
How do you estimate the GPU resources and time required for this project?
If you go with the 7B model and also use LoRA like we did for OctoCoder, then I think 1x A100 with 80GB, or even 40GB, for a few hours may easily suffice. Even for the 13B that may be enough, but you may have to use a few memory-reduction techniques like gradient checkpointing. Maybe you can even fine-tune the 34B one on a single GPU using something like QLoRA.
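To make the sizing above concrete, here is a back-of-the-envelope VRAM estimate for LoRA fine-tuning. This is not from the thread; the bytes-per-parameter numbers and the trainable-adapter fraction are rough assumptions, and activation memory is ignored (gradient checkpointing keeps it small):

```python
def lora_memory_gb(n_params_b, lora_frac=0.001, bytes_weights=2,
                   bytes_optim_per_trainable=12):
    """Rough VRAM (GB) for LoRA fine-tuning a model with
    n_params_b billion parameters.

    Assumptions (not measurements):
    - Frozen base weights held in bf16 (2 bytes/param).
    - Only ~lora_frac of params are trainable adapters; Adam keeps
      an fp32 copy plus two moments (~12 bytes) per trainable param,
      and gradients add another 4 bytes.
    """
    n = n_params_b * 1e9
    weights = n * bytes_weights
    trainable = n * lora_frac
    optim = trainable * (bytes_optim_per_trainable + 4)  # + fp32 grads
    return (weights + optim) / 1e9

for size in (7, 13, 34):
    print(f"{size}B: ~{lora_memory_gb(size):.1f} GB (+ activations)")
```

Under these assumptions the 7B base weights alone dominate (~14 GB), which is why 40GB can work; the 34B in bf16 (~68 GB) barely fits an 80GB card, which is where QLoRA's 4-bit base weights (~0.5 bytes/param) come in.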
Can a static code analyzer's output improve the dataset?
Yes, I think it can. Check out this work where they do that: https://arxiv.org/pdf/2305.18584.pdf
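As a minimal sketch of that idea: run a static check over each dataset sample, keep the clean ones, and attach the analyzer's feedback to the flagged ones as extra training signal for code repair. Python's built-in `compile()` stands in here for a real analyzer (e.g. pyflakes or a linter), and the two-sample dataset is made up:

```python
def static_check(code):
    """Return an error message if the snippet fails the check, else None.

    compile() only catches syntax errors; a real static analyzer would
    flag much more (unused names, type errors, ...). This is a stand-in.
    """
    try:
        compile(code, "<sample>", "exec")
        return None
    except SyntaxError as e:
        return f"SyntaxError: {e.msg}"

# Hypothetical two-sample dataset, just for illustration.
dataset = [
    {"code": "def add(a, b):\n    return a + b\n"},
    {"code": "def broken(:\n    pass\n"},
]

# Split into clean samples and flagged ones carrying analyzer feedback.
clean, flagged = [], []
for sample in dataset:
    err = static_check(sample["code"])
    (clean if err is None else flagged).append({**sample, "analysis": err})

print(len(clean), len(flagged))  # → 1 1
```

The flagged half could either be dropped (dataset cleaning) or kept as (broken code, analyzer message) pairs, which is closer to what the linked paper does.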
Can an RLHF-based approach using DPO help the model generate better code?
Yes, I think so too; check out this work doing something similar: https://arxiv.org/abs/2307.14936. How best to incorporate RLHF / code feedback is still an open and interesting research question!
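For reference, the DPO objective itself is small enough to write out. Below is the per-pair loss from Rafailov et al. (2023) in plain Python; the idea of defining "chosen" as a patch that passes tests or a static analyzer and "rejected" as one that does not is an assumption for the code-repair setting, not something the thread specifies:

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for one preference pair:
        -log sigmoid(beta * ((logp_w - ref_w) - (logp_l - ref_l)))
    logp_* are sequence log-probs under the policy being trained,
    ref_logp_* under the frozen reference model.
    """
    margin = ((logp_chosen - ref_logp_chosen)
              - (logp_rejected - ref_logp_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# With zero margin the loss is ln(2); widening the margin in favor of
# the chosen completion drives the loss toward zero.
print(dpo_loss(1.0, 1.0, 1.0, 1.0))  # → 0.693...
```

In practice you would not hand-roll this (e.g. TRL's `DPOTrainer` implements it), but it shows what signal the preference pairs provide.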
Hello, I am a student in Korea working on a 6-week project.
I want to fine-tune a CodeLlama model using your paper's methodology for the Code Repair task. How do you estimate the GPU resources and time required for this project?
I also have two new ideas:
Thank you for your time and guidance. Best regards, Won