albertan017 / LLM4Decompile

Reverse Engineering: Decompiling Binary Code with Large Language Models
MIT License

Training budget estimation #20

Open QiuJYWX opened 1 week ago

QiuJYWX commented 1 week ago

We trained our model using AnghaBench compilation results across four optimization levels (O0~O3), selecting samples under 1024 tokens. That gave us a total of 534,564 samples per level, and we trained for 2 epochs on a cluster of 8 Nvidia A100 GPUs.

As for the training times, they were 10 hours for the 1.3B model, 85 hours for the 6.7B model, and 440 hours for the 33B model.

Let me know if you need more info!

Originally posted by @rocky-lq in https://github.com/albertan017/LLM4Decompile/issues/3#issuecomment-2002900929

Hi @rocky-lq @albertan017 ,

We are estimating the training budget for reproducing LLM4Decompile. In your previous issue response, given 534,564 samples per level and a cluster of 8 Nvidia A100 GPUs, training took 10 hours for the 1.3B model, 85 hours for the 6.7B model, and 440 hours for the 33B model.

In the paper updated on June 19, fine-tuning the 1.3B and 6.7B LLM4Decompile-End models takes 12 and 61 days on 8×A100, respectively, given 7.2 million compilable samples and 1.6 million executable samples. This leaves us confused about how to estimate the training budget.
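For reference, the V1 numbers can be sanity-checked with the standard ~6·N·D training-FLOPs rule of thumb. The sketch below is only a back-of-envelope estimate: the peak throughput and MFU values are assumptions, not figures from the authors, and real throughput also depends on sequence packing and memory-saving settings.

```python
# Back-of-envelope wall-clock estimate for the V1 (<=1,024-token) setup.
# The hardware peak and MFU below are assumptions for illustration only.

def estimate_hours(n_params, n_tokens, n_gpus=8, peak_flops=312e12, mfu=0.40):
    """Wall-clock hours from the ~6*N*D training-FLOPs rule of thumb.

    n_params   : model parameter count
    n_tokens   : total training tokens across all epochs
    peak_flops : A100 BF16 dense peak (~312 TFLOPS)
    mfu        : assumed model FLOPs utilization
    """
    total_flops = 6 * n_params * n_tokens
    return total_flops / (n_gpus * peak_flops * mfu) / 3600

# 534,564 samples per optimization level, 4 levels (O0-O3), 2 epochs,
# counting each sample at its 1,024-token upper bound.
tokens = 534_564 * 4 * 1_024 * 2

for name, n_params in [("1.3B", 1.3e9), ("6.7B", 6.7e9), ("33B", 33e9)]:
    print(f"{name}: ~{estimate_hours(n_params, tokens):.0f} h on 8xA100")
# Prints roughly 10 / 49 / 241 hours; the 1.3B estimate matches the reported
# 10 hours, while the larger models deviate (effective MFU tends to drop with
# model size, e.g. due to activation checkpointing and parallelism overhead).
```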

Could you please provide more information about the training budget, and are all of the training runs fully supervised fine-tuning?

albertan017 commented 1 week ago

In V1, the maximum sequence length is set to 1,024, whereas in V1.5 it is increased to 4,096. The computational expense rises quadratically with sequence length (theoretically, for the attention calculation; in practice, with accelerations, the increase may not be that large). V2 also uses a larger dataset (which has undergone significant deduplication). These factors collectively lead to a 30x increase in training costs.
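To put that scaling argument into rough numbers, here is a minimal sketch under simplified FLOP accounting; the per-sequence formula and the model shape (a DeepSeek-Coder-6.7B-like configuration) are assumptions, not the authors' actual accounting.

```python
# Rough FLOP accounting for the 1,024 -> 4,096 sequence-length change.
# Model shape (32 layers, d_model = 4096, ~6.7B params) is assumed for illustration.

def flops_per_sequence(n_params, n_layers, d_model, seq_len):
    # ~6*N FLOPs per token for the weight matmuls (forward + backward),
    # plus the attention-score term, which is quadratic in seq_len per sequence.
    dense = 6 * n_params * seq_len
    attn = 12 * n_layers * d_model * seq_len ** 2
    return dense + attn

short_ctx = flops_per_sequence(6.7e9, 32, 4096, 1024)  # 1,024-token context
long_ctx = flops_per_sequence(6.7e9, 32, 4096, 4096)   # 4,096-token context

# ~4.5x per sequence: 4x from the token count alone plus a bit more from
# attention, well below the 16x that a purely quadratic cost would imply.
# The larger (deduplicated) dataset then multiplies on top of this.
print(f"per-sequence cost ratio: {long_ctx / short_ctx:.1f}x")
```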