GFNOrg / gfn-lm-tuning

MIT License
119 stars 20 forks

Question about loss selection #6

Open Yu-Fangxu opened 5 months ago

Yu-Fangxu commented 5 months ago

Hi authors, I learned a lot from your wonderful work. I want to ask whether you ever tried Trajectory Balance (TB) as the learning objective. I ran your code and found that the loss starts around 10^3 to 10^4, which is large. With TB, the loss should not be that large, because X, Z, and Y should be consistent, thus leading to a large P(XZY). So did you try TB at first? Are there any good defaults when training with TB?

Thanks!
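For context on why the TB loss is expected to be small near convergence: the trajectory balance objective (Malkin et al., 2022) for a complete trajectory \(\tau = (s_0 \to \dots \to s_n = x)\) is

```latex
\mathcal{L}_{\mathrm{TB}}(\tau) =
\left( \log \frac{Z_\theta \prod_{t} P_F(s_{t+1} \mid s_t; \theta)}
                 {R(x) \prod_{t} P_B(s_t \mid s_{t+1})} \right)^2
```

For autoregressive sequence generation each state has a unique parent, so \(P_B \equiv 1\) and the loss reduces to \(\left(\log Z_\theta + \log P_F(\tau) - \log R(x)\right)^2\), which is zero exactly when the sampler's trajectory log-probability matches the log-reward up to the constant \(\log Z_\theta\).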

MJ10 commented 5 months ago

Hi @Yu-Fangxu, we did try trajectory balance early on in the project but haven't tried it since. All the experiments were run with the modified SubTB loss, so unfortunately I can't help with good defaults for TB. Regarding the loss: if you are initializing new LoRA weights, the initial loss can be quite high. In some other projects we have observed that TB does work well for short sequences. Hope this helps!
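For anyone reading along, here is a minimal sketch of what a TB loss for autoregressive generation can look like. This is not the repo's implementation; the function name and arguments are illustrative, and it assumes a deterministic backward policy (P_B = 1), which holds when each partial sequence has a single parent. In practice you would compute this over batched tensors of token log-probs.

```python
import math

def tb_loss(log_z, step_log_pf, log_reward):
    """Trajectory Balance loss for a single trajectory:
    (log Z + sum_t log P_F(a_t | s_t) - log R)^2.
    Assumes P_B = 1, as in autoregressive sequence generation."""
    delta = log_z + sum(step_log_pf) - log_reward
    return delta ** 2

# Toy check: three tokens each sampled with forward probability 0.5 and a
# reward of 0.125 = 0.5^3, so with log Z = 0 the trajectory is perfectly
# balanced and the loss is ~0.
loss = tb_loss(0.0, [math.log(0.5)] * 3, math.log(0.125))
```

Note that an untrained `log_z` (or freshly initialized LoRA weights that make `step_log_pf` poorly calibrated) shifts `delta` away from zero, which is one way the initial loss can start out large even under TB.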