SylphAI-Inc / AdalFlow

AdalFlow: The library to build & auto-optimize LLM applications.
http://adalflow.sylph.ai/

Training with float loss #228

Open mrdrprofuroboros opened 1 month ago

mrdrprofuroboros commented 1 month ago

I'm trying to run prompt training with an LLMasJudge float loss, similar to G-Eval, that produces the values 0, 0.2, 0.4, 0.6, 0.8, 1. The Trainer crashes since it expects the eval values to be 0 or 1:

ValueError: acc_score_list should only contain 0 and 1

I'm curious whether there are any constraints here, or whether it is generally a bad idea to use such an eval/loss function. Should support for float losses be contributed and made to work, or should we (users) be educated that this is a bad idea?
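
For reference, the shape of the eval function I mean is roughly this (a minimal sketch; the names and the rubric-to-score mapping are illustrative, not actual AdalFlow or G-Eval code, and the judge call is stubbed out):

```python
import random

def ask_judge_for_grade(question: str, pred: str, gold: str) -> int:
    """Stand-in for an LLM judge call: a real judge would prompt a model
    with a scoring rubric and parse the integer grade it returns."""
    return random.randint(0, 5)

def graded_eval_fn(question: str, pred: str, gold: str) -> float:
    """Map a 0-5 rubric grade to one of {0.0, 0.2, 0.4, 0.6, 0.8, 1.0}."""
    grade = ask_judge_for_grade(question, pred, gold)
    return round(grade / 5, 1)

print(graded_eval_fn("Sort the words: banana apple", "apple banana", "apple banana"))
```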

liyin2015 commented 1 month ago

@mrdrprofuroboros Thanks for reporting the error. I have trained using LLM-as-judge for the loss, and the training was fine. Here is the code: https://github.com/SylphAI-Inc/AdalFlow/blob/main/use_cases/question_answering/bbh/word_sorting/train.py

By default, acc_score_list should map [0, 0.5) to 0 and [0.5, 1] to 1. Can you share a code snippet so that I can debug? You can share it with me privately via my Discord or LinkedIn, or in a Google Doc via email: li.yin.gravity@gmail.com

Using the loss with an eval function that returns values in the range [0, 1] is the right approach!
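
In other words, the default binarization I'm describing behaves roughly like this (a simplified sketch, not the actual trainer code):

```python
def binarize_scores(scores: list[float]) -> list[int]:
    """Map eval scores in [0, 1] to {0, 1}: [0, 0.5) -> 0, [0.5, 1] -> 1."""
    return [1 if s >= 0.5 else 0 for s in scores]

print(binarize_scores([0.0, 0.2, 0.4, 0.6, 0.8, 1.0]))  # [0, 0, 0, 1, 1, 1]
```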

mrdrprofuroboros commented 1 month ago

@liyin2015 it happens during the moving batch sampling, here: https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/optim/trainer/trainer.py#L1551

Note that a few lines earlier the scores are compared against 0.5 (https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/optim/trainer/trainer.py#L1523), yet here we strictly require exactly 0 and 1, which seems inconsistent.
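
To make the inconsistency concrete, the two spots behave roughly like this (a simplified sketch, not the actual trainer code; `acc_score_list` here is just a list of per-sample eval scores):

```python
acc_score_list = [0.2, 0.4, 0.6, 0.8, 1.0]  # float scores from an LLM judge

# Around the first link (L1523): scores are thresholded, so floats work fine.
high_score_indices = [i for i, s in enumerate(acc_score_list) if s > 0.5]

# Around the second link (L1551): the moving batch sampling validates strictly,
# so the same float scores raise the error I'm seeing.
if not all(s in (0, 1) for s in acc_score_list):
    raise ValueError("acc_score_list should only contain 0 and 1")
```

Would it make sense to binarize at 0.5 here as well (e.g. `1 if s >= 0.5 else 0`) instead of requiring exact 0/1 values?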