mrdrprofuroboros opened 1 month ago
@mrdrprofuroboros Thanks for reporting the error. I have trained using an LLM as judge for the loss, and the training was fine. Here is the code: https://github.com/SylphAI-Inc/AdalFlow/blob/main/use_cases/question_answering/bbh/word_sorting/train.py
By default, acc_score_list should map [0, 0.5) to 0 and [0.5, 1] to 1. Can you share a code snippet so that I can debug? You can share it with me privately via my Discord or LinkedIn, or as a Google Doc emailed to li.yin.gravity@gmail.com
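A minimal sketch of the default thresholding described above, assuming scores lie in [0, 1]; the helper name `binarize_scores` is hypothetical, not part of the AdalFlow API:

```python
def binarize_scores(acc_score_list):
    """Map each score in [0, 0.5) to 0 and each score in [0.5, 1] to 1.

    Hypothetical helper illustrating the thresholding behavior
    described in the comment above.
    """
    return [1 if score >= 0.5 else 0 for score in acc_score_list]


# Example: G-Eval-style graded scores collapse to binary labels.
print(binarize_scores([0.0, 0.2, 0.4, 0.6, 0.8, 1.0]))
```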
Using the loss with an eval function that returns a value in the range [0, 1] is the right way!
@liyin2015 It happens during the moving batch sampling, here: https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/optim/trainer/trainer.py#L1551
Note that earlier the scores are compared against 0.5 (https://github.com/SylphAI-Inc/AdalFlow/blob/main/adalflow/adalflow/optim/trainer/trainer.py#L1523),
but at the line above they are strictly required to be 0 or 1, which is strange.
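A hypothetical illustration of the inconsistency described above (the actual logic lives at the linked trainer lines; these function names and the exact bucketing are assumptions for the sketch):

```python
def split_by_threshold(scores):
    """Earlier trainer behavior (sketch): split samples by comparing to 0.5."""
    correct = [s for s in scores if s > 0.5]
    wrong = [s for s in scores if s <= 0.5]
    return correct, wrong


def split_by_equality(scores):
    """Moving-batch-sampling behavior (sketch): scores must be exactly 0 or 1."""
    correct = [s for s in scores if s == 1]
    wrong = [s for s in scores if s == 0]
    return correct, wrong


# A float score such as 0.6 lands in a bucket under the threshold split,
# but falls into neither bucket under the strict-equality split.
print(split_by_threshold([0.6, 0.2]))
print(split_by_equality([0.6, 0.2]))
```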
I'm trying to run prompt training with an LLM-as-judge float loss, similar to G-Eval, producing the values 0, 0.2, 0.4, 0.6, 0.8, 1. The Trainer crashes because it expects the eval values to be exactly 0 or 1.
I'm curious whether there are constraints that require this, or whether using such an eval/loss function is generally a bad idea. Should support for float scores be contributed and made to work, or should we (users) be educated that this is a bad idea?