amitshermann opened this issue 2 months ago
Hi, yes, it is relatively simple to tweak the system for this use case. The steps that you should follow:

1. Provide your dataset in `<base_folder>/generator/dataset.csv` with the `annotation` field left empty, and add the flag `--load_dump <base_folder>`.
2. In the `if` statements that select the score function, add the option `custom` and set it to:
   ```python
   return utils.set_function_from_iterrow(lambda record: custom_score('###User input:\n' + record['text'] + '\n####model prediction:\n' + record['prediction']))
   ```
   where `custom_score` is your score function (adapt the format according to the function input); a minimal sketch of such a function is shown after these steps.
3. In the config, set the score function to `custom`, and set the `error_threshold` to 0.5.
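For step 2, here is a minimal sketch of what `custom_score` could look like. The string parsing and the scoring heuristic are placeholders for your own evaluator; the only requirement implied by the snippet above is that it accepts the formatted string and returns a float score.

```python
def custom_score(prompt_and_prediction: str) -> float:
    """Hypothetical example: receives the '###User input: ... ####model prediction: ...'
    string built by the lambda above and returns a score in [0, 1].
    Replace the body with your own evaluator."""
    prediction = prompt_and_prediction.split('####model prediction:\n', 1)[-1].strip()
    if not prediction:
        return 0.0
    # Placeholder heuristic: longer predictions score higher, capped at 1.0.
    return min(1.0, len(prediction.split()) / 100.0)
```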
That's all! It should work with all these changes. If there are any issues, I can also help with the integration on the Discord server.
Thank you. What does the `error_threshold` mean? Will it make the score Boolean? Because then my custom_eval function kind of loses its meaning. For example, I want the model to understand the difference between a score of 0.8 and a score of 0.6.
Thanks in advance,
The `error_threshold` determines the list of examples that is provided to the analyzer (we take the worst from this list). These samples are considered samples that could potentially be improved.
You can set a very high threshold here (for example 0.9); if there are too many samples, it simply takes the worst from this list.
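As a rough illustration of the selection behavior described above (not AutoPrompt's actual code; the field and function names here are assumptions), the threshold only gates which samples become candidates, while the float scores are still used to rank the worst ones:

```python
def select_error_samples(records, error_threshold=0.9, max_samples=10):
    """Illustrative only: keep samples scoring below the threshold and
    hand the worst-scoring ones to the analyzer."""
    candidates = [r for r in records if r['score'] < error_threshold]
    candidates.sort(key=lambda r: r['score'])  # lowest (worst) scores first
    return candidates[:max_samples]

# Example: with error_threshold=0.9, a sample scored 0.6 ranks as a worse
# candidate than one scored 0.8, so the 0.6 vs. 0.8 distinction is preserved.
samples = [{'id': 1, 'score': 0.8}, {'id': 2, 'score': 0.6}, {'id': 3, 'score': 0.95}]
print(select_error_samples(samples, error_threshold=0.9, max_samples=2))
```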
Hi,
We'd like to use AutoPrompt for a generation task where both the input and output are text. We've also developed an evaluator that scores the input-output pairs (e.g., a float between 0 and 1).
Our goal is to optimize the output using our dataset and evaluator, but we're unsure how to set this up with AutoPrompt. Could you provide guidance on how to achieve this?
Thanks in advance,