Open mjohal3 opened 2 weeks ago
Hey @mjohal3! I'm here to help you with any bugs, questions, or contributions you have. Let's work through this together!
To resolve the `ValueError` related to `args.eval_strategy` when the evaluation dataset is passed as `None`, ensure that the `evaluator` is set to `None` when no validation dataset is provided. The `SentenceTransformersFinetuneEngine` class handles this by setting `evaluator` to `None` if `val_dataset` is `None`. Make sure you are not passing any evaluation-related arguments that require an evaluator when `val_dataset` is `None`.
Here's the relevant part of the code that handles this:
evaluator: Optional[InformationRetrievalEvaluator] = None
if val_dataset is not None:
    evaluator = InformationRetrievalEvaluator(
        val_dataset.queries, val_dataset.corpus, val_dataset.relevant_docs
    )
self.evaluator = evaluator
When calling the `finetune` method, do not pass parameters that depend on the evaluator being present if `val_dataset` is `None`. The `finetune` method will handle the absence of an evaluator correctly as long as no evaluation steps are required [1].
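The "no evaluation steps are required" condition can be illustrated with a simplified stand-in for the check that the `ValueError` describes. This is a sketch only: `check_eval_config` is a hypothetical helper written for this example, not the actual code in any library, and its condition is inferred from the wording of the error message.

```python
from typing import Optional, Sequence


def check_eval_config(eval_strategy: str, eval_dataset: Optional[Sequence]) -> None:
    """Hypothetical stand-in for the trainer-side check (not library code)."""
    # Assumption: evaluation is considered "requested" for any strategy other than "no".
    if eval_strategy != "no" and eval_dataset is None:
        raise ValueError(
            f"You have set `args.eval_strategy` to {eval_strategy} but you didn't "
            "pass an `eval_dataset` to `Trainer`. Either set `args.eval_strategy` "
            "to `no` or pass an `eval_dataset`."
        )


# No evaluation requested and no dataset: passes silently.
check_eval_config("no", None)

# Step-wise evaluation requested without a dataset: raises.
try:
    check_eval_config("steps", None)
except ValueError as err:
    print(err)
```

Under this reading, the error means that something set the evaluation strategy to `steps` while the dataset slot stayed empty, so the two settings need to be made consistent in one direction or the other.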
Bug Description
I am on llama-index 0.11.22 and llama-index-finetuning 0.2.1. I was attempting to follow the documentation to finetune the BAAI/bge-small-en-v1.5 model on my own dataset. I tried two different approaches from the llama-index documentation:
Approach 1:
Approach 2:
However, in both cases I get the following error:
ValueError: You have set `args.eval_strategy` to steps but you didn't pass an `eval_dataset` to `Trainer`. Either set `args.eval_strategy` to `no` or pass an `eval_dataset`.
Looking at the source code, I see that in the Sentence-Transformers fit-mixin the eval dataset is passed in as `None`, which may cause the issue. I double-checked the format of my train and val JSON files, and those are correct. I do want to perform the finetuning with validation, so any help would be much appreciated!
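For anyone who wants to repeat the format check, here is a minimal sketch of what it could look like. It assumes the JSON has top-level `queries`, `corpus`, and `relevant_docs` mappings, with `relevant_docs` mapping each query id to a list of corpus ids; that layout is inferred from the attributes used in the snippet above and may not match every dataset version. `find_format_problems` is a hypothetical helper name.

```python
import json
from typing import Any, Dict, List


def find_format_problems(data: Dict[str, Any]) -> List[str]:
    """Return human-readable problems; an empty list means the basic shape looks fine."""
    problems = []
    # Assumed layout: three non-empty top-level mappings.
    for key in ("queries", "corpus", "relevant_docs"):
        if not isinstance(data.get(key), dict) or not data[key]:
            problems.append(f"'{key}' is missing, empty, or not a mapping")
    if problems:
        return problems
    # Every id referenced in relevant_docs should exist in queries / corpus.
    for query_id, doc_ids in data["relevant_docs"].items():
        if query_id not in data["queries"]:
            problems.append(f"relevant_docs refers to unknown query id {query_id!r}")
        for doc_id in doc_ids:
            if doc_id not in data["corpus"]:
                problems.append(f"query {query_id!r} points at unknown corpus id {doc_id!r}")
    return problems


sample = json.loads(
    '{"queries": {"q1": "What is fine-tuning?"},'
    ' "corpus": {"d1": "Fine-tuning adapts a pretrained model."},'
    ' "relevant_docs": {"q1": ["d1"]}}'
)
print(find_format_problems(sample))  # prints []
```

A clean result here only rules out a malformed file; it does not explain why the eval dataset ends up as `None` further down the stack.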
Version
0.11.22
Steps to Reproduce
Run either of the above approaches and the error should occur. Also attempting to run the code present in the documentation here and here should result in the same error.
Relevant Logs/Tracebacks