[Feature Request]: Can I check the loss when fine-tuning an Embedding model?

torvalds-dev commented 10 months ago

Feature Description

I was using finetune_embedding.ipynb to fine-tune an Embedding model, but found no way to get the loss curve, which should include both training loss and validation loss. How can I get them? Thanks.

Reason

No response

Value of Feature

No response

torvalds-dev commented 10 months ago

The user is requesting a feature to check the loss when fine-tuning an Embedding model. They were using the finetune_embedding.ipynb notebook but found no way to return a loss curve, which should include both training loss and validation loss. They are asking how they can get these values.

The files relevant to this issue are:

llama_index/embeddings/__init__.py: This file contains the initialization of various embedding models. The user might be using one of these models for fine-tuning.
llama_index/finetuning/gradient/base.py: This file contains the GradientFinetuneEngine class, which is used for fine-tuning models. It might be possible to modify this class to return the loss values during fine-tuning.
llama_index/token_counter/mock_embed_model.py: This file contains a mock embedding model. It's unclear if this is directly related to the user's issue, but it might be useful for testing any changes.
llama_index/callbacks/finetuning_handler.py: This file contains callback handlers for fine-tuning, including OpenAIFineTuningHandler and GradientAIFineTuningHandler. It may be possible to modify these handlers to track and return loss values during fine-tuning.

Based on these files, the following actions could be taken:

Investigate the GradientFinetuneEngine class in llama_index/finetuning/gradient/base.py to see if it's possible to modify it to return loss values during fine-tuning.
Look into the callback handlers in llama_index/callbacks/finetuning_handler.py to see if they can be modified to track and return loss values.
If necessary, create a new callback handler specifically for tracking loss during fine-tuning.
Test any changes using the mock embedding model in llama_index/token_counter/mock_embed_model.py.
Update the finetune_embedding.ipynb notebook to demonstrate how to use the new feature.
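
As a rough illustration of the "new callback handler" idea above, here is a minimal, library-independent sketch. Everything in it is an assumption made for illustration: LossRecorder, on_train_step, on_validation, and the toy train loop are hypothetical names, not part of llama_index. A real integration would call the recorder from whatever training loop the fine-tuning engine actually runs.

```python
from dataclasses import dataclass, field
from typing import List, Sequence, Tuple

Sample = Tuple[float, float]  # (input x, target y)


@dataclass
class LossRecorder:
    """Hypothetical callback that stores (step, loss) pairs during training."""

    train_losses: List[Tuple[int, float]] = field(default_factory=list)
    val_losses: List[Tuple[int, float]] = field(default_factory=list)

    def on_train_step(self, step: int, loss: float) -> None:
        self.train_losses.append((step, loss))

    def on_validation(self, step: int, loss: float) -> None:
        self.val_losses.append((step, loss))


def mse(w: float, b: float, data: Sequence[Sample]) -> float:
    """Mean squared error of the line y = w * x + b over a dataset."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def train(
    train_data: Sequence[Sample],
    val_data: Sequence[Sample],
    recorder: LossRecorder,
    epochs: int = 50,
    lr: float = 0.05,
) -> Tuple[float, float]:
    """Toy SGD loop (a stand-in for a real fine-tuning loop) that reports losses."""
    w, b = 0.0, 0.0
    step = 0
    for _ in range(epochs):
        for x, y in train_data:
            err = w * x + b - y
            # Report the per-sample training loss before applying the update.
            recorder.on_train_step(step, err ** 2)
            w -= lr * 2 * err * x
            b -= lr * 2 * err
            step += 1
        # Report the validation loss once per epoch.
        recorder.on_validation(step, mse(w, b, val_data))
    return w, b


# Toy data drawn from y = 2x + 1; stands in for real training/validation sets.
train_data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
val_data = [(0.5, 2.0), (1.5, 4.0)]

recorder = LossRecorder()
train(train_data, val_data, recorder)
print("first/last validation loss:", recorder.val_losses[0][1], recorder.val_losses[-1][1])
```

The two lists in recorder can then be plotted as the training and validation loss curves. Keeping the recorder separate from the training loop means the same object could be passed to a different engine, provided that engine exposes a per-step or per-evaluation hook.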

torvalds-dev / llama_index