Adding Benchmarking Script

ivanleomk commented 8 months ago

Adding a slightly messy benchmarking script for comparing Cohere against an open-source model.

	:rocket: This PR description was created by Ellipsis for commit 997793fced4c36ecd7a2a9c3916694ceedc004d0.

Summary:

This PR adds a benchmarking script to compare the performance of the 'cohere' model against an open-source model, and includes updates to the Readme file and new helper files.

Key points:

- Added a new benchmarking script embed.py.
- The script includes functions to download the model, determine if the model has generated embeddings, and embed the dataset.
- The script also includes a function to generate embeddings and validate the dataset.
- Modified the Readme.md file to reflect these changes.
- Added a new file finetune.py for fine-tuning the model.
- Added new helper files cache.py and models.py.
- Added new files process.py and visualise.py for processing and visualising the results.
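
For readers who want a feel for the "has the model already generated embeddings" check mentioned in the key points, here is a rough sketch of one way it could work. It is not the code in embed.py or cache.py: the function names, the JSON-per-model cache layout, and the `embed_fn` callable are all assumptions made for illustration.

```python
import json
from pathlib import Path
from typing import Callable, List


def cache_path(cache_dir: Path, model_name: str) -> Path:
    # Hypothetical layout: one JSON file per model.
    # Model names such as "BAAI/bge-base-en-v1.5" contain "/", so flatten them.
    return cache_dir / (model_name.replace("/", "__") + ".json")


def has_embeddings(cache_dir: Path, model_name: str) -> bool:
    # True if a previous run already stored embeddings for this model.
    return cache_path(cache_dir, model_name).exists()


def embed_dataset(
    cache_dir: Path,
    model_name: str,
    texts: List[str],
    embed_fn: Callable[[str], List[float]],
) -> List[List[float]]:
    # Reuse cached vectors when present; otherwise embed and store them.
    # embed_fn is a stand-in (assumption) for whatever model call is used.
    path = cache_path(cache_dir, model_name)
    if path.exists():
        return json.loads(path.read_text())
    vectors = [embed_fn(text) for text in texts]
    cache_dir.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(vectors))
    return vectors
```

With a layout like this, re-running the benchmark only pays the embedding cost for models that have no cache file yet, which matters when one of the models is behind a paid API.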

Generated with :heart: by ellipsis.dev

jxnl commented 8 months ago

And let's not call out Cohere directly in the plot; just say "closed source embeddings".

ivanleomk commented 8 months ago

I obtained the following results when I ran the script:

Model Name	AUC
sentence-transformers/gtr-t5-large	0.93892
embed-multilingual-v3.0	0.938904
llmrails/ember-v1	0.937499
infgrad/stella-base-en-v2	0.934832
BAAI/bge-base-en-v1.5	0.931893
thenlper/gte-large	0.93085
text-embeddings-ada-v2	0.928656
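
For context on how an AUC figure like the ones above can be produced from embeddings, here is a minimal sketch. The thread does not say how the script computes AUC, so this assumes a common setup: each example is a pair of texts with a binary label (1 = match, 0 = no match), the pair is scored by the cosine similarity of its two embeddings, and AUC is the ROC AUC of those scores. The function names are hypothetical, and the pairwise loop is quadratic, so it is only meant for small illustrations.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    # Cosine of the angle between two embedding vectors.
    # Assumes neither vector is all zeros.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    # ROC AUC as the probability that a randomly chosen positive pair
    # scores higher than a randomly chosen negative pair (ties count half).
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

Under this reading, a score near 0.94 would mean that a matching pair outranks a non-matching pair about 94% of the time, so the gaps between the models in the table are small.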

567-labs / fastllm