Closed — tiadams closed this 2 weeks ago
Now it can be used from Hugging Face with the model ID "meta-llama/Meta-Llama-3-8B-Instruct".
We should test whether this performs better than MPNet (and also the runtime trade-off), and replace the MPNet default if it produces better results.
Access to the meta-llama/Meta-Llama-3-8B-Instruct model is restricted. Also, it is no longer labeled as "sentence similarity" but as "text generation". McGill has an alternative model, McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp-supervised, but we would need to write a new adapter, as this model does not use the SentenceTransformer module.
In general, we need a well-performing model that can map from other languages (e.g. German) to the English terminologies we are using.
It doesn't need to be Llama specifically, but we need some sort of alternative to MPNet, since it can only handle English-to-English matching.
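Since a model like LLM2Vec doesn't ship a SentenceTransformer module, any replacement would have to satisfy whatever embedding interface the pipeline codes against. A minimal sketch of what such an adapter interface could look like (all names here are hypothetical illustrations, not our actual code; the dummy backend just counts characters so the example runs without downloading a model):

```python
from typing import List, Protocol


class EmbeddingAdapter(Protocol):
    """Hypothetical interface the pipeline could target, so that
    SentenceTransformer- and LLM2Vec-backed models are interchangeable."""

    def embed(self, texts: List[str]) -> List[List[float]]:
        ...


class DummyAdapter:
    """Toy stand-in backend: embeds each text as a 26-dim
    character-frequency vector (for illustration only)."""

    def embed(self, texts: List[str]) -> List[List[float]]:
        import string
        return [
            [float(t.lower().count(c)) for c in string.ascii_lowercase]
            for t in texts
        ]


adapter: EmbeddingAdapter = DummyAdapter()
vecs = adapter.embed(["blood pressure", "Blutdruck"])
print(len(vecs), len(vecs[0]))  # 2 26
```

A real LLM2Vec adapter would implement the same `embed` method by wrapping the model's own encoding call, keeping the rest of the pipeline unchanged.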
I am looking into it. For now, we can test FremyCompany/BioLORD-2023-M, as it can handle semantic similarity in a multilingual context. It also does not require a new adapter.
FremyCompany/BioLORD-2023-M could not handle our CDM for an unknown reason. For the other models I found that work with our adapter, I re-ran the workflow I had for the translated BIOFIND dictionary (145 variables). I included the models we tested previously for comparison. Here are the results:
| Model | English to German | German to English | Average |
|---|---|---|---|
| text-embedding-3-large | 0.46 | 0.53 | 0.50 |
| sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 | 0.19 | 0.28 | 0.24 |
| sentence-transformers/distiluse-base-multilingual-cased-v1 | 0.23 | 0.23 | 0.23 |
| sentence-transformers/distiluse-base-multilingual-cased-v2 | 0.15 | 0.25 | 0.20 |
| sentence-transformers/all-mpnet-base-v2 | 0.12 | 0.19 | 0.16 |
| FremyCompany/BioLORD-2023 | 0.14 | 0.13 | 0.14 |
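For context on what these cross-lingual scores measure: matching a translated variable against the English terminology typically reduces to cosine similarity between embeddings, taking the highest-scoring terminology entry per query. A toy sketch with made-up 3-dimensional vectors (illustration only, not our actual pipeline or embeddings):

```python
import numpy as np


def best_match(query_vec: np.ndarray, term_vecs: np.ndarray) -> int:
    """Return the index of the terminology entry with the highest
    cosine similarity to the query embedding."""
    q = query_vec / np.linalg.norm(query_vec)
    t = term_vecs / np.linalg.norm(term_vecs, axis=1, keepdims=True)
    return int(np.argmax(t @ q))


# Made-up embeddings for two English terminology entries.
english_terms = np.array([
    [0.9, 0.1, 0.0],   # e.g. "blood pressure"
    [0.0, 0.8, 0.2],   # e.g. "heart rate"
])

# Made-up embedding for a German query, e.g. "Blutdruck".
german_query = np.array([0.85, 0.15, 0.05])

print(best_match(german_query, english_terms))  # 0
```

A model scores well on this kind of benchmark when its German and English embeddings of the same concept land close together; an English-only model like all-mpnet-base-v2 has no such guarantee, which matches the low numbers above.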
Likewise, I tested the new models with our harmonization workflow:
| Model | Average |
|---|---|
| text-embedding-3-large | 0.77 |
| FremyCompany/BioLORD-2023 | 0.76 |
| sentence-transformers/all-mpnet-base-v2 | 0.73 |
| sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 | 0.72 |
| sentence-transformers/distiluse-base-multilingual-cased-v1 | 0.61 |
| sentence-transformers/distiluse-base-multilingual-cased-v2 | 0.60 |
Thanks for the research, great work. Based on this, I think we should probably switch to the text-embedding-3-large model, since it appears to perform best on both metrics.
Just saw this is OpenAI-based; the best trade-off then is probably sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2, what do you think?