Open NimaBoscarino opened 2 years ago
I was also thinking about this, but have not yet found a good way to infer this from training automatically. Which similarity functions are usable can depend on quite a few factors.
But it could be added by hand to the config and, if it exists, be used in downstream functions.
Adding it by hand and reading it in downstream functions sounds like a great solution to me for now! I can open a PR for that.
SentenceTransformer models can have an associated `config_sentence_transformers.json` file (see this one for example) which contains information about the versions that were used when saving the model. It would be nice to extend the use of this file. Since the ideal scoring function (dot vs. cos_sim) is known at training time, it could be written into the `config_sentence_transformers.json` file. Then, some of the util methods (`semantic_search`, `paraphrase_mining`, etc.) could read directly from that file to choose the appropriate scoring function.
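A minimal sketch of how that lookup could work. The `similarity_fn_name` key is hypothetical (the actual key name would be settled in the PR), and the fallback to cosine similarity when the key or file is missing is just one possible default:

```python
import json
import math


def cos_sim(a, b):
    # Cosine similarity for two plain-Python vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def dot_score(a, b):
    # Plain dot product, preferable for models trained with dot-product loss.
    return sum(x * y for x, y in zip(a, b))


SCORE_FUNCTIONS = {"cos_sim": cos_sim, "dot": dot_score}


def load_score_function(config_path, default="cos_sim"):
    """Pick a scoring function based on config_sentence_transformers.json.

    "similarity_fn_name" is a hypothetical key; older configs (or missing
    files) fall back to the given default.
    """
    try:
        with open(config_path) as f:
            config = json.load(f)
    except FileNotFoundError:
        return SCORE_FUNCTIONS[default]
    name = config.get("similarity_fn_name", default)
    return SCORE_FUNCTIONS[name]
```

Utils like `semantic_search` could then accept an optional `score_function=None` parameter and call a helper like this when it is not explicitly set, so existing callers keep their current behavior.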