Open deepfree2023 opened 1 week ago
Thanks, I read this before using the models.
But the actual performance of the models seems quite different from the description.
In my tests, the -ft ones always seem to produce worse results than the non-ft ones on every task.
I have heard that one possible reason the -ft variants are worse is that they are finetuned to do well on benchmarks. I really don't know how true that is, though.
Hi, is there any guidance on how to choose among the 4 models?