-
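### Conclusion
Model selection advice:
1. Most models have a maximum sequence length of 512 tokens. For 8192 tokens, try tao-8k; for 1024 tokens, try stella.
2. On specialized-domain data, embedding models perform worse than BM25, but fine-tuning can greatly improve results.
3. If you need fine-tuning and have little experience with model training, the bge series is recommended (complete training scripts, negative-example mining, etc.). That said, most models are based on BERT and the training scripts are largely interchangeable, so other models can also be considered.…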
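
Since point 2 uses BM25 as the baseline, here is a minimal sketch of Okapi BM25 scoring that could be used to reproduce that kind of comparison. The function name `bm25_scores`, the whitespace tokenization, the toy corpus, and the parameter values `k1=1.5` and `b=0.75` are illustrative assumptions, not taken from the original issue.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, corpus_tokens, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25.

    k1 and b are common default values, assumed here for illustration.
    """
    n_docs = len(corpus_tokens)
    avg_len = sum(len(doc) for doc in corpus_tokens) / n_docs

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for doc in corpus_tokens:
        doc_freq.update(set(doc))

    scores = []
    for doc in corpus_tokens:
        term_freq = Counter(doc)
        score = 0.0
        for term in query_tokens:
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            df = doc_freq[term]
            # Smoothed IDF variant that stays non-negative.
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


# Toy corpus (hypothetical), tokenized by whitespace.
corpus = [
    "bge models ship training scripts".split(),
    "bm25 is a strong lexical baseline".split(),
    "stella supports longer sequences".split(),
]
scores = bm25_scores("lexical baseline".split(), corpus)
best = max(range(len(corpus)), key=scores.__getitem__)
print(best, scores)
```

Ranking the same queries with an embedding model and comparing against these scores is one way to check whether the claim holds on a given domain dataset. For Chinese text, a proper word segmenter would be needed in place of whitespace splitting.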
-
I want to apply reranking to the MSMT17 dataset, but the algorithm consumes too much memory. Is there a way to use less memory while achieving similar results? I have limited resources availabl…
-
Hello! Your reranking is good work! However, there is a problem with your usage of q_q_dist. Without q_q_dist, it seems that your reranking performance drops a lot. I ran the experiments below:
| mAP of No…
-
**Is your feature request related to a problem?**
We're trying to put a bunch of local model types in ml-commons (#1164). One such type is a [cross-encoder](https://www.sbert.net/examples/application…
-
Why don't you fine-tune the cross encoder for the tool retrieval task? Have you tried it? I have tried fine-tuning the cross encoder for the tool reranking task, but it performs very poorly, mostly re…
-
```
Disclaimer:
The exact implementation details of the proposal are for the team to review. The purpose of the proposal is to pitch an idea which closes the data->qna->knowledge->finetune loop. I w…
-
### Self Checks
- [X] I have searched for existing issues [search for existing issues](https://github.com/langgenius/dify/issues), including closed ones.
- [X] I confirm that I am using English to su…
-
# Cross-Encoders as reranking
## WHAT
Reranking the retrieved list of candidates from Weaviate based on cross-encoder model scores
## WHY
Because combining cross-encoders with bi-encoders can impr…
-
**Is your feature request related to a problem? Please describe.**
We want to be able to re-rank and/or post process the results obtained through external search.
**Describe the solution you'd li…
-
Thanks for your great work!
Would you please show me the performance of OSNet with reranking? As you know, in many cases rank1 and mAP improve with reranking; how about OSNet? I can…