netease-youdao / BCEmbedding

Netease Youdao's open-source embedding and reranker models for RAG products.
Apache License 2.0
1.3k stars 85 forks

Natively supported passage length of bce-reranker-base_v1 #53

Closed. dmortem closed this issue 2 months ago

dmortem commented 2 months ago

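Hello,

I noticed that the base model used by bce-reranker-base_v1 only supports inputs of up to 512 tokens (the position embeddings limit the length). Passages longer than 512 are handled by splitting the long passage into multiple chunks, scoring each chunk separately, and taking the max. I'm concerned that this approach can still lose some of the original semantics of the long text. Is there a way to make the model natively support passage inputs longer than 512?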
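
For readers unfamiliar with the chunk-and-max scheme described in the question, here is a minimal sketch of the idea in Python. It is not the BCEmbedding implementation: `split_into_chunks`, `score_long_passage`, and `toy_score` are hypothetical names, the passage is assumed to be pre-tokenized, and `toy_score` is a stand-in for a real cross-encoder call. The `overlap` parameter is also an assumption added for illustration.

```python
from typing import Callable, List, Sequence


def split_into_chunks(tokens: Sequence[str], max_len: int, overlap: int = 0) -> List[List[str]]:
    """Split a token sequence into windows of at most max_len tokens.

    Consecutive windows share `overlap` tokens (hypothetical parameter).
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must satisfy 0 <= overlap < max_len")
    if len(tokens) <= max_len:
        return [list(tokens)]
    stride = max_len - overlap
    chunks = []
    for start in range(0, len(tokens), stride):
        chunks.append(list(tokens[start:start + max_len]))
        if start + max_len >= len(tokens):
            break  # the last window already reaches the end of the passage
    return chunks


def score_long_passage(
    query_tokens: Sequence[str],
    passage_tokens: Sequence[str],
    score_fn: Callable[[Sequence[str], Sequence[str]], float],
    max_len: int = 512,
    overlap: int = 0,
) -> float:
    """Score each chunk against the query and return the maximum score.

    Note: in a real cross-encoder the 512-token limit also has to cover the
    query and special tokens, so the per-chunk budget would be smaller.
    """
    chunks = split_into_chunks(passage_tokens, max_len, overlap)
    return max(score_fn(query_tokens, chunk) for chunk in chunks)


def toy_score(query_tokens: Sequence[str], chunk_tokens: Sequence[str]) -> float:
    """Stand-in scorer: fraction of query tokens that appear in the chunk."""
    if not query_tokens:
        return 0.0
    chunk_set = set(chunk_tokens)
    return sum(t in chunk_set for t in query_tokens) / len(query_tokens)


if __name__ == "__main__":
    query = ["x", "y"]
    passage = ["a", "a", "a", "x", "y", "b", "b", "b"]
    print(score_long_passage(query, passage, toy_score, max_len=4, overlap=0))
    print(score_long_passage(query, passage, toy_score, max_len=4, overlap=1))
```

In this toy example the relevant tokens `x` and `y` straddle a chunk boundary, so with `overlap=0` the best chunk matches only half of the query (score 0.5), while `overlap=1` produces a window containing both (score 1.0). This is one concrete form of the concern raised above: evidence split across chunks is never seen together by the scorer.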

shenlei1020 commented 2 months ago

In practice, the method BCE-reranker uses to rerank long passages is more effective and efficient than a long-context input model. You are welcome to test other long-context input models and share your experience.