FlagOpen / FlagEmbedding

Retrieval and Retrieval-augmented LLMs

Questions about finetune_for_layerwise #819

Open aagq opened 3 months ago

aagq commented 3 months ago
  1. What are the maximum lengths of the query and message when you train the model bge-reranker-v2-minicpm-layerwise?
  2. Why are these settings used in data.py? a. (image) b. (image)
545999961 commented 3 months ago

These two parameters set the maximum lengths of the query and the passage, respectively. Typically, when training a reranker, the query and passage are concatenated directly, and if the combined length exceeds the specified limit, only the passage is truncated. Because the query can be very long while the passage is short, we truncate the two sides separately instead. And since one side may be long while the other is short, we allow each side a little extra length so that enough of its information is preserved.
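
To make the difference concrete, here is a minimal sketch of the two truncation strategies on plain token lists. It is not the code from data.py: the function names, the `extra` slack parameter, and the final cap on the combined length are all assumptions made for illustration, and the limits in the example are arbitrary.

```python
from typing import List, Tuple

Tokens = List[int]


def truncate_passage_only(query: Tokens, passage: Tokens,
                          total_max: int) -> Tuple[Tokens, Tokens]:
    """Concatenate-then-truncate: only the passage is shortened."""
    room_for_passage = max(total_max - len(query), 0)
    return query, passage[:room_for_passage]


def truncate_separately(query: Tokens, passage: Tokens,
                        query_max_len: int, passage_max_len: int,
                        extra: int = 0) -> Tuple[Tokens, Tokens]:
    """Trim each side to its own budget plus some slack (hypothetical `extra`),
    then cap the pair at query_max_len + passage_max_len (an assumption)."""
    q = query[:query_max_len + extra]
    p = passage[:passage_max_len + extra]
    # If both sides used their slack, shorten the passage to fit the total.
    overflow = len(q) + len(p) - (query_max_len + passage_max_len)
    if overflow > 0:
        p = p[:max(len(p) - overflow, 0)]
    return q, p


if __name__ == "__main__":
    long_query = list(range(12))        # 12 query tokens
    short_passage = list(range(100, 103))  # 3 passage tokens

    q1, p1 = truncate_passage_only(long_query, short_passage, total_max=10)
    q2, p2 = truncate_separately(long_query, short_passage,
                                 query_max_len=4, passage_max_len=6, extra=2)
    print(len(q1), len(p1))
    print(len(q2), len(p2))
```

With a long query and a short passage, the first strategy leaves no room for the passage at all, while the second keeps the whole passage and lets the query borrow the slack the passage did not need.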