Hello!
In the LLM2Vec paper, the MNTP and SimCSE stages are each trained for only 1000 steps. Would training for more steps improve performance, or would the gain be minimal? Also, since the base model is English-centric, would you expect training for more than 1000 steps to help when adapting it to other languages? Your answer would be a great help!