Closed · leoozy closed this issue 2 years ago
Hi,
This is because most of the datasets we used have sequence lengths below 32 tokens. For efficiency, we set the max length to 32.
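The trade-off behind that choice can be illustrated with a quick length check over a corpus. This is a minimal sketch with toy data and whitespace tokenization standing in for the real subword tokenizer; the `fraction_within_budget` helper is hypothetical, not from the repository:

```python
# Measure how much of a corpus fits within a token budget.
# Whitespace tokenization is a stand-in for the model's real tokenizer.

def fraction_within_budget(sentences, max_len=32):
    """Return the fraction of sentences with at most max_len tokens."""
    fits = sum(1 for s in sentences if len(s.split()) <= max_len)
    return fits / len(sentences)

corpus = [
    "short sentence",
    "a somewhat longer sentence that still fits the budget",
    " ".join(["token"] * 50),  # an outlier longer than 32 tokens
]

# Most sentences fit the budget; the rare outlier is truncated at train time.
print(fraction_within_budget(corpus))
```

If most of the corpus fits under the budget, raising the max length mainly buys memory cost for the few outliers, which is the efficiency argument made above.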
Hello, actually we can modify the max_seq_len. I tried configuring it with Transformers, but I got a CUDA out-of-memory error with sentence_transformers when I increased the max_seq_len.
If you encounter a CUDA out-of-memory error, you should decrease the sequence length or the batch size.
For the unsupervised version, is the max_seq_len at inference time still confined to 32, as in training? If not, is there a truncation strategy for the extremely long queries that appear in the wiki training dataset?
If the sentence is longer than the set max length, it will be truncated.
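That truncation can be sketched as follows. This is a minimal illustration of BERT-style truncation that keeps the leading `[CLS]` and trailing `[SEP]` tokens; the token ids and the `truncate_ids` helper are assumptions for the example, not the repository's actual code:

```python
CLS, SEP = 101, 102  # BERT-style special token ids (assumed for illustration)

def truncate_ids(token_ids, max_seq_len=32):
    """Truncate a [CLS] ... [SEP] sequence to max_seq_len, keeping both special tokens."""
    if len(token_ids) <= max_seq_len:
        return token_ids
    # Cut the middle content and re-append the closing [SEP].
    return token_ids[: max_seq_len - 1] + [SEP]

# A 62-token input is cut down to exactly 32 tokens.
ids = [CLS] + list(range(1000, 1060)) + [SEP]
out = truncate_ids(ids)
print(len(out))  # 32
```

Anything past the budget is simply dropped, so very long queries lose their tails rather than causing an error.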
If you encounter a CUDA out-of-memory error, you should decrease the sequence length or the batch size.
Yes, you are right. But in my case a max length of 256 for the input is a hard requirement, so I cannot decrease it. As for the batch size, I tried one similar pair per batch, but I still ran out of memory.
pick a smaller model
@TrieuLe0801 Yeah you should consider using a smaller model or a larger GPU.
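The advice above follows from how attention memory scales. A rough back-of-the-envelope sketch (the head count and fp32 assumption are illustrative defaults, not measured from this model):

```python
def attention_activation_bytes(batch, seq_len, num_heads=12, bytes_per_float=4):
    """Rough size of one layer's attention score matrix:
    batch x heads x seq_len x seq_len floats."""
    return batch * num_heads * seq_len * seq_len * bytes_per_float

# At a fixed batch size, doubling seq_len quadruples the score matrix,
# since attention scores grow quadratically with sequence length.
a = attention_activation_bytes(batch=64, seq_len=32)
b = attention_activation_bytes(batch=64, seq_len=64)
print(b // a)  # 4
```

So going from 32 to 256 tokens multiplies this term by 64, which is why a fixed long sequence length may only fit with a smaller model (fewer heads/layers) or a larger GPU.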
Hello, I noticed that the max_sequence_length in your code is set to 32, but most sentences in English Wikipedia exceed 32 tokens. Why is the max_sequence_length 32? Thank you.