Closed pkubgmlixs closed 1 month ago

Hi there, I wonder whether there is any limit on the maximum supported input sequence length that you set when you pre-trained the LLM and fine-tuned it for specific tasks? I haven't dug into your code yet. Thanks!

Hi,
the maximum supported length during pre-training and fine-tuning was set to 1022.

Thank you for your support, retiro. I would like to use the pre-trained language model you've developed for a specific fine-tuning task. Does this mean that it can only process sequences of up to 1022 tokens? Does the limitation in your training process also apply to downstream applications? Thanks!

We use Rotary Positional Embedding (RoPE) instead of learned positional embeddings, so RiNALMo is able to process sequences of any length (as long as you have a GPU with enough memory).

Thank you, RJPenic!
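To illustrate the RoPE point above: because the rotation is computed on the fly from the position index, there is no learned position table and hence no hard length cap. The sketch below is not taken from the RiNALMo codebase; it is a minimal NumPy illustration assuming the common half-split channel-pairing convention, with the default base of 10000 from the RoFormer paper.

```python
import numpy as np

def rope(x, base=10000.0):
    """Apply Rotary Positional Embedding to x of shape (seq_len, dim).

    Each position rotates pairs of channels by a position-dependent
    angle; nothing here depends on a maximum sequence length.
    """
    seq_len, dim = x.shape
    half = dim // 2
    # Per-pair rotation frequencies (geometric progression over channels).
    freqs = base ** (-np.arange(half) / half)          # (half,)
    angles = np.outer(np.arange(seq_len), freqs)       # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    # Rotate each (x1, x2) channel pair by its angle.
    return np.concatenate([x1 * cos - x2 * sin,
                           x1 * sin + x2 * cos], axis=-1)

# Works for any sequence length -- 10 or 5000 tokens alike.
short = rope(np.random.randn(10, 64))
long_ = rope(np.random.randn(5000, 64))
print(short.shape, long_.shape)  # (10, 64) (5000, 64)
```

In practice the only constraint is GPU memory, as noted in the thread, though sequences much longer than those seen in pre-training may still see some quality degradation.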