microsoft / torchscale

Foundation Architecture for (M)LLMs
https://aka.ms/GeneralAI
MIT License

Question about learnable segment lengths and dilation rates #102

Open benrousePUC opened 6 months ago

benrousePUC commented 6 months ago

Hi there,

I would like to use LongNet in a project that feeds numerical data into a transformer to predict numerical data. In my data, however, any point in the input sequence can be related to any other point, across the entire range of the input.

This means that segment lengths and dilation rates chosen by hand might not match the structure of the data. Is there a way to learn the best segment lengths and dilation rates, based on the dependencies the model finds in the input sequence?
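
To make the question concrete, here is a minimal sketch of how I understand a single fixed (segment length, dilation rate) pair to select positions, based on my reading of the LongNet paper. The function name `dilated_segment_indices` and its `offset` parameter are hypothetical and only for illustration; this is not torchscale's API or implementation.

```python
def dilated_segment_indices(seq_len, segment_length, dilation_rate, offset=0):
    """Return, per segment, the positions kept by one (segment_length, dilation_rate) pair.

    Assumption: the sequence is cut into consecutive segments of `segment_length`,
    and inside each segment every `dilation_rate`-th position is kept, starting
    at `offset % dilation_rate`. Positions only attend within their own group.
    """
    if segment_length <= 0 or dilation_rate <= 0:
        raise ValueError("segment_length and dilation_rate must be positive")
    shift = offset % dilation_rate
    groups = []
    for start in range(0, seq_len, segment_length):
        end = min(start + segment_length, seq_len)  # last segment may be shorter
        groups.append(list(range(start + shift, end, dilation_rate)))
    return groups


# Hand-chosen pair: segments of 4 positions, keep every 2nd position.
print(dilated_segment_indices(8, 4, 2))
# A different offset covers the positions skipped above.
print(dilated_segment_indices(8, 4, 2, offset=1))
```

With a single hand-picked pair like this, two positions that land in different groups never attend to each other directly, which is why I am asking whether the pairs themselves can be learned.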

Many thanks.