HazyResearch / safari

Convolutions for Sequence Modeling
Apache License 2.0
848 stars 70 forks source link

H3 / LongConvKernel - l_max=None isn't working #2

Closed grazder closed 9 months ago

grazder commented 1 year ago

In H3 model here noted that

https://github.com/HazyResearch/safari/blob/1ff064eb43a455e217449314dfbbd83dcac7871c/src/models/sequence/h3_conv.py#L36

But it isn't working, because of https://github.com/HazyResearch/safari/blob/1ff064eb43a455e217449314dfbbd83dcac7871c/src/models/sequence/long_conv_kernel.py#L56

Which is leading to torch.randn(int, int, None) error:

TypeError: randn(): argument 'size' must be tuple of ints, but found element of type NoneType at pos 3

So, are global kernels supported now? Or, to make it more global, should I use large l_max value?

DanFu09 commented 1 year ago

The H3Conv replaces the SSM with a fixed-length long convolution the size of the input, so you need to set the l_max.

The SSM version of H3 has no fixed context length, so that’s why it’s not necessary for that case.

On Thu, Mar 9, 2023 at 9:08 AM Alexey Korepanov @.***> wrote:

In H3 model here noted that

https://github.com/HazyResearch/safari/blob/1ff064eb43a455e217449314dfbbd83dcac7871c/src/models/sequence/h3_conv.py#L36

But it isn't working, because of https://github.com/HazyResearch/safari/blob/1ff064eb43a455e217449314dfbbd83dcac7871c/src/models/sequence/long_conv_kernel.py#L56

Which is leading to torch.randn(int, int, None) error:

TypeError: randn(): argument 'size' must be tuple of ints, but found element of type NoneType at pos 3

So, are global kernels supported now? Or, to make it more global, should I use the large l_max value?

— Reply to this email directly, view it on GitHub https://github.com/HazyResearch/safari/issues/2, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABDDIISS53RDBV3INEJGOATW3HI3HANCNFSM6AAAAAAVVDARFQ . You are receiving this because you are subscribed to this thread.Message ID: @.***>

grazder commented 1 year ago

ok thank! it would be clearer if there was an assert or something like that. this would be helpful for users like me who use just the model implementation without any other framework things

DanFu09 commented 1 year ago

Thanks, this feedback makes sense! Will add some more notes and documentation next week (currently traveling).

On Thu, Mar 9, 2023 at 4:04 PM Alexey Korepanov @.***> wrote:

ok, it would be clearer if there was an assert or something like that. this would be helpful for users like me who use just the model implementation without any other framework things

— Reply to this email directly, view it on GitHub https://github.com/HazyResearch/safari/issues/2#issuecomment-1462697255, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABDDIITA4YQFTXTWTM2FXKTW3IZWJANCNFSM6AAAAAAVVDARFQ . You are receiving this because you commented.Message ID: @.***>