HazyResearch / hyena-dna

Official implementation for HyenaDNA, a long-range genomic foundation model built with Hyena
https://arxiv.org/abs/2306.15794
Apache License 2.0

Need to swap layer norm op for Triton-based layer norm? #57

Open · ankitvgupta opened this issue 6 months ago

ankitvgupta commented 6 months ago

In the flash-attention repo here, there is now a note that the fused CUDA layer norm op has been replaced with a Triton-based op.

In light of that, is it now reasonable to remove the suggestion to pip install the fused layer norm op from the dependencies section of this README?

ankitvgupta commented 6 months ago

It looks like on this line, you check whether the custom layer norm op is installed, and if so, this param is set to true. Following the call stack, that param is forwarded into the flash-attention package, whose implementation here has since moved to a Triton implementation.
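
For reference, a minimal sketch of the pattern being described, with illustrative names rather than the exact hyena-dna identifiers: availability of the fused op is detected at import time and used to set the flag that the flash-attention block consumes.

```python
# Sketch only: the exact symbols in hyena-dna / flash-attention may differ.
try:
    # Fused CUDA layer norm op shipped with older flash-attention builds
    from flash_attn.ops.layer_norm import dropout_add_layer_norm
except ImportError:
    dropout_add_layer_norm = None

# Flag passed down into the flash-attention block; only enabled when the
# fused op is actually importable.
fused_dropout_add_ln = dropout_add_layer_norm is not None
```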

However, later in the original HyenaDNA code, we are still using the non-Triton function. Does that need to be swapped out?

Relevant flash-attention commit: https://github.com/Dao-AILab/flash-attention/commit/abbc1311731867310635f9edc2a9ec18317c8c48
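
If the swap is needed, the change would presumably look something like the sketch below: prefer the Triton-based kernel when the installed flash-attention exposes it, and fall back to the older fused CUDA op otherwise. The module path `flash_attn.ops.triton.layer_norm` and the name `layer_norm_fn` are assumptions based on recent flash-attention layouts and should be checked against the pinned version.

```python
# Hedged sketch; module paths should be verified against the installed
# flash-attention version.
try:
    # Newer flash-attention: Triton-based layer norm
    from flash_attn.ops.triton.layer_norm import layer_norm_fn
except ImportError:
    layer_norm_fn = None

try:
    # Older flash-attention: fused CUDA extension
    from flash_attn.ops.layer_norm import dropout_add_layer_norm
except ImportError:
    dropout_add_layer_norm = None

# Enable the fused path if either implementation is available.
fused_dropout_add_ln = (layer_norm_fn is not None) or (dropout_add_layer_norm is not None)
```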

ankitvgupta commented 6 months ago

In case the answer is yes, I think this should do it: https://github.com/HazyResearch/hyena-dna/pull/58