flozi00 closed this issue 1 year ago.
Although I think it is a good idea, I doubt they want to depend on our custom CUDA kernels, which would leave out the important parts of this library, namely CausalLinearAttention and ImprovedClusteredAttention.
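For context, here is a rough pure-PyTorch sketch of what CausalLinearAttention computes (the O(N) causal linear attention from "Transformers are RNNs"). It materializes the cumulative key-value sums explicitly, which is exactly the memory cost the custom CUDA kernel avoids, so treat it as illustrative only; the function names are made up for this sketch.

```python
# Illustrative reference (not the library's CUDA kernel) of causal linear
# attention: O(N) attention via running sums over the keys and values.
import torch

def elu_feature_map(x):
    # Positive feature map phi(x) = elu(x) + 1 used by linear attention.
    return torch.nn.functional.elu(x) + 1

def causal_linear_attention(q, k, v, eps=1e-6):
    # q, k: (batch, seq, heads, dim); v: (batch, seq, heads, dim_v)
    q, k = elu_feature_map(q), elu_feature_map(k)
    # Running sums S_i = sum_{j<=i} phi(k_j) v_j^T and z_i = sum_{j<=i} phi(k_j).
    kv = torch.einsum("nshd,nshm->nshdm", k, v).cumsum(dim=1)  # (N, S, H, D, M)
    z = k.cumsum(dim=1)                                        # (N, S, H, D)
    num = torch.einsum("nshd,nshdm->nshm", q, kv)
    den = torch.einsum("nshd,nshd->nsh", q, z).unsqueeze(-1) + eps
    return num / den
```

The custom kernel computes the same quantity without ever storing the full (seq, dim, dim_v) cumulative tensor, which is why a plain PyTorch port loses most of the benefit.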
@angeloskath, I don't think that assumption holds true anymore; it looks like there are several HF models with custom CUDA kernels implemented. See https://huggingface.co/docs/transformers/model_doc/yoso (a loading sketch is included below). I'd be interested in helping out with this.
What do you think about an implementation in the huggingface/transformers repo?
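On shipping kernels inside transformers: one common pattern (roughly what the YOSO model linked above appears to rely on, though this is a hedged sketch rather than its actual code) is to JIT-compile the CUDA sources at runtime with PyTorch's `torch.utils.cpp_extension.load`. The file names, module name, and exported function below are hypothetical placeholders.

```python
# Hedged sketch: JIT-compiling custom CUDA kernels from a Python modeling
# file using PyTorch's cpp_extension. Source files here are placeholders.
from torch.utils import cpp_extension

def load_causal_linear_kernels():
    # Compiles and loads the extension on first call; cached afterwards.
    return cpp_extension.load(
        name="causal_linear_attention_cuda",   # hypothetical module name
        sources=[
            "causal_product.cpp",              # hypothetical C++ binding
            "causal_product_cuda.cu",          # hypothetical CUDA kernel
        ],
        verbose=True,
    )

# kernels = load_causal_linear_kernels()
# out = kernels.causal_dot_product(q, k, v)    # hypothetical exported op
```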