openai / sparse_attention

Examples of using sparse attention, as in "Generating Long Sequences with Sparse Transformers"

PyTorch Implementation #6

Open · jerryyangli opened this issue 5 years ago

jerryyangli commented 5 years ago

Is it possible to release a PyTorch implementation of the method?

AdamDanielKing commented 5 years ago

The main dependency, OpenAI's blocksparse library, has no PyTorch bindings (https://github.com/openai/blocksparse/issues/2), so writing those bindings would be a starting point, unless an alternative to blocksparse has come out in the last 1.5 years.
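
For anyone who needs the attention pattern rather than the kernel speedups, the strided pattern from the paper can be emulated in plain PyTorch with a dense boolean mask. Here is a minimal sketch, assuming a single causal mask that merges the paper's local and strided heads; the helper names and stride handling are illustrative, not taken from this repo or from blocksparse:

```python
# Sketch: emulate the paper's "strided" sparse attention pattern with a
# dense mask in plain PyTorch. Same connectivity as block-sparse kernels,
# but none of their memory/compute savings.
import math
import torch
import torch.nn.functional as F

def strided_sparse_mask(seq_len: int, stride: int) -> torch.Tensor:
    """Boolean mask; True marks key positions each query may attend to.

    Merges the paper's two strided heads into one causal pattern:
    attend to the previous `stride` positions (local) and to every
    `stride`-th earlier position (strided).
    """
    i = torch.arange(seq_len).unsqueeze(1)  # query index, column vector
    j = torch.arange(seq_len).unsqueeze(0)  # key index, row vector
    causal = j <= i
    local = (i - j) < stride
    strided = (i - j) % stride == 0
    return causal & (local | strided)

def sparse_attention(q, k, v, stride: int):
    # q, k, v: (batch, heads, seq_len, head_dim)
    seq_len, head_dim = q.shape[-2], q.shape[-1]
    scores = q @ k.transpose(-2, -1) / math.sqrt(head_dim)
    mask = strided_sparse_mask(seq_len, stride).to(scores.device)
    scores = scores.masked_fill(~mask, float("-inf"))
    return F.softmax(scores, dim=-1) @ v

# Quick shape check.
q = k = v = torch.randn(1, 2, 16, 8)
print(sparse_attention(q, k, v, stride=4).shape)  # torch.Size([1, 2, 16, 8])
```

Note this still materializes the full L×L score matrix, so it reproduces the pattern but not the memory or speed benefits of blocksparse's fused kernels.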

karanchahal commented 5 years ago

Is it not possible to use this approach? It seems pretty feasible if I'm not missing anything. Thoughts, @jonasschneider @scott-gray?