openai / blocksparse

Efficient GPU kernels for block-sparse matrix multiplication and convolution
https://blog.openai.com/block-sparse-gpu-kernels/
MIT License

Upgrade to tensorflow 2.0 #42

Open · georgepar opened this issue 5 years ago

georgepar commented 5 years ago

Hi all,

Has anybody tried to upgrade this project to TensorFlow 2.0?

AFAIK one of the main issues is that the cuda_stream.h header was removed in TF 2.0 (also see #40). Now, instead of being passed a CUstream directly when writing an op, users must go through a GPUDevice object (probably to decouple from the CUDA dependency).

I tried to patch it with this change but failed. Has anyone else had any luck?

linzwatt commented 5 years ago

I have tried to build this against TF 1.14 with no success; I recall an issue related to cuda_stream.h, among others. I don't know enough about TF internals to fix these myself.

I would certainly like to see this library updated to the latest TF and CUDA 10, and even ported to PyTorch if possible. There are many interesting applications for bsmm that I am very keen to try.
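
For context, if I remember the README correctly, the TF 1.x usage looks roughly like this (a minimal sketch from memory, untested; this is the API that would need porting to TF 2.0):

```python
import numpy as np
import tensorflow as tf
from blocksparse.matmul import BlocksparseMatMul

hidden_size = 4096
block_size = 32

# Random block-level sparsity pattern: entry (i, j) = 1 keeps block (i, j).
sparsity = np.random.randint(2, size=(hidden_size // block_size,
                                      hidden_size // block_size))

# Build the block-sparse matmul op for this fixed pattern.
bsmm = BlocksparseMatMul(sparsity, block_size=block_size)

# TF 1.x graph-mode input and block-sparse weights.
x = tf.placeholder(tf.float32, shape=[None, hidden_size])
w = tf.get_variable("w", bsmm.w_shape, dtype=tf.float32)

# y = x @ w, computed with the block-sparse GPU kernels.
y = bsmm(x, w)
```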

scott-gray commented 5 years ago

This issue is currently blocking 1.14 support:

https://github.com/tensorflow/tensorflow/issues/31349

Otherwise, I can fix the code that grabs the cu_stream to use the new way (it is stupidly awkward to get hold of this handle in TensorFlow).

We have lots of people at OpenAI who are making the switch to PyTorch. Some of the ops have already been ported over. I think we should be able to fully support both frameworks in the future. Relative attention, new convolution primitives, more learned-sparsity support, and fast product-key memory ops, among other things, will be released soon. The priority right now is to finish up our paper on learned sparsity and then dump a lot of this code.

georgepar commented 5 years ago

Hi Scott, great to hear that there is a plan to support TF 2.0 and PyTorch.

Bidski commented 4 years ago

Is there any progress on this?

habibian commented 4 years ago

The same question: Is there any progress on this?

shizhediao commented 4 years ago

Hi Scott, Is there any progress on this?

lhl2017 commented 4 years ago

Hi @georgepar, have you solved this problem? If you have, please give me some advice. Thanks.

georgepar commented 4 years ago

Hi @lhl2017, unfortunately no. I ended up using alternatives like the Reformer. You can also check out a recent implementation of block-sparse ops in PyTorch, available at https://github.com/ptillet/torch-blocksparse

ibeltagy commented 4 years ago

You might also want to give Longformer a shot, especially if you are working on an NLP task, as it includes a pretrained model for long documents: https://github.com/allenai/longformer (self-promotion :D)

Bidski commented 4 years ago

I ended up using a sparsity constraint on the weights of my kernel (a custom TensorFlow/Keras constraint that just multiplies the weight matrix by a sparse mask), roughly as sketched below.
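
Something like this (a minimal, untested sketch; the block-mask construction is just an illustration, not part of blocksparse):

```python
import numpy as np
import tensorflow as tf

class SparseMaskConstraint(tf.keras.constraints.Constraint):
    """Projects a weight matrix onto a fixed binary sparsity mask."""

    def __init__(self, mask):
        # mask: 0/1 array with the same shape as the kernel it constrains.
        self.mask = tf.constant(mask, dtype=tf.float32)

    def __call__(self, w):
        # Applied after each optimizer step: zero out the masked-off weights.
        return w * self.mask

# Example: a 64x64 Dense kernel that keeps ~30% of its 8x8 blocks
# (assumes the layer's input dimension is also 64).
grid = (np.random.rand(8, 8) < 0.3).astype(np.float32)
mask = np.kron(grid, np.ones((8, 8), dtype=np.float32))
layer = tf.keras.layers.Dense(64, kernel_constraint=SparseMaskConstraint(mask))
```

Note that this only enforces the sparsity pattern; the matmul itself is still dense, so it gives none of the speed or memory benefits of real block-sparse kernels.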

lhl2017 commented 4 years ago

@georgepar Thank you! I will try this version. Actually, I want to use the official version of blocksparse to reproduce the Sparse Transformer paper. In addition, I want to compare against cuBLAS and cuSPARSE to check the results they reported.