allenai / longformer

Longformer: The Long-Document Transformer
https://arxiv.org/abs/2004.05150
Apache License 2.0

Instructions to compile the TVM CUDA kernel do not work #167

Open pzzhang opened 3 years ago

pzzhang commented 3 years ago

I tried to compile the TVM CUDA kernel on my own computer running Ubuntu 16.04. I have Docker and the Docker GPU runtime installed, and they work well for my other projects.

Following the instructions, I tried to build the docker image "my_tvm_image".

# clone longformer
git clone https://github.com/allenai/longformer.git
cd longformer

# clone tvm inside the longformer directory
git clone --single-branch --branch v0.6.0 https://github.com/apache/incubator-tvm.git

# build docker image
docker build -t my_tvm_image -f tvm_docker incubator-tvm/docker/


However, the image build failed with the following error:

Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-vwbvld19/futures/
The command '/bin/sh -c pip3 install numpy pytest cython decorator scipy ipython ipdb torch==1.2.0 torchvision tensorboardx tensorboard pytest' returned a non-zero code: 1

Any suggestions on this? Or could the authors upload the built Docker image to Docker Hub so that others can use it?

pzzhang commented 3 years ago

https://github.com/allenai/longformer/pull/168

It seems that tensorboard introduces some new dependencies and breaks the Docker image build. I modified the tvm_docker file and successfully built the image. I tested the image by rebuilding the kernel, and it works well.

In the PR above I also propose a new way to compute diag_mm when t1 is diagonaled and transposed; it avoids the w_upper argument. I tested it in the non-autoregressive mode (by checking the gradients) and it works well. I did not test the autoregressive mode, but it should also work. The new implementation is useful in my case precisely because it avoids w_upper. Could the authors help check whether it is correct?
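For context, here is a rough, self-contained sketch of the kind of gradient check I mean, written against a toy banded ("diagonaled") matmul in plain PyTorch rather than the actual diag_mm TVM kernel; band_mm_dense and band_mm_sliding below are only illustrative stand-ins, not the code in the PR:

import torch

# Toy banded ("diagonaled") matmul: for each position i, score q[i] against
# k[i - w .. i + w]. Shapes are kept 2D (seq, dim) to keep the check small.

def band_mm_dense(q, k, w):
    # Reference: full (seq x seq) scores, then gather the 2w+1 band.
    seq = q.size(0)
    full = q @ k.t()
    out = q.new_zeros(seq, 2 * w + 1)
    for i in range(seq):
        for j in range(2 * w + 1):
            src = i + j - w
            if 0 <= src < seq:
                out[i, j] = full[i, src]
    return out

def band_mm_sliding(q, k, w):
    # Same band computed directly from shifted slices, without the full matrix.
    seq = q.size(0)
    cols = []
    for off in range(-w, w + 1):
        col = q.new_zeros(seq)
        lo, hi = max(0, -off), min(seq, seq - off)  # rows with a valid neighbour
        col[lo:hi] = (q[lo:hi] * k[lo + off:hi + off]).sum(dim=-1)
        cols.append(col)
    return torch.stack(cols, dim=-1)

torch.manual_seed(0)
q = torch.randn(8, 4, dtype=torch.double, requires_grad=True)
k = torch.randn(8, 4, dtype=torch.double, requires_grad=True)
w = 2

a, b = band_mm_dense(q, k, w), band_mm_sliding(q, k, w)
assert torch.allclose(a, b)

# The actual check: gradients of both implementations w.r.t. the same inputs match.
grads_a = torch.autograd.grad(a.sum(), (q, k))
grads_b = torch.autograd.grad(b.sum(), (q, k))
assert all(torch.allclose(x, y) for x, y in zip(grads_a, grads_b))
print("outputs and gradients match")

In the same double-precision setup, torch.autograd.gradcheck could also be run on the new implementation alone to compare its analytical gradients against numerical ones.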