allenai / longformer

Longformer: The Long-Document Transformer
https://arxiv.org/abs/2004.05150
Apache License 2.0

Problem with Tensor Virtual Machine (TVM) #111

Open · yeliu918 opened this issue 4 years ago

yeliu918 commented 4 years ago

Hi,

Nice work! Very interesting and useful!

I didn't know about the Tensor Virtual Machine (TVM) before, so I checked this blog post about it: https://tvm.apache.org/2020/07/14/bert-pytorch-tvm

From my understanding, the original design of TVM doesn't compute and store only the non-zero values. So is it your implementation that makes Longformer compute and store only the non-zero values? I want to try this idea on our own model. Could you give me more hints about how your model achieves that?

Best, Ye

ibeltagy commented 4 years ago

TVM can be used in multiple different ways. The blog post you mentioned is about using TVM to compile an existing PyTorch model into one big optimized binary. This can make the model a bit faster, but it doesn't solve the O(n^2) problem of self-attention. As you said, it also doesn't take sparse tensors into account (we don't use sparse tensors anyway).
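For reference, the whole-model route from the blog post looks roughly like this (a minimal sketch assuming a TVM version with the Relay PyTorch frontend; the tiny model here is just a stand-in for BERT):

```python
import torch
import tvm
from tvm import relay

# A tiny stand-in model; the blog post traces a full BERT.
model = torch.nn.Linear(64, 64).eval()
example = torch.rand(1, 64)
scripted = torch.jit.trace(model, example)

# Convert the traced graph to Relay and compile the whole model
# into one optimized binary.
mod, params = relay.frontend.from_pytorch(scripted, [("input", example.shape)])
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target="llvm", params=params)
```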

What we used is a lower-level TVM construct that lets you write your own CUDA kernel, compile it into binaries, and then call it as if it were a regular PyTorch function. So yes, as you said, it is our implementation of the CUDA kernel that makes it possible to compute only the non-zero values. Our code is similar to the 3 nested loops of a regular matrix multiplication, but it only computes certain diagonals of the output tensor, then stores them as columns in a tensor with some padding.
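For intuition, here is a rough NumPy sketch of that idea (an illustration only, not our TVM kernel; the window size `w` and the zero padding are simplified):

```python
import numpy as np

def diagonaled_mm(Q, K, w):
    """Illustration: compute only the 2*w+1 diagonals of Q @ K.T around
    the main diagonal, storing each diagonal as a column of the output.
    Out-of-range positions are left as zero padding."""
    n, d = Q.shape
    out = np.zeros((n, 2 * w + 1), dtype=Q.dtype)
    for i in range(n):              # output row (query position)
        for c in range(2 * w + 1):  # which diagonal, stored as column c
            j = i + c - w           # corresponding key position
            if 0 <= j < n:
                for k in range(d):  # reduction over the hidden dimension
                    out[i, c] += Q[i, k] * K[j, k]
    return out
```

This brings the cost down from O(n^2 * d) for the full product to O(n * w * d) for the band.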

yeliu918 commented 4 years ago

Thanks for the clarification. Could you point me to the code that implements the 3 nested loops of regular matrix multiplication? I'm guessing it's in tvm/libtvm_runtime.so?

ibeltagy commented 4 years ago

All our TVM code is here: https://github.com/allenai/longformer/blob/master/longformer/diagonaled_mm_tvm.py, and the nested loops are these lines https://github.com/allenai/longformer/blob/master/longformer/diagonaled_mm_tvm.py#L52-L82

The code under https://github.com/allenai/longformer/tree/master/tvm, which compiles into libtvm_runtime.so, is copied from the TVM library; it is only used to load and run the compiled binaries.
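As a rough illustration of that workflow (a sketch assuming a TVM version where the `te` schedule API is available; the shapes and names are made up, and it declares a dense matmul rather than our diagonaled one):

```python
import torch
import tvm
from tvm import te
from tvm.contrib.dlpack import to_pytorch_func

n, d = 512, 64  # illustrative sizes
A = te.placeholder((n, d), name="A")
B = te.placeholder((n, d), name="B")
k = te.reduce_axis((0, d), name="k")
# Dense matmul compute rule; our kernel instead restricts j to a band
# of diagonals around i.
C = te.compute((n, n), lambda i, j: te.sum(A[i, k] * B[j, k], axis=k), name="C")

s = te.create_schedule(C.op)
# For target="cuda", the axes of C would first need to be bound to GPU
# blocks/threads in the schedule; "llvm" (CPU) builds without that.
f = tvm.build(s, [A, B, C], target="llvm")

# Wrap the compiled function so it can be called on torch tensors directly.
f_pytorch = to_pytorch_func(f)

a, b = torch.rand(n, d), torch.rand(n, d)
c = torch.empty(n, n)
f_pytorch(a, b, c)  # fills c in place, like a regular PyTorch function
```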

yeliu918 commented 4 years ago

Thanks for the quick response! Much appreciated!