-
### 🚀 The feature, motivation and pitch
I am unable to find a clean implementation of local multi-head self-attention in PyTorch Geometric. I found three types of multi-head attention, one Transf…
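To pin down what "local" means here, a minimal NumPy sketch may help: each token attends only to neighbors within a fixed window, with several heads computed independently. This is an illustrative toy (the function name, weights, and `window` parameter are hypothetical), not PyTorch Geometric's API.

```python
import numpy as np

def local_multihead_attention(X, W_q, W_k, W_v, num_heads, window):
    """Windowed self-attention: token i attends only to tokens j with |i - j| <= window."""
    n, d = X.shape
    d_h = d // num_heads                       # per-head dimension
    Q = (X @ W_q).reshape(n, num_heads, d_h)
    K = (X @ W_k).reshape(n, num_heads, d_h)
    V = (X @ W_v).reshape(n, num_heads, d_h)
    idx = np.arange(n)
    mask = np.abs(idx[:, None] - idx[None, :]) <= window  # locality mask
    out = np.empty_like(Q)
    for h in range(num_heads):
        scores = Q[:, h] @ K[:, h].T / np.sqrt(d_h)
        scores = np.where(mask, scores, -np.inf)          # block non-local pairs
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)                # softmax over allowed keys
        out[:, h] = w @ V[:, h]
    return out.reshape(n, d)                  # concatenate heads

rng = np.random.default_rng(0)
n, d, H = 10, 16, 4
X = rng.standard_normal((n, d))
W_q, W_k, W_v = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))
Y = local_multihead_attention(X, W_q, W_k, W_v, num_heads=H, window=2)
print(Y.shape)  # (10, 16)
```

In a graph setting the window mask would be replaced by the graph's adjacency, which is presumably what a PyG-native layer would do.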
-
Hello:)
Thank you so much for sharing your code. It has been very useful in understanding the paper.
There is still something I don't quite get from the paper and the code. From my understandin…
-
Why doesn't the architecture need a position embedding?
-
I researched the Transformer algorithm and summarize it here.
References:
- [The Transformer paper](https://arxiv.org/pdf/1706.03762)
- [A Japanese translation of the paper](https://hiroyukichishiro.com/attention-is-all-you-need/)
- [Explanation of the algorithm (1)](https://qiita.…
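The core operation of the paper referenced above is scaled dot-product attention, Attention(Q, K, V) = softmax(QKᵀ/√d_k)V. A minimal NumPy sketch of that formula (illustrative only, not the paper's code):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V, as in "Attention Is All You Need"."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # (n_q, n_k) similarity scores
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)              # softmax over keys
    return w @ V                                    # weighted sum of values

rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))   # 4 queries, d_k = 8
K = rng.standard_normal((6, 8))   # 6 keys
V = rng.standard_normal((6, 8))   # 6 values
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8)
```

The 1/√d_k scaling keeps the dot products from growing with dimension, which would otherwise push the softmax into regions with vanishing gradients.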
-
### Branch/Tag/Commit
main
### Docker Image Version
nvcr.io/nvidia/pytorch:21.04-py3
### GPU name
3090
### CUDA Driver
525.89.02
### Reproduced Steps
```shell
Bert Model with self defined att…
-
### Issue Type
Documentation Bug
### Source
source
### Keras Version
2.14
### Custom Code
Yes
### OS Platform and Distribution
Ubuntu 22.04
### Python version
3.10
…
-
~~I am trying to run the test_op pytest on the fused attention tutorial (https://triton-lang.org/master/getting-started/tutorials/06-fused-attention.html) on an A100 with CUDA 11.4. The error is:~~
…
-
When running `sh scripts/run_text2video.sh`, an error occurred.
```
[rank:0] batch-1 (1)x1 ...
Traceback (most recent call last):
File "/media/mil/cc-code/VADER/VideoCrafter/scripts/evalua…
-
Hi, I am currently working on the model you have described. While reviewing the related documentation, I have encountered some questions regarding StageTwo, "Multi-View Knowledge Integration." Specific…
-
### 🚀 The feature, motivation and pitch
In the original implementation of the GPSLayer (found in [graphgps/layer/gps_layer.py](https://github.com/rampasek/GraphGPS/blob/main/graphgps/layer/gps_layer.…