-
Hello, I would like to know if you have published the code to project the pre-trained weights of the BERT model into Monarch matrices. I cannot locate the code for this (I have also looked in the fly …
-
### Model description
**Restormer: Efficient Transformer for High-Resolution Image Restoration** was published at CVPR 2022 and introduced a new Vision Transformer-based architecture for Image Res…
-
### Your GTNH Discord Username
@droideka30
### Your Pack Version
2.5.1
### Your Proposal
If you decide to get into railroading after making an assembler, it's really annoying that the q…
-
The provided code calculates the matrix product of q and k.
https://github.com/YuchuanTian/DiJiang/blob/main/modeling/pythia-2.8B-dijiang/modeling_gpt_neox_dijiang.py#L286
That means it has computational …
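To make the cost concern concrete, here is a minimal pure-Python sketch of how the multiplication order changes the work done. It is not the DiJiang code: it leaves out softmax and any kernel feature map, and the helper names (`matmul`, `attention_quadratic`, `attention_linear`) and the sizes `n` and `d` are made up for illustration.

```python
def matmul(a, b):
    """Multiply two list-of-lists matrices; also return the scalar multiply count."""
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [[sum(a[i][t] * b[t][j] for t in range(inner)) for j in range(cols)]
           for i in range(rows)]
    return out, rows * inner * cols


def transpose(m):
    return [list(col) for col in zip(*m)]


def attention_quadratic(q, k, v):
    # (q k^T) v: materializes an n x n score matrix first.
    scores, c1 = matmul(q, transpose(k))
    out, c2 = matmul(scores, v)
    return out, c1 + c2


def attention_linear(q, k, v):
    # q (k^T v): the intermediate is only d x d.
    kv, c1 = matmul(transpose(k), v)
    out, c2 = matmul(q, kv)
    return out, c1 + c2


# Hypothetical sizes: sequence length n, head dimension d.
n, d = 64, 4
q = [[(i + j) % 5 for j in range(d)] for i in range(n)]
k = [[(i * j) % 3 for j in range(d)] for i in range(n)]
v = [[(i - j) % 7 for j in range(d)] for i in range(n)]

out_quad, cost_quad = attention_quadratic(q, k, v)
out_lin, cost_lin = attention_linear(q, k, v)

print("same result:", out_quad == out_lin)
print("multiplies with q @ k^T first:", cost_quad)  # grows as n^2 * d
print("multiplies with k^T @ v first:", cost_lin)   # grows as n * d^2
```

Without a nonlinearity between the two products, both orderings give the same output, but forming q·kᵀ first costs on the order of n²·d multiplications while forming kᵀ·v first costs on the order of n·d².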
-
### Branch/Tag/Commit
main
### Docker Image Version
pytorch-22.08.py3
### GPU name
V100
### CUDA Driver
515.65
### Reproduced Steps
```shell
MT5 needs the gelu_new op, but FasterTransformer doesn't…
-
Hi,
For ImageNet, you mentioned in the paper that the Hyena code is used for the experiments by replacing the MLP blocks in Hyena ViT-b with block-diagonal matrices, similarly to M2-BERT. Based on the …
-
Hi, Jiahui! After one week of training on a GTX 1080 Ti, I found some interesting results on my own dataset.
There are 2 kinds of images in my dataset. One kind is images with clear texture like th…
-
First of all, thanks for the implementation!
My question is about proper masking inside the model.
1. shift_down and shift_right at the beginning of the PixelSnail module have already taken care of masking…
-
Error occurred when executing MediaPipe-FaceMeshPreprocessor:
Failed to parse: node {
calculator: "ImagePropertiesCalculator"
input_stream: "IMAGE:image"
output_stream: "SIZE:image_size"
}
nod…
-