-
My server cannot connect to the Hugging Face website, so I manually downloaded the pretrained model used in the code and placed it in the `img2img-turbo-main` folder. After executing the command `pyth…
-
Hi,
I was wondering how the class token is supposed to be handled in the reversible design, since replicating the token across the two residual paths is perhaps not optimal.
Any thoughts/poin…
-
When I tried to run the training, it raised an error about a missing `model_type` key in `config.json`.
The training script I used is the one in `run-ccp.sh`:
``` bash
outdir=runs/CCP
model_cfg=da…
-
![image](https://user-images.githubusercontent.com/77482/50403031-4bae4100-0750-11e9-9d30-fe1a9985e6eb.png)
https://en.wikipedia.org/wiki/Process_flow_diagram#/media/File:RefineryFlow.png
-
Hi Takafumi,
## Description of the problem you have
Thank you for this wonderful package. I have used it locally with great success. I was also able to use it on a remote machine without an x-windo…
-
Hey Hao-Wen,
I see you finally decided to scale up?!? :) This is very nice and long overdue. :)
First of all, I wanted to share with you my large-scale model/implementation for piano. Check it o…
-
I did a few training runs of a simple Reformer module with different parameters and logged the GPU memory usage.
Of course, depending on your machine or other things these values can vary, but I t…
-
I have a long piece of text where the end depends on the start. Should I pass the whole text as a single batch for training? Also, does this choice affect generation in the same way?
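
To make the question concrete, here is a minimal sketch of the alternative to a single batch: splitting the token sequence into overlapping windows so that each training example still sees some of the preceding context. This is a generic illustration, not this project's API; `split_with_overlap`, `window`, and `overlap` are hypothetical names, and the integer tokens stand in for whatever the tokenizer produces.

```python
from typing import List, Sequence


def split_with_overlap(tokens: Sequence[int], window: int, overlap: int) -> List[List[int]]:
    """Split `tokens` into chunks of at most `window` items.

    Consecutive chunks share `overlap` items, so the start of each chunk
    repeats the tail of the previous one.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")
    if len(tokens) == 0:
        return []

    stride = window - overlap  # how far the window advances each step
    chunks: List[List[int]] = []
    start = 0
    while True:
        chunks.append(list(tokens[start:start + window]))
        # Stop once the current window has reached the end of the sequence.
        if start + window >= len(tokens):
            break
        start += stride
    return chunks


# Hypothetical example: ten token ids, windows of 4 with 2 shared tokens.
tokens = list(range(10))
chunks = split_with_overlap(tokens, window=4, overlap=2)
print(chunks)  # [[0, 1, 2, 3], [2, 3, 4, 5], [4, 5, 6, 7], [6, 7, 8, 9]]
```

With windows like these, a dependency between the very start and the very end of the text is only visible to the model if both fall inside the same window, which is why the single-batch option matters for my case.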
-
Reorder the interview menu:
- ~~Individual interview~~
- ~~Phone interview~~
- Interview / Appointment with the parents
Interview with the parents:
New fact type:
- Date + time
- Ra…
-
### Link to the paper
[[arXiv:2002.11794] Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers](https://arxiv.org/abs/2002.11794)
### Authors and affiliations
Zhuohan L…