-
I was wondering if you ever encountered NaN gradients during Admin training.
I'm on torch 1.6/CUDA 10.1 with no modifications to the code:
#### Command
```bash
export dd=data-bin/wmt14_en_de_joi…
-
## ❓ Questions and Help
#### What is your question?
When I `pip install` fairseq in a Dockerfile build, an error is raised.
Dockerfile:
FROM pytorch/pytorch:1.3-cuda10.1-cudnn7-devel
RUN pip install redis fairse…
-
## 🐛 Bug
I followed the example provided in https://github.com/pytorch/fairseq/blob/master/examples/backtranslation/README.md almost literally, but when I run fairseq-train I get
"Exception: p…
-
@XuezheMax
Hello, I replaced Adam with Apollo for machine translation with the transformer architecture in the fairseq framework, but performance decreased. I have a partner who does reading comp…
-
Hi, as expected, a model with more transformer layers diverges more easily during training. However, we find that the model with 12 encoder layers and 12 decoder layers trains fine, but the model…
-
I was able to install everything as per your setup instructions. I ran the training script `bash scripts/stack-transformer/experiment.sh configs/amr2_o5+Word100_roberta.large.top24_stnp6x6.sh`
The da…
-
When I use the training script `train.sh`, the following error is thrown:
```
+ nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest …
-
### Bug Description
Probe constantly fails whether it is homing Z or leveling. Occasionally when it actually starts leveling it will "skip" a spot, immediately moving upwards and going to a new s…
-
Hi,
In [this line](https://github.com/LiyuanLucasLiu/Transformer-Clinic/blob/master/fairseq/fairseq/modules/transformer_layer.py#L178), the variable `tmp_weight` is not defined. How should it be s…
-
When I run speech-to-text on MuST-C following the instructions, the error
"FileNotFoundError: Dict not found: /zjw/testproject/data/mustc/en-fr/dict.txt"
occurs.
I followed the preprocess(st) an…