-
Hi! Thanks for your great repo.
I tried the script in `fairseq-RoBERTa/launch/FreeLB/rte-fp32-clip.sh` and used the same settings as in Issue #11.
```
# run_exp GPU TOTAL_NUM_UPDATES …
```
-
Hi! Thanks for this repository.
I've been trying to reproduce the results from the paper but ran into some problems. I tried the script in `fairseq-RoBERTa/launch/FreeLB/rte-fp32-clip.sh` which I w…
-
In this [code](https://github.com/zhuchen03/FreeLB/blob/master/huggingface-transformers/examples/run_glue_freelb.py#L229), if `adv_init_mag > 0`, is the model trained only on adversarial examples?
I …
-
First, thanks for your wonderful work.
Has anyone else run into a NaN error near the end of training?
I embedded FreeLB as a plugin (without handling `dropout_mask`):
freelb.attack()
fr…
-
In the fairseq implementation, the `update_freq` configuration (from the original fairseq code) specifies how often the optimizer updates model parameters. When `update_freq > 1`, it will accumulate gradien…
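For readers unfamiliar with gradient accumulation, here is a minimal sketch of the general idea in plain Python. It is not fairseq's code: the toy least-squares model, the `grad` and `train` helpers, and the choice to average the accumulated gradients are all assumptions made for illustration.

```python
# Hypothetical toy problem: fit y = w * x with a mean squared error loss.
def grad(w, batch):
    """Gradient of the mean squared error over one batch with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def train(data, batch_size, update_freq, lr=0.01, w=0.0):
    """Plain SGD that steps once every `update_freq` micro-batches."""
    batches = [data[i:i + batch_size] for i in range(0, len(data), batch_size)]
    accumulated, seen, num_updates = 0.0, 0, 0
    for batch in batches:
        accumulated += grad(w, batch)  # accumulate only, no parameter change
        seen += 1
        if seen == update_freq:
            # Averaging is an assumption of this sketch, not fairseq's rule.
            w -= lr * accumulated / update_freq
            accumulated, seen = 0.0, 0
            num_updates += 1
    # Leftover micro-batches (fewer than update_freq) are dropped here.
    return w, num_updates


data = [(float(x), 3.0 * x) for x in range(1, 9)]  # 8 samples, true w = 3

# 4 micro-batches of 2 samples, one optimizer step.
w_accum, n_accum = train(data, batch_size=2, update_freq=4)
# 1 batch of 8 samples, one optimizer step.
w_large, n_large = train(data, batch_size=8, update_freq=1)
```

In this sketch both runs perform a single parameter update and end at the same `w`, which shows why the number of optimizer updates, rather than the number of micro-batches, is the quantity to keep fixed when changing `update_freq`.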
-
This [file](https://github.com/zhuchen03/FreeLB/blob/master/huggingface-transformers/launch/run_glue.sh) contains hyperparameters for only 4 tasks. Could you please release the others?
-
Hi, when I used the example from `huggingface-transformers/examples/run_glue_freelb.py`,
I got this error: `'AlbertForSequenceClassification' object has no attribute 'encoder'`.
It seems the cod…
-
Hey,
Great work on the SMART paper.
I have a few quick questions about the numbers reported in Table 2 of the paper.
1) Is the `RTE` model in the last row of Table 2 initialized from `MNLI` checkpoin…