-
**Describe the bug**
Downloading bookcorpus via the [repo mentioned in the BERT instructions](https://github.com/soskek/bookcorpus/blob/master/README.md) hits an error: HTTPError: ```HTTP Error 503: Servi…
-
Hello DeepSpeed team, I am trying to reproduce BERT training with the NVIDIA dataset, as I couldn't get the original dataset from Microsoft.
When I checked the batch size to scale efficiently during pr…
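For reference, the global batch size in multi-GPU training is usually the product of the per-GPU micro-batch, the gradient-accumulation steps, and the GPU count. A minimal sketch (the function and parameter names below are illustrative, not taken verbatim from the DeepSpeed config schema):

```python
# Hedged sketch: computing the effective (global) batch size when
# scaling BERT pre-training across GPUs. Parameter names are
# illustrative assumptions, not DeepSpeed config keys.
def effective_batch_size(micro_batch_per_gpu: int,
                         grad_accum_steps: int,
                         num_gpus: int) -> int:
    """Global batch = per-GPU micro-batch x accumulation steps x GPUs."""
    return micro_batch_per_gpu * grad_accum_steps * num_gpus

# Example: 8 samples per GPU, 4 accumulation steps, 16 GPUs -> 512.
print(effective_batch_size(8, 4, 16))  # 512
```

Keeping this product fixed while varying the GPU count is the usual way to reproduce a reference training recipe on different hardware.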
-
Hi @vgaraujov. I'm sorry to bother you again. The training really takes a lot of time, especially since I can only use 1 GPU now. I'm wondering if you can share any checkpoints that you trained on the…
-
## ❓ Questions and Help
#### What is your question?
1) Does BART offer base-size (6-layer encoder, 6-layer decoder, hidden size 768) pre-trained models? Since in the summarization task, the baselin…
-
Curated Weibo content
-
## Describe the bug
Cannot load 'bookcorpusopen'
## Steps to reproduce the bug
```python
dataset = load_dataset('bookcorpusopen')
```
or
```python
dataset = load_dataset('bookcorpusopen',s…
-
Hi, I'm training a CPC model with a custom dataset (rather than BookCorpus) and multiple GPUs. I tried the default learning rate of 2e-4 in main.py at first. However, the training loss (around 15.…
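When moving a single-GPU learning rate to a multi-GPU setup, one common heuristic is the linear scaling rule (scale the learning rate proportionally to the global batch size). A minimal sketch, assuming that heuristic applies to this setup (the batch sizes below are made-up examples, not values from main.py):

```python
# Hedged sketch of the linear learning-rate scaling rule: a common
# heuristic, not necessarily the right choice for this CPC model.
def scaled_lr(base_lr: float, base_batch: int, new_batch: int) -> float:
    """Scale the learning rate in proportion to the global batch size."""
    return base_lr * new_batch / base_batch

# Example: base lr 2e-4 tuned for a batch of 64; quadrupling the
# global batch to 256 suggests 8e-4 under this rule.
print(scaled_lr(2e-4, 64, 256))  # 0.0008
```

A warmup period is often paired with this rule, since large scaled learning rates can destabilize early training.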
-
When I run the following part of create_pretrain_feature.sh (the Wikipedia-only part), i.e.:
```
python create_pretrain_feature.py --lowercase --vocab_path $VOCAB_PATH --wiki_dir $WIKI_DIR
```
it reports this error:
```
[11 07:34:01] Nam…
-
Hi,
Thank you for this very clean repository! I wonder whether any checkpoints will be released. If so, I'm interested in adding them to the HuggingFace Transformers library (the Reformer and Lon…
-
Hi!
First of all, thanks for open-sourcing your code for the "Prune Once for All" paper.
I would like to ask a few questions:
1. Are you planning to release your teacher model for the upstream task? I …