-
Hi, I'm training a CPC model with a custom dataset (rather than BookCorpus) and multiple GPUs. I tried the default learning rate of 2e-4 in main.py at first. However, the training loss (around 15.…
-
When I run the following snippet from create_pretrain_feature.sh (the Wikipedia-only part), i.e.:
```
python create_pretrain_feature.py --lowercase --vocab_path $VOCAB_PATH --wiki_dir $WIKI_DIR
```
I get this error:
```
[11 07:34:01] Nam…
-
Hi!
First of all, thanks for open-sourcing your code for the "Prune Once for All" paper.
I would like to ask a few questions:
1. Are you planning to release your teacher model for the upstream task? I …
-
Hi,
Thank you for this very clean repository! I wonder whether any checkpoints will be released? If yes, I'm interested in adding those to the HuggingFace Transformers library (the Reformer and Lon…
-
I am trying to reproduce the Condenser pretraining results. I evaluated the checkpoint on the STS-B task with sentence-transformers, but the results are different.
(1) bert-base-uncased
2022-01-03 17:07…
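For reference, here is roughly how I run the evaluation. This is my own sketch with sentence-transformers, not code from the repo; the CLS pooling choice and the checkpoint path are assumptions on my part.
```python
# My own evaluation sketch; the checkpoint path and CLS pooling are assumptions.
from datasets import load_dataset
from sentence_transformers import SentenceTransformer, models, InputExample, SimilarityFunction
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

word_emb = models.Transformer("bert-base-uncased")  # or the local Condenser checkpoint dir
pooling = models.Pooling(word_emb.get_word_embedding_dimension(), pooling_mode="cls")
model = SentenceTransformer(modules=[word_emb, pooling])

# STS-B dev pairs; gold scores rescaled from [0, 5] to [0, 1].
stsb = load_dataset("glue", "stsb", split="validation")
samples = [
    InputExample(texts=[r["sentence1"], r["sentence2"]], label=r["label"] / 5.0)
    for r in stsb
]
evaluator = EmbeddingSimilarityEvaluator.from_input_examples(
    samples, main_similarity=SimilarityFunction.COSINE
)
print(evaluator(model))  # main score: Spearman correlation of cosine similarities
```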
-
Hi,
I have encountered this problem while loading the openwebtext dataset:
```
>>> dataset = load_dataset('openwebtext')
Downloading and preparing dataset openwebtext/plain_text (download: 12.00 …
-
I want to use the pretraining task of ProphetNet, which recovers the masked span of the input sentence.
I follow the instructions in Figure 1 of the paper.
For example, the input is `But I [MASK][MASK…
-
## Environment info
- `transformers` version: 4.9.0
- Platform: Linux-5.8.0-63-generic-x86_64-with-debian-bullseye-sid
- Python version: 3.7.10
- PyTorch version (GPU?): 1.8.1+cu111 (True)
- …
-
Hello,
I wonder if there is a train-validation-test split or only a train-test split. I'm asking because during dataset generation train-* and test-* files are created, but no valid-* files. …
-
After some experiments with bookcorpus I noticed that querying examples from big datasets is slower than from small datasets.
For example:
```python
from datasets import load_dataset
b1 = load_dataset…