-
Here's the code I'm trying to run:
```python
import nlp  # the Hugging Face `nlp` library (later renamed to `datasets`)

# `args` comes from the script's (truncated) argument parsing
dset_wikipedia = nlp.load_dataset("wikipedia", "20200501.en", split="train", cache_dir=args.cache_dir)
dset_wikipedia.drop(columns=["title"])
dset_wi…
```
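If `drop` is what fails here, note that an `nlp` dataset is not a pandas `DataFrame`; below is a minimal sketch of the same column removal using the renamed `datasets` package (assuming its `remove_columns` API suits your version):

```python
# A minimal sketch, assuming the maintained `datasets` package (the renamed
# `nlp` library). `remove_columns` returns a new dataset rather than
# mutating in place, so the result must be reassigned.
from datasets import load_dataset

dset_wikipedia = load_dataset("wikipedia", "20200501.en", split="train")
dset_wikipedia = dset_wikipedia.remove_columns(["title"])
print(dset_wikipedia.column_names)  # expected: ['text']
```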
-
I tried running the pretrain.py script and got this error:
```
process id: 76202
{'device': 'cuda:0', 'base_run_name': 'vanilla', 'seed': 11081, 'adam_bias_correction': False, 'schedule': 'origin…
```
-
Are there alternative links to download the Wikipedia and BookCorpus datasets? This seems to be a known issue for other teams; one possible workaround is sketched after the links below:
https://github.com/mlperf/training/issues/377
https://github.com/NVIDIA/…
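A possible workaround, assuming the Hugging Face `datasets` hub copies are acceptable substitutes for the original download links:

```python
# A hedged alternative: pull both corpora through the `datasets` hub
# instead of the original (now unreliable) mirrors.
from datasets import load_dataset

wiki = load_dataset("wikipedia", "20200501.en", split="train")
books = load_dataset("bookcorpus", split="train")  # community re-hosted copy
```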
-
Hi there, I would like to use this awesome lib to fine-tune on the RACE dataset. I performed the following steps:
```python
dataset = nlp.load_dataset("race")
len(dataset["tra…
```
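A hedged sketch of those inspection steps written out in full, assuming `load_dataset("race")` returns one dataset per split as other scripts in this lib do:

```python
import nlp  # this library; later republished as `datasets`

# RACE loads as one dataset per split; check sizes and peek at one
# example before wiring up fine-tuning.
dataset = nlp.load_dataset("race")
print({split: len(ds) for split, ds in dataset.items()})
print(dataset["train"][0])
```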
-
I know the copyright/distribution of this one is complex, but it would be great to have! That, combined with the existing `wikitext`, would provide a complete dataset for pretraining models like BERT.
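As a sketch of the combination this request describes, assuming BookCorpus eventually lands under a `bookcorpus` name (that name and the mix below are assumptions, not this repo's current API):

```python
# A hypothetical sketch of the BERT-style pretraining mix; both corpora
# expose a single `text` column, so they can be concatenated directly.
from datasets import load_dataset, concatenate_datasets

wiki = load_dataset("wikitext", "wikitext-103-raw-v1", split="train")
books = load_dataset("bookcorpus", split="train")  # assumed dataset name
pretraining_corpus = concatenate_datasets([wiki, books])
```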
-
Hello,
The BP-Transformer paper is excellent; I really like it.
For text classification tasks, have you run any comparison experiments against BERT or ALBERT?
-
Related to **Model/Framework(s)**
PyTorch/LanguageModeling/BERT
**Describe the bug**
BookCorpus is no longer available from Smashwords.
**To Reproduce**
The following works perfectly.
```
g…
```
-
Hi, I am trying to create the Toronto Book Corpus dataset (#131).
I ran
`~/nlp % python nlp-cli test datasets/bookcorpus --save_infos --all_configs`
but this doesn't create `dataset_info.json` and tries to use …
-
Is the English data that ERNIE 2.0 was pretrained on (Encyclopedia, BookCorpus, Reddit) available? I'm particularly interested in building a frequency dictionary for the training data.
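For the frequency-dictionary half of the question, a minimal sketch assuming the corpus arrives as an iterable of text lines (the sample lines below are stand-ins, since the ERNIE 2.0 data itself may not be public):

```python
# A minimal sketch, assuming naive whitespace tokenization is enough for
# a first-pass frequency dictionary; the sample lines are stand-ins.
from collections import Counter

def build_frequency_dict(lines):
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts

freqs = build_frequency_dict([
    "ERNIE 2.0 was pretrained on several English corpora.",
    "BookCorpus and Reddit are two of them.",
])
print(freqs.most_common(3))
```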
-
Are there any performance test results? For example, BERT training performance on a DGX-2, compared with the NVIDIA version.
Thanks