-
# ❓ Questions & Help
## Details
Hey, I want to load the cnn-dailymail dataset for fine-tuning.
I wrote the code like this:
from datasets import load_dataset
test_dataset = load_dataset("cn…
AI678 updated 3 years ago
-
## Environment info
- `transformers` version: 4.7.0.dev0
- Platform: Windows-10-10.0.19041-SP0
- Python version: 3.8.0
- PyTorch version (GPU?): 1.8.1 (True)
- Tensorflow version (GPU?): not inst…
-
- AI News
- EMNLP 2021 - rebuttal period over: great work, everyone. Good luck!
- NeurIPS 2021 - review period over
- NVidia Jetson developer meetup (July 22, 2021)
- Google ML Bootcamp applications open (through Aug 2): https://events.withg…
-
## ❓ Questions and Help
#### What is your question?
Hi,
What corpus is the [publicly available BART-base](https://huggingface.co/facebook/bart-base) pre-trained on?
It was not explicitly …
-
Anyone know where to get them?
Thank you!
-
```python
# benchmark_filter.py
import logging
import sys
import time

from datasets import load_dataset, set_caching_enabled

if __name__ == "__main__":
    set_caching_enabled(False)
    …
```
-
# 🚀 Feature request
Hello, I am trying to pretrain a custom model from scratch on bookcorpus + wikipedia + openwebtext, but I only have a 1TB disk. I tried to merge 20% of each one and then reload t…
-
I've been very excited about this amazing datasets project. However, I've noticed that the performance can be substantially slower than using an in-memory dataset.
Now, this is expected I guess, du…
-
**Describe the bug**
I run the tutorial on https://pytorch.org/hub/nvidia_deeplearningexamples_waveglow/
and I got errors
`AttributeError: 'Tacotron2' object has no attribute 'text_to_sequence'`
…
-
I computed the sentence embedding of each sentence of the bookcorpus data using BERT-base and saved them to disk. I used 20M sentences, and the resulting Arrow file is about 59GB while the original text fil…
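That file size is roughly what dense float32 vectors cost; a back-of-the-envelope check, assuming BERT-base's 768-dimensional embeddings stored as float32:

```python
# 20M sentences x 768 dims x 4 bytes per value (float32)
n_sentences = 20_000_000
dim = 768                # bert-base hidden size
bytes_per_value = 4      # float32
total_gb = n_sentences * dim * bytes_per_value / 1e9
print(round(total_gb, 1))  # 61.4
```

So a ~59GB Arrow file is in line with raw vector storage; switching to float16 would roughly halve it.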