torchtext.data.Iterator loading data too slow

pytorch / text

Models, data loaders and abstractions for language processing, powered by PyTorch

https://pytorch.org/text

BSD 3-Clause "New" or "Revised" License


torchtext.data.Iterator loading data too slow #668

Closed zml24 closed 2 years ago

zml24 commented 4 years ago

❓ Questions and Help

Description Hey Dude, I am running a machine translation task with torchtext.datasets.IWSLT. However, I found that CPU usage is always high while GPU usage stays low. With torch.utils.data.DataLoader, I can use num_workers and pin_memory to deal with this problem, but I can't find a way to do the same with torchtext.data.Iterator or torchtext.data.BucketIterator.

How can I accelerate my data loading speed via torchtext?

zhangguanheng66 commented 4 years ago

Yes, we will rewrite the translation datasets later so that they are compatible with torch.utils.data.DataLoader (see the discussion in https://github.com/pytorch/text/issues/664).
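In the meantime, one possible workaround is to bypass the torchtext iterators and feed already-numericalized sentence pairs to torch.utils.data.DataLoader directly, which makes num_workers and pin_memory available. The following is only a minimal sketch under stated assumptions, not the planned torchtext API: PairDataset, collate_pairs, make_loader, PAD_IDX and the toy data are hypothetical names and values made up for illustration, and the sentences are assumed to be tokenized and converted to integer ids once, up front.

```python
import torch
from torch.nn.utils.rnn import pad_sequence
from torch.utils.data import DataLoader, Dataset

PAD_IDX = 0  # assumed padding index; not taken from the thread


class PairDataset(Dataset):
    """Hypothetical map-style dataset over (source_ids, target_ids) pairs."""

    def __init__(self, pairs):
        # pairs: list of (list[int], list[int]), numericalized ahead of time
        self.pairs = pairs

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, idx):
        src, tgt = self.pairs[idx]
        return (torch.tensor(src, dtype=torch.long),
                torch.tensor(tgt, dtype=torch.long))


def collate_pairs(batch):
    """Pad each side of the batch to the longest sequence on that side."""
    srcs, tgts = zip(*batch)
    src_batch = pad_sequence(list(srcs), batch_first=True, padding_value=PAD_IDX)
    tgt_batch = pad_sequence(list(tgts), batch_first=True, padding_value=PAD_IDX)
    return src_batch, tgt_batch


def make_loader(pairs, batch_size=2, num_workers=0):
    """Build a DataLoader; raise num_workers to load batches in subprocesses."""
    return DataLoader(
        PairDataset(pairs),
        batch_size=batch_size,
        shuffle=False,
        collate_fn=collate_pairs,
        num_workers=num_workers,
        # Pinned host memory only helps when batches are copied to a GPU.
        pin_memory=torch.cuda.is_available(),
    )


# Toy, made-up ids standing in for a numericalized parallel corpus.
toy_pairs = [([5, 6, 7], [8, 9]), ([3, 4], [2, 2, 2, 2]), ([1], [1])]
batches = list(make_loader(toy_pairs))
```

For real training, num_workers would typically be set above 0 (with the loader created under an `if __name__ == "__main__":` guard on platforms that spawn worker processes), and batches would be moved to the GPU with `.to(device, non_blocking=True)` so that pinned memory can overlap the copy with computation. Whether this removes the bottleneck depends on how much per-example work remains in `__getitem__`.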

lizitong67 commented 3 years ago

How can this problem be solved?

Nayef211 commented 2 years ago

Closing this since torchtext.data.Iterator has been deleted and is no longer available in the latest releases of our library.