NVIDIA / DALI

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
https://docs.nvidia.com/deeplearning/dali/user-guide/docs/index.html
Apache License 2.0

Can DALI be integrated into HuggingFace Trainer? #5421

Open ShyFoo opened 5 months ago

ShyFoo commented 5 months ago

Describe the question.

Is it possible to use DALI as the dataloader for HuggingFace Trainer?


JanuszL commented 5 months ago

Hi @ShyFoo,

Thank you for reaching out. While we haven't tried this directly yet, DALI provides compatibility with PyTorch, so you can create a DALI iterator that behaves like a PyTorch one. Have you tried this yet? Have you encountered any issues?
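
A minimal sketch of that approach, assuming an image-classification layout (the file_root path, batch size, and sizes are placeholders):

```python
from nvidia.dali import pipeline_def
import nvidia.dali.fn as fn
import nvidia.dali.types as types
from nvidia.dali.plugin.pytorch import DALIGenericIterator, LastBatchPolicy

@pipeline_def(batch_size=32, num_threads=4, device_id=0)
def image_pipeline():
    # Read encoded JPEGs and labels, decode on the GPU, then normalize.
    jpegs, labels = fn.readers.file(file_root="/path/to/images", name="Reader")
    images = fn.decoders.image(jpegs, device="mixed")
    images = fn.resize(images, resize_x=224, resize_y=224)
    images = fn.crop_mirror_normalize(
        images, dtype=types.FLOAT, output_layout="CHW",
        mean=[0.485 * 255, 0.456 * 255, 0.406 * 255],
        std=[0.229 * 255, 0.224 * 255, 0.225 * 255])
    return images, labels

pipe = image_pipeline()
pipe.build()

# Behaves like a PyTorch loader: yields one dict per pipeline per step.
loader = DALIGenericIterator(
    [pipe], ["pixel_values", "labels"],
    reader_name="Reader", last_batch_policy=LastBatchPolicy.PARTIAL)

for batch in loader:
    data = batch[0]  # {"pixel_values": ..., "labels": ...}, already on GPU
```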

JanuszL commented 5 months ago

I just checked and, unfortunately, the HuggingFace Trainer (I checked only one example so far) expects a PyTorch data iterator (it checks the type of the passed object). I still believe it is possible to make DALI work there, but it may need a few adjustments to the internal methods. If you have some spare time, we would be more than happy to review a PR with such an example.
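
One possible direction, sketched below: subclass Trainer and override get_train_dataloader so the DALI iterator is returned directly. This is untested; DALITrainer and the dali_iterator argument are hypothetical names, not an existing DALI or HuggingFace API:

```python
from transformers import Trainer

class DALITrainer(Trainer):
    """Hypothetical Trainer subclass that hands back a DALI iterator
    instead of constructing a torch DataLoader."""

    def __init__(self, *args, dali_iterator=None, **kwargs):
        super().__init__(*args, **kwargs)
        self.dali_iterator = dali_iterator

    def get_train_dataloader(self):
        # The DALI iterator already batches, shuffles, and moves data
        # to the GPU, so we skip the DataLoader construction entirely.
        return self.dali_iterator
```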

ShyFoo commented 5 months ago


Thanks for your reply. Yes, I have tried integrating DALI into the HuggingFace Trainer. As you mentioned, it seems to expect either torch.utils.data.Dataset or torch.utils.data.IterableDataset as input. It might be possible to customize a data collator for the HuggingFace Trainer that preprocesses data in a DALI pipeline, but I'm not sure whether it will work. If you have any ideas about this, please let me know. I'm sure it would be a huge step for the whole community!
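
A rough sketch of that collator idea, assuming the dataset yields raw encoded image bytes and using an fn.external_source input (the "raw", "bytes", and "label" names, and all sizes, are hypothetical):

```python
import numpy as np
import torch
from nvidia.dali import pipeline_def
import nvidia.dali.fn as fn
import nvidia.dali.types as types
from nvidia.dali.plugin.pytorch import feed_ndarray

@pipeline_def(batch_size=8, num_threads=2, device_id=0)
def decode_pipeline():
    # Encoded bytes are supplied from Python via feed_input below.
    raw = fn.external_source(name="raw", dtype=types.UINT8)
    images = fn.decoders.image(raw, device="mixed")
    images = fn.resize(images, resize_x=224, resize_y=224)
    return fn.crop_mirror_normalize(
        images, dtype=types.FLOAT, output_layout="CHW")

class DALICollator:
    """Hypothetical collator that runs a DALI pipeline per batch."""

    def __init__(self):
        self.pipe = decode_pipeline()
        self.pipe.build()

    def __call__(self, features):
        # Feed the raw encoded images gathered by the DataLoader; the
        # batch size must line up with the pipeline's batch_size
        # (dataloader_drop_last=True on the Trainer side helps).
        raw = [np.frombuffer(f["bytes"], dtype=np.uint8) for f in features]
        self.pipe.feed_input("raw", raw)
        (images,) = self.pipe.run()
        dense = images.as_tensor()  # assumes uniform output shapes
        out = torch.empty(dense.shape(), dtype=torch.float32, device="cuda")
        feed_ndarray(dense, out)    # copy the DALI output into torch
        labels = torch.tensor([f["label"] for f in features])
        return {"pixel_values": out, "labels": labels}
```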

JanuszL commented 5 months ago

I did some basic investigation:

ShyFoo commented 5 months ago

For the third point, get_train_dataloader returns self.accelerator.prepare(DataLoader(train_dataset, **dataloader_params)) for distributed training or evaluation. So what should I do in this setting? Hoping for your help.

JanuszL commented 5 months ago

@ShyFoo - I'm afraid I don't know the answer. My idea was to avoid wrapping DALI in a DataLoader, as it already serves as one. If you have time, I would be more than happy to learn about your findings.
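
A sketch of one way to handle the distributed case without accelerator.prepare: each rank builds its own pipeline and the DALI reader does the sharding. This assumes a torchrun-style launcher for the env vars; the path is a placeholder:

```python
import os
from nvidia.dali import pipeline_def
import nvidia.dali.fn as fn

# Each rank builds its own pipeline; the reader shards the dataset,
# so no DistributedSampler or DataLoader wrapping is needed.
rank = int(os.environ.get("RANK", "0"))
world_size = int(os.environ.get("WORLD_SIZE", "1"))
local_rank = int(os.environ.get("LOCAL_RANK", "0"))

@pipeline_def(batch_size=32, num_threads=4, device_id=local_rank)
def sharded_pipeline():
    jpegs, labels = fn.readers.file(
        file_root="/path/to/images",          # placeholder path
        shard_id=rank, num_shards=world_size,
        random_shuffle=True, name="Reader")
    images = fn.decoders.image(jpegs, device="mixed")
    return images, labels
```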

ShyFoo commented 5 months ago


Got it. I have tried using DALI + DeepSpeed, where I had to define the training loop and other utility functions myself instead of using the HuggingFace Trainer, so for convenience I would prefer to use DALI, DeepSpeed, and the HuggingFace Trainer simultaneously if possible.
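
For reference, a minimal sketch of that manual loop; model, ds_config, num_epochs, and the DALI loader from earlier are assumed names, and this is not a complete script:

```python
import deepspeed
import torch

# model: a torch.nn.Module; ds_config: a DeepSpeed config dict or JSON
# path; loader: a DALIGenericIterator as sketched above. All assumed.
engine, optimizer, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config)

for epoch in range(num_epochs):
    for batch in loader:
        data = batch[0]
        inputs = data["pixel_values"].to(engine.device)
        labels = data["labels"].squeeze(-1).long().to(engine.device)
        outputs = engine(inputs)
        loss = torch.nn.functional.cross_entropy(outputs, labels)
        engine.backward(loss)   # DeepSpeed handles scaling/accumulation
        engine.step()
    loader.reset()              # DALI iterators need an explicit reset
```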

JanuszL commented 5 months ago

If you have any examples of DALI + DeepSpeed, feel free to post them as a PR to DALI. We would be more than happy to enrich our documentation and our base of samples.

ShyFoo commented 5 months ago

Well, I'd be happy to.