Open benihime91 opened 2 months ago
Hi, how can I integrate MosaicML Streaming with Hugging Face Accelerate?
Normally, with a typical dataset and dataloader, you would need to do
data_loader = accelerator.prepare(data_loader)
and internally I think Accelerate wraps the loader in a DistributedSampler of sorts. Is this required when using a MosaicML StreamingDataset? Or can I skip this step, following this comment: https://github.com/mosaicml/streaming/issues/225#issuecomment-1510478052
My use case is multi-GPU, multi-node training.
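For context on what "a DistributedSampler of sorts" means here, the sketch below shows the kind of per-rank index sharding such a wrapper typically performs. It is a simplified plain-Python illustration, not Accelerate's or Streaming's actual implementation: the function name `shard_indices` is hypothetical, shuffling is omitted, and padding by wrapping around to the start of the dataset is an assumption modelled on the usual non-drop-last behaviour.

```python
def shard_indices(num_samples, rank, world_size):
    """Return the sample indices a single rank would read (no shuffling).

    Hypothetical helper for illustration only.
    """
    if num_samples < 1 or world_size < 1 or not 0 <= rank < world_size:
        raise ValueError("need num_samples >= 1 and 0 <= rank < world_size")

    # Round the total up so that every rank receives the same number of samples.
    per_rank = -(-num_samples // world_size)  # ceiling division
    total = per_rank * world_size

    # Pad by wrapping around to the start of the dataset.
    padded = [i % num_samples for i in range(total)]

    # Each rank takes every world_size-th index, starting at its own rank.
    return padded[rank:total:world_size]


if __name__ == "__main__":
    # 10 samples split across 4 ranks (e.g. 2 nodes with 2 GPUs each).
    for rank in range(4):
        print(rank, shard_indices(10, rank, 4))
```

The point of the sketch is that each rank ends up with a disjoint (apart from padding) slice of the data. Whether the loader still needs to go through `accelerator.prepare` depends on whether the dataset already does this partitioning itself, which is the question being asked above.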
Hey @benihime91, what was the solution here? We've had some folks ask about using HF Accelerate, so it would be good to know so that we can add it to the docs.
cc @XiaohanZhangCMU