Lightning-AI / pytorch-lightning

Pretrain, finetune ANY AI model of ANY size on multiple GPUs, TPUs with zero code changes.
https://lightning.ai
Apache License 2.0
28.3k stars 3.38k forks

"Streaming Datasets" link is broken on "COMPLEX DATA USES" doc page #17363

Closed: lensonic closed this issue 1 year ago

lensonic commented 1 year ago

📚 Documentation

On https://lightning.ai/docs/pytorch/stable/data/data.html, the "Streaming Datasets" link points to https://lightning.ai/docs/pytorch/stable/data/streaming.html, which doesn't exist.

cc @borda

awaelchli commented 1 year ago

@lensonic Are you interested in sending a PR with the updated link(s)?

cdreetz commented 1 year ago

@awaelchli Does PyTorch Lightning no longer use its own streaming library? As the original issue states, the Lightning streaming link no longer works, and according to the links below, streaming is available, but through the PyTorch or Mosaic libraries. I would send a PR with updated links, but there appear to be no working Lightning documentation links on streaming.

https://lightning.ai/docs/pytorch/stable/data/alternatives.html
Mosaic Streaming: https://github.com/mosaicml/streaming
PyTorch Iterable Dataloader for Streaming: https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader

hebbaraditya commented 1 year ago

I would like to work on this issue and close it; please assign it to me.

VictorPrins commented 1 year ago

The "Streaming Datasets" link mentioned in this issue appears to have been removed from the docs in version 2.0.3. Compare https://lightning.ai/docs/pytorch/2.0.2/data/data.html with https://lightning.ai/docs/pytorch/2.0.3/data/data.html. So I guess the issue could be closed, as it is no longer as relevant. cc @Borda
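
For anyone who wants to repeat this kind of check, the version-to-version comparison above can be sketched roughly as follows. This is only a minimal illustration using Python's standard `html.parser`; the helper names (`LinkCollector`, `extract_links`, `removed_links`) and the two page fragments are hypothetical and hand-written for the example, not the real docs HTML, and the `faster.html` file name is an assumption.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, link text) pairs from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the anchor currently being read
        self._text = []     # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return the set of (href, text) pairs found in an HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    return set(parser.links)


def removed_links(old_html, new_html):
    """Links present in the old page but missing from the new one."""
    return sorted(extract_links(old_html) - extract_links(new_html))


# Hypothetical stand-ins for the 2.0.2 and 2.0.3 pages (not the real HTML).
OLD_PAGE = (
    '<ul><li><a href="streaming.html">Streaming Datasets</a></li>'
    '<li><a href="faster.html">Faster DataLoaders</a></li></ul>'
)
NEW_PAGE = '<ul><li><a href="faster.html">Faster DataLoaders</a></li></ul>'

print(removed_links(OLD_PAGE, NEW_PAGE))
# → [('streaming.html', 'Streaming Datasets')]
```

To run it against the live pages, the two strings could be replaced with HTML fetched via `urllib.request.urlopen`, assuming the pages are reachable.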

awaelchli commented 1 year ago

Correct, that's because the Streaming Dataset docs are a section under the Faster DataLoaders link right next to it.

Thanks for checking @VictorPrins