Hi,
I have multiple large-scale datasets in TFDS format that need to be converted to iterable datasets, and I want to train a large-scale T5 model on TPUs with them. For this I need a distributed dataloader that can handle iterable datasets efficiently with PyTorch XLA. Here is an example of what I currently do when the datasets are not iterable:
```python
from torch.utils.data.distributed import DistributedSampler
import torch_xla.core.xla_model as xm

def get_sampler(dataset):
    # One shard of the map-style dataset per TPU core.
    return DistributedSampler(dataset, num_replicas=xm.xrt_world_size(), rank=xm.get_ordinal())
```
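For context, here is a minimal, untested sketch of the direction I have in mind for the iterable case: wrapping the TFDS dataset in a torch `IterableDataset` and sharding the stream per TPU core with tf.data's `shard()`, then feeding it through `MpDeviceLoader`. The dataset name `"my_dataset"` and the `TfdsIterableDataset` wrapper are placeholders of mine, not anything from your library:

```python
# Sketch only: shard a TFDS stream per XLA ordinal, since
# DistributedSampler only works with map-style datasets.
import tensorflow_datasets as tfds
from torch.utils.data import DataLoader, IterableDataset
import torch_xla.core.xla_model as xm
import torch_xla.distributed.parallel_loader as pl


class TfdsIterableDataset(IterableDataset):
    def __init__(self, name, split):
        self.name = name
        self.split = split

    def __iter__(self):
        ds = tfds.load(self.name, split=self.split)
        # Give each TPU core a disjoint shard of the example stream.
        ds = ds.shard(xm.xrt_world_size(), xm.get_ordinal())
        # Tokenization/collation for T5 is omitted; raw examples are
        # dicts of numpy arrays.
        yield from tfds.as_numpy(ds)


dataset = TfdsIterableDataset("my_dataset", "train")
# No sampler for iterable datasets; keep num_workers=0 so the shard
# is not duplicated across loader workers.
loader = DataLoader(dataset, batch_size=8, num_workers=0)
device_loader = pl.MpDeviceLoader(loader, xm.xla_device())
```

Is per-core sharding like this the recommended approach, or does your library already provide a utility for it?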
I would appreciate examples of how I can handle large-scale TFDS datasets and set up a distributed dataloader so that I can train models with your library.
Thanks.
Best,
Rabeeh