Closed: Kami-chanw closed this issue 2 months ago
I solved this problem with the following approach (perhaps nobody else will hit the same problem): pass `use_distributed_sampler=False` to the `Trainer` when instantiating it. By default, Lightning replaces the DataLoader's sampler (`RandomSampler`, `SequentialSampler`, etc.) with a `DistributedSampler` when `trainer.world_size > 1`.
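For example, a minimal sketch (assuming Lightning >= 2.0, where the flag is named `use_distributed_sampler`; the accelerator/devices settings are placeholders):

```python
# Minimal sketch, assuming Lightning >= 2.0; accelerator/devices are placeholders.
import lightning as L

trainer = L.Trainer(
    accelerator="gpu",
    devices=2,
    use_distributed_sampler=False,  # keep my custom batch sampler untouched
)
# Note: with this flag off, sharding the data across ranks is my own responsibility.
```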
Outline & Motivation
I created a batch sampler of batches, which is used to sample a larger batch. For example, assume we have a PyTorch `BatchSampler` that yields batches of `batch_size=3`. My custom batch sampler, whose `batch_size=5`, samples 5 times from the underlying batch sampler to yield one large batch of `batch_size=15`. However, in the function `_dataloader_init_kwargs_resolve_sampler`, Lightning tries to inject a normal sampler (which only yields one batch) into my custom batch sampler.

Pitch
I want to know what the correct approach is in my situation.
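To make the setup concrete, here is a minimal sketch of the kind of wrapper I mean (the class name `BatchOfBatchesSampler` and the grouping logic are illustrative, not my actual code):

```python
# Illustrative sketch only; class name and grouping logic are hypothetical.
from torch.utils.data import BatchSampler, DataLoader, Sampler, SequentialSampler


class BatchOfBatchesSampler(Sampler):
    """Groups `n_inner` batches from an inner BatchSampler into one large batch."""

    def __init__(self, inner: BatchSampler, n_inner: int):
        self.inner = inner
        self.n_inner = n_inner

    def __iter__(self):
        buffer, count = [], 0
        for batch in self.inner:      # each `batch` is a list of indices
            buffer.extend(batch)
            count += 1
            if count == self.n_inner:
                yield buffer          # e.g. 5 inner batches of 3 -> 15 indices
                buffer, count = [], 0
        if buffer:                    # trailing partial large batch
            yield buffer

    def __len__(self):
        return (len(self.inner) + self.n_inner - 1) // self.n_inner


inner = BatchSampler(SequentialSampler(range(100)), batch_size=3, drop_last=False)
outer = BatchOfBatchesSampler(inner, n_inner=5)
loader = DataLoader(range(100), batch_sampler=outer)
# Lightning's `_dataloader_init_kwargs_resolve_sampler` later tries to inject a
# plain sampler into this custom batch sampler, which is the problem described above.
```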
Additional context
No response
cc @justusschock @awaelchli