JonasGeiping / cramming

Cramming the training of a (BERT-type) language model into limited compute.
MIT License

initialize prefetch_factor to None when num_workers is zero #23

Closed: itay-nakash closed this 1 year ago

itay-nakash commented 1 year ago

Hi, it seems that `prefetch_factor` is initialized to 2 when `num_workers == 0` (in `cramming/cramming/backend/utils`, in `prepare_downstream_dataloader`):

dataloader = DataLoader(
    dataset,
    batch_size=cfg_impl.microbatch_size,
    sampler=sampler,
    num_workers=num_workers,
    pin_memory=cfg_impl.pin_memory,
    drop_last=True if mode == "training" else False,
    prefetch_factor=cfg_impl.prefetch_factor if num_workers > 0 else 2,
    persistent_workers=False,
    collate_fn=collate_fn,
)
return dataloader

However, in newer PyTorch versions you cannot set `prefetch_factor` to anything other than None when `num_workers == 0`. [link to a relevant issue ]

The check in the PyTorch code:

    if persistent_workers and num_workers == 0:
        raise ValueError('persistent_workers option needs num_workers > 0')

(from here, in the PyTorch source)
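
For illustration, here is a minimal sketch (not part of the original snippet; the dataset and batch size are placeholders) of how this surfaces on newer PyTorch versions:

    import torch
    from torch.utils.data import DataLoader, TensorDataset

    dataset = TensorDataset(torch.arange(8))

    # On newer PyTorch versions this raises a ValueError,
    # since prefetch_factor must stay None when num_workers == 0:
    loader = DataLoader(dataset, batch_size=2, num_workers=0, prefetch_factor=2)

    # Leaving prefetch_factor as None (or simply omitting it) works:
    loader = DataLoader(dataset, batch_size=2, num_workers=0)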

Therefore, I changed it to initialize `prefetch_factor` to None if `num_workers <= 0`.
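
In other words, something along these lines (a sketch of the proposed change, reusing the names from the snippet above):

    dataloader = DataLoader(
        dataset,
        batch_size=cfg_impl.microbatch_size,
        sampler=sampler,
        num_workers=num_workers,
        pin_memory=cfg_impl.pin_memory,
        drop_last=True if mode == "training" else False,
        # None is the only value newer PyTorch accepts when num_workers == 0:
        prefetch_factor=cfg_impl.prefetch_factor if num_workers > 0 else None,
        persistent_workers=False,
        collate_fn=collate_fn,
    )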

I added these changes in case you think it's relevant to the project.

Thanks

JonasGeiping commented 1 year ago

Thanks!