argonne-lcf / dlio_benchmark

An I/O benchmark for deep learning applications
https://dlio-benchmark.readthedocs.io
Apache License 2.0
70 stars · 30 forks

enable option to disable pin_memory in pytorch #239

Closed rayandrew closed 1 month ago

rayandrew commented 1 month ago

Hi @zhenghh04 and @hariharan-devarajan,

This PR gives us the option to disable pinned memory in the PyTorch dataloader. I encountered an error while emulating Stormer when we increased the prefetch factor, since the pinned memory is reserved and cannot be released until the dataloader is finished. Also, pin_memory is only beneficial for host-to-GPU memory transfers, which we do not use at all in DLIO [1, 2].

I set the default to True for backward compatibility; let me know if I should make pin_memory=False the default in the configuration instead.

Reference: [1] https://pytorch.org/tutorials/intermediate/pinmem_nonblock.html [2] https://discuss.pytorch.org/t/when-to-set-pin-memory-to-true/19723/19
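For illustration, a minimal sketch of how such a flag could be threaded into the PyTorch DataLoader keyword arguments. This is not the actual DLIO code; the `build_loader_kwargs` helper and the plain config dict are assumptions (DLIO reads its settings from a Hydra/YAML configuration):

```python
# Sketch only: the helper name and config-dict access are hypothetical.
def build_loader_kwargs(config):
    return {
        "num_workers": config.get("read_threads", 0),
        # Default True for backward compatibility, as proposed in this PR;
        # setting pin_memory to False releases the pinned host memory.
        "pin_memory": config.get("pin_memory", True),
    }

kwargs = build_loader_kwargs({"read_threads": 4, "pin_memory": False})
```

With this shape, omitting the key preserves the old behavior, while an explicit `pin_memory: false` in the config disables pinning.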

rayandrew commented 1 month ago

Along with this PR, I just want to point out the prefetch-factor logic below, which came up in a recent discussion with @hariharan-devarajan:

# torch_data_loader.py
        if self._args.read_threads >= 1:
            prefetch_factor = math.ceil(self._args.prefetch_size / self._args.read_threads)
        else:
            prefetch_factor = self._args.prefetch_size

Here, we limit the prefetch factor based on the number of threads. Is that intended? From a user's perspective, it would probably be better to pass the configured value through as the prefetch factor directly, without assuming it should be divided by the number of workers.
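For context, PyTorch counts `prefetch_factor` per worker, so the total number of batches loaded in advance is roughly `prefetch_factor * num_workers`. A small sketch of the current division and its effective total (the function name is mine; the logic is copied from the snippet above):

```python
import math

def dlio_prefetch_factor(prefetch_size, read_threads):
    # Current logic from torch_data_loader.py: spread prefetch_size
    # across the workers so the total in-flight count stays near it.
    if read_threads >= 1:
        return math.ceil(prefetch_size / read_threads)
    return prefetch_size

per_worker = dlio_prefetch_factor(8, 4)   # 2 batches per worker
total_in_flight = per_worker * 4          # ~8 batches overall
```

So the division keeps the total near `prefetch_size`; passing the value through unchanged would instead multiply the in-flight total by the worker count.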

hariharan-devarajan commented 1 month ago

> Along with this PR, I just want to point out the prefetch-factor logic below, which came up in a recent discussion with @hariharan-devarajan:
>
>     # torch_data_loader.py
>         if self._args.read_threads >= 1:
>             prefetch_factor = math.ceil(self._args.prefetch_size / self._args.read_threads)
>         else:
>             prefetch_factor = self._args.prefetch_size
>
> Here, we limit the prefetch factor based on the number of threads. Is that intended? From a user's perspective, it would probably be better to pass the configured value through as the prefetch factor directly, without assuming it should be divided by the number of workers.

@zhenghh04 I agree with Ray that this is confusing for users of the benchmark. I would recommend not doing this.