synxlin / deep-gradient-compression

[ICLR 2018] Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training
https://arxiv.org/pdf/1712.01887.pdf
Apache License 2.0
212 stars, 45 forks

pin memory error #1

Open yimjinkyu1 opened 3 years ago

yimjinkyu1 commented 3 years ago

Traceback (most recent call last):
  File "train.py", line 415, in <module>
    main()
  File "train.py", line 191, in main
    loader=loaders['test'], split='test')
  File "train.py", line 315, in evaluate
    for step, (inputs, targets) in enumerate(loader):
  File "/home/sr6/jinkyu.yim/vir_python3.7/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 363, in __next__
    data = self._next_data()
  File "/home/sr6/jinkyu.yim/vir_python3.7/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 989, in _next_data
    return self._process_data(data)
  File "/home/sr6/jinkyu.yim/vir_python3.7/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1014, in _process_data
    data.reraise()
  File "/home/sr6/jinkyu.yim/vir_python3.7/lib/python3.7/site-packages/torch/_utils.py", line 395, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in pin memory thread for device 0.
Original Traceback (most recent call last):
  File "/home/sr6/jinkyu.yim/vir_python3.7/lib/python3.7/site-packages/torch/utils/data/_utils/pin_memory.py", line 31, in _pin_memory_loop
    data = pin_memory(data)
  File "/home/sr6/jinkyu.yim/vir_python3.7/lib/python3.7/site-packages/torch/utils/data/_utils/pin_memory.py", line 55, in pin_memory
    return [pin_memory(sample) for sample in data]
  File "/home/sr6/jinkyu.yim/vir_python3.7/lib/python3.7/site-packages/torch/utils/data/_utils/pin_memory.py", line 55, in <listcomp>
    return [pin_memory(sample) for sample in data]
  File "/home/sr6/jinkyu.yim/vir_python3.7/lib/python3.7/site-packages/torch/utils/data/_utils/pin_memory.py", line 47, in pin_memory
    return data.pin_memory()
RuntimeError: Error in dlopen or dlsym: libcaffe2_nvrtc.so: cannot open shared object file: No such file or directory
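
The last line of the traceback says the failure happens inside `data.pin_memory()` because `libcaffe2_nvrtc.so` cannot be loaded, which suggests the problem is in the local PyTorch installation rather than in `train.py` itself. As a rough way to confirm that, the sketch below probes whether pinning a tiny tensor works and falls back to `pin_memory=False` otherwise. The helper names `can_pin_memory` and `loader_kwargs` are hypothetical and not part of this repository, and this is only a workaround under the assumption that the data loaders accept a `pin_memory` flag; it does not repair the missing library.

```python
def can_pin_memory() -> bool:
    """Return True only if pinning a small CPU tensor succeeds here.

    Hypothetical helper (not part of this repository).
    """
    try:
        # Imported lazily so the probe also returns False where torch is absent.
        import torch

        # Pinned (page-locked) memory is only useful with a CUDA device.
        if not torch.cuda.is_available():
            return False

        # Same call that fails in the traceback, applied to a one-element tensor.
        torch.empty(1).pin_memory()
        return True
    except Exception:
        # Any failure (e.g. the dlopen RuntimeError above) means: do not pin.
        return False


def loader_kwargs() -> dict:
    """Keyword arguments that could be forwarded to torch.utils.data.DataLoader."""
    return {"pin_memory": can_pin_memory()}


if __name__ == "__main__":
    print("pin_memory usable:", can_pin_memory())
```

If the probe prints `False` on a machine that does have a GPU, passing `**loader_kwargs()` to the `DataLoader` should let evaluation proceed without pinned memory, at some cost in host-to-device transfer speed. Reinstalling a PyTorch build that matches the installed CUDA version is probably the more durable fix.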