Hi, thanks for sharing this nice work!
I tried to train voc8.res50v3+.CPS with 2 V100 GPUs. The log is as follows:
03 00:47:17 PyTorch Version 1.0.0, Furnace Version 0.1.1
continue_state_object: None
03 00:47:17 PyTorch Version 1.0.0, Furnace Version 0.1.1
continue_state_object: None
Warning: using Python fallback for SyncBatchNorm, possibly because apex was installed without --cuda_ext. The exception raised when attempting to import the cuda backend was: /home/ user/.conda/envs/semiseg/lib/python3.6/site-packages/apex-0.1-py3.6-linux-x86_64.egg/syncbn.cpython-36m-x86_64-linux-gnu.so: undefined symbol: __cudaRegisterFatBinaryEnd
03 00:47:26 Load model, Time usage:
IO: 0.0604250431060791, initialize parameters: 0.03919053077697754
03 00:47:26 Load model, Time usage:
IO: 0.0598149299621582, initialize parameters: 0.04204106330871582
Warning: using Python fallback for SyncBatchNorm, possibly because apex was installed without --cuda_ext. The exception raised when attempting to import the cuda backend was: /home/ user/.conda/envs/semiseg/lib/python3.6/site-packages/apex-0.1-py3.6-linux-x86_64.egg/syncbn.cpython-36m-x86_64-linux-gnu.so: undefined symbol: __cudaRegisterFatBinaryEnd
distributed !!
03 00:47:27 Load model, Time usage:
IO: 0.05859684944152832, initialize parameters: 0.04361605644226074
03 00:47:27 Load model, Time usage:
IO: 0.05780911445617676, initialize parameters: 0.03970694541931152
distributed !!
03 00:47:46 using devices 0, 1
Any idea what is wrong here?