Closed yaroslavvb closed 6 years ago
Is dist-url:file not meant for multiple instances? @bearpelican
'--dist-url', 'file:///home/ubuntu/data/file.sync', # single instances are faster with file sync
#0 0x00007fd720987c1d in nanosleep () at ../sysdeps/unix/syscall-template.S:84
#1 0x00007fd70074e3f0 in thd::init::initFile(std::string, int, std::string, int) () from /home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so
#2 0x00007fd70074c0d7 in std::_Function_handler<thd::InitMethod::Config (std::string, int, std::string, int), thd::InitMethod::Config (*)(std::string, int, std::string, int)>::_M_invoke(std::_Any_data const&, std::string, int, std::string, int) () from /home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so
#3 0x00007fd70074b781 in thd::getInitConfig(std::string, int, std::string, int) () from /home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/_C.cpython-
Good question. I'm not sure I ever tried file sync across multiple machines. I probably just assumed syncing through EFS would be slower.
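For anyone landing here later, a minimal sketch of one way to pick the URL per run. It assumes (not confirmed in this thread) that the hang in initFile comes from the sync file sitting on a local path the other instances cannot see. Both file:// and tcp:// are init methods that torch.distributed accepts; the helper names, num_machines, master_addr, and master_port are hypothetical and not from the original launch script.

```python
from urllib.parse import urlparse


def choose_dist_url(num_machines, sync_file, master_addr="127.0.0.1", master_port=6006):
    """Return a value for --dist-url (hypothetical helper, illustration only)."""
    if num_machines < 1:
        raise ValueError("num_machines must be at least 1")
    if num_machines == 1:
        # All ranks are on one machine and see the same local file,
        # so a file rendezvous can work.
        return "file://" + sync_file
    # Assumption: with several machines, a local path is not shared,
    # so use a TCP rendezvous on the rank-0 host instead.
    return "tcp://{}:{}".format(master_addr, master_port)


def is_file_rendezvous(dist_url):
    """True if the URL uses the file:// scheme."""
    return urlparse(dist_url).scheme == "file"
```

If the sync file really is on storage mounted by every instance (EFS, for example), the file:// form may still be usable across machines; this sketch just takes the conservative route.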