deepakkupanda closed 2 years ago
Just google it and you will find that you are currently in a directory where you don't have write permissions.
Thanks for your quick reply. I do have access to work_dirs. How can I change the location of the path where it is getting saved? Thanks!
You can use --work-dir.
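For readers unfamiliar with the flag, here is a minimal sketch of how a --work-dir option usually takes precedence over the work_dir value in a config. The function name resolve_work_dir, the plain-dict cfg, and the "./work_dirs" default are illustrative assumptions, not the actual tools/train.py code.

```python
import argparse


def resolve_work_dir(argv, cfg):
    """Return the output directory, preferring the CLI flag over the config.

    `cfg` is a plain dict standing in for a loaded config (an assumption
    made for this sketch).
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("config")
    parser.add_argument("--work-dir", default=None)
    args = parser.parse_args(argv)

    # The command-line value wins when it is given.
    if args.work_dir is not None:
        return args.work_dir
    # Otherwise fall back to the config, then to a hypothetical default.
    return cfg.get("work_dir", "./work_dirs")
```

Note that changing the output directory only helps if the new location is on a filesystem that allows the operations the checkpoint code performs.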
I changed the work_dir but it was still giving the error.
I followed the comments from this issue: "Some environments do not support os.symlink so that you can add an argument in the checkpoint_cfg field in config files, like checkpoint_cfg=dict(create_symlink=False)." I added this to the checkpoint config in the schedule: checkpoint_config = dict(by_epoch=False, interval=16000, create_symlink=False)
Traceback (most recent call last):
File "/mnt/batch/tasks/shared/LS_root/mounts/clusters/deepakpanda9/code/Users/deepakpanda/segmentation/mmsegmentation/tools/train.py", line 240, in <module>
main()
File "/mnt/batch/tasks/shared/LS_root/mounts/clusters/deepakpanda9/code/Users/deepakpanda/segmentation/mmsegmentation/tools/train.py", line 229, in main
train_segmentor(
File "/mnt/batch/tasks/shared/LS_root/mounts/clusters/deepakpanda9/code/Users/deepakpanda/segmentation/mmsegmentation/mmseg/apis/train.py", line 191, in train_segmentor
runner.run(data_loaders, cfg.workflow)
File "/anaconda/envs/open-mmlab/lib/python3.10/site-packages/mmcv/runner/iter_based_runner.py", line 134, in run
iter_runner(iter_loaders[i], **kwargs)
File "/anaconda/envs/open-mmlab/lib/python3.10/site-packages/mmcv/runner/iter_based_runner.py", line 67, in train
self.call_hook('after_train_iter')
File "/anaconda/envs/open-mmlab/lib/python3.10/site-packages/mmcv/runner/base_runner.py", line 309, in call_hook
getattr(hook, fn_name)(self)
File "/anaconda/envs/open-mmlab/lib/python3.10/site-packages/mmcv/runner/hooks/checkpoint.py", line 167, in after_train_iter
self._save_checkpoint(runner)
File "/anaconda/envs/open-mmlab/lib/python3.10/site-packages/mmcv/runner/dist_utils.py", line 129, in wrapper
return func(*args, **kwargs)
File "/anaconda/envs/open-mmlab/lib/python3.10/site-packages/mmcv/runner/hooks/checkpoint.py", line 121, in _save_checkpoint
runner.save_checkpoint(
File "/anaconda/envs/open-mmlab/lib/python3.10/site-packages/mmcv/runner/iter_based_runner.py", line 220, in save_checkpoint
mmcv.symlink(filename, dst_file)
File "/anaconda/envs/open-mmlab/lib/python3.10/site-packages/mmcv/utils/path.py", line 36, in symlink
os.symlink(src, dst, **kwargs)
OSError: [Errno 95] Operation not supported: 'iter_16000.pth' -> '/mnt/batch/tasks/shared/LS_root/mounts/clusters/deepakpanda9/code/Users/deepakpanda/segmentation/mmsegmentation/work_dirs/upernet_beit-base_640x640_160k_ade20k/latest.pth'
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 27006) of binary: /anaconda/envs/open-mmlab/bin/python
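The traceback ends in os.symlink raising OSError (Errno 95) because the mounted filesystem does not support symbolic links. As a rough sketch of one way to cope with that, the helper below tries to create the link and falls back to copying the file. The name symlink_or_copy is hypothetical and is not part of mmcv; it only illustrates the idea.

```python
import os
import shutil


def symlink_or_copy(src, dst):
    """Create dst as a symlink to src, or copy the file if that fails.

    Hypothetical helper for illustration. Returns "symlink" or "copy".
    """
    # Remove a stale link or file left from an earlier checkpoint.
    if os.path.lexists(dst):
        os.remove(dst)
    try:
        os.symlink(src, dst)
        return "symlink"
    except OSError:
        # A relative symlink target is resolved against dst's directory,
        # so resolve src the same way before copying.
        if os.path.isabs(src):
            src_path = src
        else:
            src_path = os.path.join(os.path.dirname(dst), src)
        shutil.copy(src_path, dst)
        return "copy"
```

A call such as symlink_or_copy("iter_16000.pth", "/path/to/work_dir/latest.pth") mirrors the relative link in the error message above; the path here is a placeholder.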