ccblublu closed this issue 3 years ago
I'm also facing the same issue while trying to train on iOD/all_20_train.yaml.
@deepakksingh I guess it's because of the version of the referenced detectron2 package, but I'm not sure. It seems it cannot build the right cfg from OWOD. Are there any differences between the files in the detectron2 folder and the installed detectron2 package? @JosephKJ
Hi @ccblublu, @deepakksingh: There are many changes to files within the detectron2 folder. The whole repository should be used as is. It is not designed to work off-the-shelf with other detectron2 versions.
Further, can you share the exact command that you used?
Hi @JosephKJ, I used the following command:
python tools/train_net.py --num-gpus=4 --config-file=./configs/OWOD/iOD/all_20_train.yaml
Also, have you built detectron2 from this repo?
I already have detectron2 built, so I did not build from this repo. A silly question: if I build from this repo, will my pre-built build be affected?
@deepakksingh : That is the reason. You need to build detectron2 from this repo.
"If I build from this repo, will my pre-built build be affected?": Yes. Please use another python env to sandbox your projects.
Thank you @JosephKJ. I will try that out.
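One way to confirm which detectron2 an environment actually picks up is to look at where the module resolves from. The following is a minimal sketch, not something from the repository: the helper names `module_origin` and `is_inside` are made up for illustration, and the path to the OWOD checkout is assumed to be passed as the first command-line argument. Run it from outside the checkout, since Python puts the script's directory first on the import path and would otherwise find the local detectron2 folder regardless of what is installed.

```python
import importlib.util
import os
import sys


def module_origin(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


def is_inside(path, root):
    """True if `path` lies inside the directory tree rooted at `root`."""
    path = os.path.realpath(path)
    root = os.path.realpath(root)
    return os.path.commonpath([path, root]) == root


if __name__ == "__main__":
    # Assumption: the OWOD checkout path is given as the first argument.
    repo_root = sys.argv[1] if len(sys.argv) > 1 else os.getcwd()
    origin = module_origin("detectron2")
    if origin is None:
        print("detectron2 is not importable in this environment")
    elif is_inside(origin, repo_root):
        print("detectron2 resolves to the checkout:", origin)
    else:
        print("detectron2 resolves elsewhere:", origin)
```

If the reported path points at a different detectron2 install, the environment is still using the pre-built package rather than the one from this repo.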
I created a new environment and built the repo with the following command (I was in the OWOD directory, which already has the setup.py):
python -m pip install -e ./
I had to install some packages, and I also set the DETECTRON2_DATASETS path, but when I run the following command:
python tools/train_net.py --num-gpus=4 --config-file=./configs/OWOD/iOD/all_20_train.yaml
it is not able to use the detectron2 APIs to download the WEIGHTS and throws this:
AssertionError: Checkpoint detectron2://ImageNetPretrained/MSRA/R-50.pkl not found!
I noticed that there is also a deprecation warning:
** fvcore version of PathManager will be deprecated soon. **
** Please migrate to the version in iopath repo. **
https://github.com/facebookresearch/iopath
Command Line Args: Namespace(config_file='./configs/OWOD/iOD/all_20_train.yaml', dist_url='tcp://127.0.0.1:50712', eval_only=False, machine_rank=0, num_gpus=4, num_machines=1, opts=[], resume=False)
** fvcore version of PathManager will be deprecated soon. **
** Please migrate to the version in iopath repo. **
https://github.com/facebookresearch/iopath
@deepakksingh I ran into the same problem as you. Have you solved it? If so, could you please describe the solution in detail? Thanks!
Yes, I was able to resolve the issue.
I had to rebuild detectron2 as described in INSTALL.md. The reliability package was missing, and I had to downgrade fvcore as mentioned in the following issue.
This is the fix: Majiker/BalancedMetaSoftmax-InstanceSeg#3 (comment)
@deepakksingh I solved the problem too! Thank you very much!
Hello, I don't understand what "re-build" means. INSTALL.md has two sections, "Build Detectron2 from Source" and "Install Pre-Built Detectron2 (Linux only)". Which one should I follow? Looking forward to your reply.
I built the repo using the following steps:
Step 1: Download the repo and run the command: python -m pip install -e [Name of the folder] (e.g., python -m pip install -e OWOD).
Step 2: Install the missing packages (in my case: shortuuid, reliability).
Hope this helps!!
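To see which packages still need installing before launching training, a quick check like the one below may help. It is only a sketch: the helper `missing_modules` is hypothetical, the module list is taken from the names mentioned in this thread, and it assumes each package's import name matches its pip name. A given setup may need other packages as well.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Names taken from this thread; adjust the list for your own setup.
    required = ["fvcore", "shortuuid", "reliability"]
    missing = missing_modules(required)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages are importable")
```

This only checks that the packages can be found; it does not verify that fvcore is at a version compatible with this repo.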
@JosephKJ It sounds like the installation instructions are out of date. Has anyone updated them?
@haleqiu I recently built it again. What issues are you facing?
Hi, I am still getting the following error after following all the instructions above.
Traceback (most recent call last):
File "/home/.conda/envs/owod/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
fn(i, *args)
File "/scratch1/BW/Projects/OWOD/detectron2/engine/launch.py", line 94, in _distributed_worker
main_func(*args)
File "/scratch1/BW/Projects/OWOD/tools/train_net.py", line 132, in main
cfg = setup(args)
File "/scratch1/BW/Projects/OWOD/tools/train_net.py", line 127, in setup
default_setup(cfg, args)
File "/scratch1/zhe030/BW/Projects/OWOD/detectron2/engine/defaults.py", line 132, in default_setup
logger.info("Environment info:\n" + collect_env_info())
File "/scratch1/BW/Projects/OWOD/detectron2/utils/collect_env.py", line 136, in collect_env_info
msg = " - invalid!" if not os.path.isdir(CUDA_HOME) else ""
File "/home/.conda/envs/owod/lib/python3.6/genericpath.py", line 42, in isdir
st = os.stat(s)
TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType
@Jiyang-Zheng, could you check whether CUDA_HOME is set? Judging by the traceback, it seems to be missing.
Thanks @deepakksingh. That seems to be the reason. However, I'm using a cloud server, and I don't think I can change any of these settings. Do you think there is any way to get around this issue?
In detectron2, collect_env_info is called to collect the environment info and dump it in the logs. You can try commenting out line 136 in collect_env.py, but the issue may come up again later during execution. Are you sure your cloud server has the CUDA modules loaded and available? I do not have any experience with cloud servers. I would also suggest checking detectron2's official issues for the same problem.
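As an alternative to commenting the line out, the check can be guarded so that a missing value is reported instead of raising. The sketch below is an illustration, not the actual code in collect_env.py or the fix from any upstream issue: `cuda_home_note` is a made-up helper, and reading the `CUDA_HOME` environment variable in the usage part is only an approximation of how the value is determined inside detectron2.

```python
import os


def cuda_home_note(cuda_home):
    """Describe a CUDA_HOME value without raising when it is None."""
    if cuda_home is None:
        return "CUDA_HOME is not set"
    if not os.path.isdir(cuda_home):
        return "{} - invalid!".format(cuda_home)
    return str(cuda_home)


if __name__ == "__main__":
    # Assumption: the environment variable stands in for the value
    # that the logging code would otherwise pass to os.path.isdir.
    print(cuda_home_note(os.environ.get("CUDA_HOME")))
```

The guard only keeps the environment logging from crashing. If CUDA is genuinely unavailable on the machine, training on GPUs will still fail later.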
@deepakksingh Thanks, dude. I have now solved the issue. I appreciate your advice.
For anyone interested, https://github.com/facebookresearch/detectron2/issues/2254 has the fix.
Cheers
First, thank you for your excellent work. However, an error occurred when I tried to reproduce your code:
Traceback (most recent call last):
File "tools/train_net.py", line 172, in
args=(args,),
File "/media/chen/299D817A2D97AD94/detectron2/detectron2/engine/launch.py", line 62, in launch
main_func(*args)
File "tools/train_net.py", line 134, in main
cfg = setup(args)
File "tools/train_net.py", line 126, in setup
cfg.merge_from_file(args.config_file)
File "/media/chen/299D817A2D97AD94/detectron2/detectron2/config/config.py", line 55, in merge_from_file
self.merge_from_other_cfg(loaded_cfg)
File "/home/chen/.conda/envs/detectron/lib/python3.6/site-packages/fvcore/common/config.py", line 123, in merge_from_other_cfg
return super().merge_from_other_cfg(cfg_other)
File "/home/chen/.conda/envs/detectron/lib/python3.6/site-packages/yacs/config.py", line 217, in merge_from_other_cfg
_merge_a_into_b(cfg_other, self, self, [])
File "/home/chen/.conda/envs/detectron/lib/python3.6/site-packages/yacs/config.py", line 491, in _merge_a_into_b
raise KeyError("Non-existent config key: {}".format(full_key))
KeyError: 'Non-existent config key: OWOD'
I think I have installed detectron2 successfully. I look forward to your reply.
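The traceback above resolves detectron2 from a separate detectron2 checkout rather than from an OWOD folder, which is consistent with the cause discussed earlier in this thread: the config defaults in a stock detectron2 have no OWOD section, so merging the YAML file fails. The following is a simplified imitation of that kind of check, not the yacs implementation. The function `unknown_keys` and the sample keys (including `SOME_OPTION`) are hypothetical and exist only to illustrate the idea.

```python
def unknown_keys(defaults, overrides, prefix=""):
    """List dotted keys present in `overrides` but absent from `defaults`."""
    unknown = []
    for key, value in overrides.items():
        full_key = prefix + key
        if key not in defaults:
            unknown.append(full_key)
        elif isinstance(value, dict) and isinstance(defaults[key], dict):
            # Recurse into nested sections that exist on both sides.
            unknown.extend(unknown_keys(defaults[key], value, full_key + "."))
    return unknown


if __name__ == "__main__":
    # Hypothetical defaults without an OWOD section, and a hypothetical
    # config file that sets one.
    defaults = {"MODEL": {"WEIGHTS": ""}}
    overrides = {"MODEL": {"WEIGHTS": "some.pkl"}, "OWOD": {"SOME_OPTION": 1}}
    print(unknown_keys(defaults, overrides))
```

If the defaults come from the detectron2 folder in this repo, the OWOD section is expected to be defined there, which is why building detectron2 from this repo in a clean environment is the suggested route.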