JosephKJ / OWOD

(CVPR 2021 Oral) Open World Object Detection
https://josephkj.in
Apache License 2.0
1.04k stars, 155 forks

KeyError: 'Non-existent config key: OWOD' #10

Closed ccblublu closed 3 years ago

ccblublu commented 3 years ago

First, thank you for your excellent work. An error occurred when I tried to reproduce your code:

Traceback (most recent call last):
  File "tools/train_net.py", line 172, in <module>
    args=(args,),
  File "/media/chen/299D817A2D97AD94/detectron2/detectron2/engine/launch.py", line 62, in launch
    main_func(*args)
  File "tools/train_net.py", line 134, in main
    cfg = setup(args)
  File "tools/train_net.py", line 126, in setup
    cfg.merge_from_file(args.config_file)
  File "/media/chen/299D817A2D97AD94/detectron2/detectron2/config/config.py", line 55, in merge_from_file
    self.merge_from_other_cfg(loaded_cfg)
  File "/home/chen/.conda/envs/detectron/lib/python3.6/site-packages/fvcore/common/config.py", line 123, in merge_from_other_cfg
    return super().merge_from_other_cfg(cfg_other)
  File "/home/chen/.conda/envs/detectron/lib/python3.6/site-packages/yacs/config.py", line 217, in merge_from_other_cfg
    _merge_a_into_b(cfg_other, self, self, [])
  File "/home/chen/.conda/envs/detectron/lib/python3.6/site-packages/yacs/config.py", line 491, in _merge_a_into_b
    raise KeyError("Non-existent config key: {}".format(full_key))
KeyError: 'Non-existent config key: OWOD'

I think I have installed detectron2 successfully. I look forward to your reply.

deepaksinghcv commented 3 years ago

I'm facing the same issue while trying to train on iOD/all_20_train.yaml.

ccblublu commented 3 years ago

@deepakksingh I guess it's because of the version of the referenced detectron2 package, but I'm not sure. It seems that it cannot build the right cfg from OWOD. Are there any differences between the files in this repo's detectron2 folder and the installed detectron2 package? @JosephKJ

JosephKJ commented 3 years ago

Hi @ccblublu, @deepakksingh: There are many changes to files within the detectron2 folder. The whole repository should be used as is. It is not designed to work off-the-shelf with other detectron2 versions.

Further, can you share the exact command that you used?
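
The KeyError in the traceback above comes from a strict merge: every key in the YAML file must already be declared in the defaults of the detectron2 build that gets imported. The following is a minimal pure-Python sketch of that behaviour. It is a simplified stand-in, not the yacs implementation, and the function name merge_strict, the option name EXAMPLE_OPTION, and both default dictionaries are hypothetical.

```python
def merge_strict(defaults, loaded, path=()):
    """Merge `loaded` into `defaults`, rejecting keys the defaults do not declare."""
    for key, value in loaded.items():
        full_key = ".".join(path + (key,))
        if key not in defaults:
            # Same message format as the one seen in the traceback.
            raise KeyError("Non-existent config key: {}".format(full_key))
        if isinstance(value, dict) and isinstance(defaults[key], dict):
            merge_strict(defaults[key], value, path + (key,))
        else:
            defaults[key] = value
    return defaults


# Hypothetical defaults of a stock build: no OWOD section is declared.
stock_defaults = {"MODEL": {"WEIGHTS": ""}}
# Hypothetical defaults of a build that does declare an OWOD section.
repo_defaults = {"MODEL": {"WEIGHTS": ""}, "OWOD": {"EXAMPLE_OPTION": False}}
# Hypothetical contents of a config file that sets an OWOD option.
loaded = {"OWOD": {"EXAMPLE_OPTION": True}}

try:
    merge_strict(stock_defaults, loaded)
except KeyError as err:
    print("stock defaults:", err)

print("repo defaults:", merge_strict(repo_defaults, loaded))
```

Under these assumptions, the same config file merges cleanly against one set of defaults and fails against the other, which is consistent with the advice to use the repository's own detectron2.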

deepaksinghcv commented 3 years ago

Hi @JosephKJ, I used the following command:

python tools/train_net.py --num-gpus=4 --config-file=./configs/OWOD/iOD/all_20_train.yaml

JosephKJ commented 3 years ago

And have you built detectron2 from this repo?

deepaksinghcv commented 3 years ago

I already have detectron2 built, so I did not build from this repo. A quick question: if I build from this repo, will my pre-built build be affected?

JosephKJ commented 3 years ago

@deepakksingh: That is the reason. You need to build detectron2 from this repo.

"If I build from this repo, will my pre-built build be affected?": Yes. Please use a separate Python env to sandbox your projects.
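
One way to confirm which detectron2 an environment will actually import is to look up where the package resolves from, without importing it. The following is a small sketch using only the standard library. The helper names package_dir and is_inside are hypothetical, and the checkout path is assumed to be passed as the first command-line argument.

```python
import importlib.util
import os
import sys
from typing import Optional


def package_dir(name: str) -> Optional[str]:
    """Return the directory a top-level package would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    return os.path.dirname(os.path.abspath(spec.origin))


def is_inside(path: str, root: str) -> bool:
    """True if `path` lies within the directory tree rooted at `root`."""
    path, root = os.path.realpath(path), os.path.realpath(root)
    return os.path.commonpath([path, root]) == root


if __name__ == "__main__":
    # Assumption: the OWOD checkout is given as the first argument;
    # the current directory is used as a fallback.
    repo_root = sys.argv[1] if len(sys.argv) > 1 else os.getcwd()
    location = package_dir("detectron2")
    if location is None:
        print("detectron2 is not importable in this environment")
    elif is_inside(location, repo_root):
        print("detectron2 resolves to the checkout:", location)
    else:
        print("detectron2 resolves elsewhere:", location)
```

It is best to save and run this script outside the top level of the checkout. Python puts the script's own directory first on the import path, so a script sitting next to the detectron2 folder would find that folder whether or not it had been built and installed.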

deepaksinghcv commented 3 years ago

Thank you, @JosephKJ. I will try that out.

deepaksinghcv commented 3 years ago

I created a new environment and built it with the following command (run from the OWOD directory, which already contains setup.py):

python -m pip install -e ./

I had to install some packages, and I also set the DETECTRON2_DATASETS path. But when I run the following command,

python tools/train_net.py --num-gpus=4 --config-file=./configs/OWOD/iOD/all_20_train.yaml

it is not able to use the detectron2 APIs to download the WEIGHTS and throws this:

AssertionError: Checkpoint detectron2://ImageNetPretrained/MSRA/R-50.pkl not found!

deepaksinghcv commented 3 years ago

I noticed that there is a Deprecation Warning:

** fvcore version of PathManager will be deprecated soon. **
** Please migrate to the version in iopath repo. **
https://github.com/facebookresearch/iopath 

Command Line Args: Namespace(config_file='./configs/OWOD/iOD/all_20_train.yaml', dist_url='tcp://127.0.0.1:50712', eval_only=False, machine_rank=0, num_gpus=4, num_machines=1, opts=[], resume=False)
** fvcore version of PathManager will be deprecated soon. **
** Please migrate to the version in iopath repo. **
https://github.com/facebookresearch/iopath

JosephKJ commented 3 years ago

This is the fix: https://github.com/Majiker/BalancedMetaSoftmax-InstanceSeg/issues/3#issuecomment-778738124

LoveIsAGame commented 3 years ago

@deepakksingh I ran into the same problem as you. Have you solved it? If so, could you please describe the solution in detail? Thanks!

deepaksinghcv commented 3 years ago

Yes, I was able to resolve the issue. I had to rebuild detectron2 as described in INSTALL.md. The reliability package was missing, and I had to downgrade fvcore as described in the following issue.

This is the fix: Majiker/BalancedMetaSoftmax-InstanceSeg#3 (comment)

LoveIsAGame commented 3 years ago

@deepakksingh And I solved the problem! Thank you very much!

Hrqingqing commented 3 years ago

@deepakksingh And I solved the problem! Thank you very much!

Hello, I don't understand what "re-build" means. INSTALL.md has both "Build Detectron2 from Source" and "Install Pre-Built Detectron2 (Linux only)". Which one should I follow? Looking forward to your reply.

shyam671 commented 3 years ago

I built the repo using the following steps:

Step 1: Download the repo and run: python -m pip install -e [name of the folder] (e.g. python -m pip install -e OWOD).
Step 2: Install the missing packages (in my case: shortuuid, reliability).

Hope this helps!!
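
To see which extra packages are still missing after the editable install, a quick check can be scripted with the standard library. This is a minimal sketch. The helper name missing_modules is hypothetical, and it assumes each package's import name matches the name mentioned in this thread (shortuuid, reliability, fvcore).

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumption: import names equal the package names mentioned in the thread.
    wanted = ["shortuuid", "reliability", "fvcore"]
    missing = missing_modules(wanted)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules are importable.")
```

This only reports whether a module can be found. It does not check versions, so it will not show whether fvcore still needs to be downgraded.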

haleqiu commented 3 years ago

Hi @ccblublu, @deepakksingh: There are many changes to files within the detectron2 folder. The whole repository should be used as is. It is not designed to work off-the-shelf with other detectron2 versions.

Further, can you share the exact command that you used?

@JosephKJ It sounds like the installation instructions are out of date. Has anyone updated them?

deepaksinghcv commented 3 years ago

@haleqiu I recently built it again. What issues are you facing?

Jiyang-Zheng commented 3 years ago

Hi, I am experiencing the following issue after following all the instructions above.

Traceback (most recent call last):
  File "/home/.conda/envs/owod/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
    fn(i, *args)
  File "/scratch1/BW/Projects/OWOD/detectron2/engine/launch.py", line 94, in _distributed_worker
    main_func(*args)
  File "/scratch1/BW/Projects/OWOD/tools/train_net.py", line 132, in main
    cfg = setup(args)
  File "/scratch1/BW/Projects/OWOD/tools/train_net.py", line 127, in setup
    default_setup(cfg, args)
  File "/scratch1/zhe030/BW/Projects/OWOD/detectron2/engine/defaults.py", line 132, in default_setup
    logger.info("Environment info:\n" + collect_env_info())
  File "/scratch1/BW/Projects/OWOD/detectron2/utils/collect_env.py", line 136, in collect_env_info
    msg = " - invalid!" if not os.path.isdir(CUDA_HOME) else ""
  File "/home/.conda/envs/owod/lib/python3.6/genericpath.py", line 42, in isdir
    st = os.stat(s)
TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType

deepaksinghcv commented 3 years ago

@Jiyang-Zheng, could you check whether CUDA_HOME is set? Looking at the traceback, it seems to be missing.

Jiyang-Zheng commented 3 years ago

Thanks @deepakksingh. It seems like this is the reason. However, I'm using a cloud server, and I don't think I can change any of these settings. Do you think there is any way to get around this issue?

deepaksinghcv commented 3 years ago

In detectron2, collect_env_info is called to collect environment information and dump it into the logs. You can try commenting out line 136 in collect_env.py, but the issue may come up again during execution. Are you sure your cloud server has the CUDA modules loaded and available? I do not have any experience with cloud servers. I would also suggest checking detectron2's official issue tracker for the same problem.
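
The TypeError in the traceback arises because os.path.isdir is handed None instead of a path. The following is a minimal sketch of a None-tolerant check. The helper name cuda_home_status is hypothetical and this is not the code in collect_env.py. It reads the CUDA_HOME environment variable for illustration, which may not be where the value in the traceback actually comes from.

```python
import os
from typing import Optional


def cuda_home_status(cuda_home: Optional[str]) -> str:
    """Describe a CUDA_HOME value for logging, tolerating a missing value."""
    if cuda_home is None:
        # os.path.isdir(None) would raise TypeError, so handle this case first.
        return "CUDA_HOME: not set"
    suffix = "" if os.path.isdir(cuda_home) else " - invalid!"
    return "CUDA_HOME: {}{}".format(cuda_home, suffix)


if __name__ == "__main__":
    # Assumption: the environment variable is used here for illustration only.
    print(cuda_home_status(os.environ.get("CUDA_HOME")))
```

A guard like this only keeps the logging step from crashing. If CUDA really is unavailable on the machine, later steps may still fail.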

Jiyang-Zheng commented 3 years ago

@deepakksingh Thanks, dude. I have now solved the issue. I appreciate your advice.

For anyone interested, https://github.com/facebookresearch/detectron2/issues/2254 provides the fix.

Cheers