CanPeng123 / Faster-ILOD


KeyError: 'Non-existent config key: MODEL.ROI_BOX_HEAD.NAME_OLD_CLASSES' #2

Closed deepaksinghcv closed 3 years ago

deepaksinghcv commented 3 years ago

I installed it as per the instructions provided in INSTALL.md. When I execute the following:

python tools/train_first_step.py --config-file=./configs/e2e_faster_rcnn_R_50_C4_1x_Source_model.yaml

I get the KeyError shown in the issue title. My yacs version is the latest, 0.1.8. Am I supposed to downgrade it? If so, kindly specify the version.

deepaksinghcv commented 3 years ago

I tried downgrading from 0.1.7 to 0.1.4 by rebuilding maskrcnn-benchmark. The issue still persists. Any help will be much appreciated.

CanPeng123 commented 3 years ago

Hi

It seems the yacs version is not the problem, since your error says it cannot find "MODEL.ROI_BOX_HEAD.NAME_OLD_CLASSES".

Maybe you can try to check the Faster-ILOD/maskrcnn_benchmark/config/defaults.py file. All the default config settings are written there. If you are using the original maskrcnn_benchmark config file, please change it to the one I uploaded.
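To make the failure mode concrete, here is a minimal standard-library sketch of the kind of check that produces this error: every key in the YAML file must already exist in the defaults it is merged into. It is a simplified imitation, not the actual yacs code, and the two dictionaries are hypothetical stand-ins for defaults.py and a config file (the values are placeholders).

```python
def missing_keys(overrides, defaults, prefix=""):
    """Return dotted keys present in `overrides` but absent from `defaults`."""
    missing = []
    for key, value in overrides.items():
        full_key = f"{prefix}.{key}" if prefix else key
        if key not in defaults:
            missing.append(full_key)
        elif isinstance(value, dict) and isinstance(defaults[key], dict):
            # Recurse into nested sections such as MODEL -> RPN.
            missing.extend(missing_keys(value, defaults[key], full_key))
    return missing


# Hypothetical stand-in for a defaults.py that lacks the Faster-ILOD keys.
stock_defaults = {
    "MODEL": {
        "RPN": {"USE_FPN": False},
        "ROI_BOX_HEAD": {"NUM_CLASSES": 81},
    }
}

# Hypothetical stand-in for a Faster-ILOD YAML config (placeholder values).
yaml_overrides = {
    "MODEL": {
        "RPN": {"EXTERNAL_PROPOSAL": False},
        "ROI_BOX_HEAD": {"NUM_CLASSES": 21, "NAME_OLD_CLASSES": []},
    }
}

print(missing_keys(yaml_overrides, stock_defaults))
```

Unlike the real merge, which stops at the first unknown key, this sketch collects all of them, which can help tell at a glance whether the defaults being loaded are the stock maskrcnn_benchmark ones.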

In addition, to run the first step of the incremental learning (normal training), please try:

python tools/train_first_step.py --config-file=./configs/e2e_faster_rcnn_R_50_C4_1x.yaml

and modify the following settings in the config file: NUM_CLASSES, NAME_OLD_CLASSES, NAME_NEW_CLASSES, NAME_EXCLUDED_CLASSES.
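A small sanity check for the three class lists might look like the sketch below. It assumes, and this is an inference from the log lines later in this thread rather than something confirmed by the repository, that on Pascal VOC the old, new, and excluded lists should be disjoint and together cover all 20 classes. The function name `check_class_split` is hypothetical.

```python
# The 20 Pascal VOC object classes, in alphabetical order.
VOC_CLASSES = [
    "aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat",
    "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person",
    "pottedplant", "sheep", "sofa", "train", "tvmonitor",
]


def check_class_split(old, new, excluded, all_classes=VOC_CLASSES):
    """Return a list of problems found in the split (empty if none)."""
    groups = {
        "NAME_OLD_CLASSES": old,
        "NAME_NEW_CLASSES": new,
        "NAME_EXCLUDED_CLASSES": excluded,
    }
    problems = []
    seen = {}  # class name -> the list it was first found in
    for group, names in groups.items():
        for name in names:
            if name not in all_classes:
                problems.append(f"{group}: unknown class {name!r}")
            elif name in seen:
                problems.append(f"{name!r} appears in both {seen[name]} and {group}")
            else:
                seen[name] = group
    for name in all_classes:
        if name not in seen:
            problems.append(f"{name!r} is not assigned to any list")
    return problems


# Assumed first-step split: train on the first 15 classes, exclude the last 5.
print(check_class_split(old=[], new=VOC_CLASSES[:15], excluded=VOC_CLASSES[15:]))
```

An empty list means the split is consistent under the stated assumption; NUM_CLASSES is deliberately not checked here because its exact convention in this repository is not stated in the thread.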

Hope this could help you.

deepaksinghcv commented 3 years ago

Hello @CanPeng123, thank you for the quick response. I checked Faster-ILOD/maskrcnn_benchmark/config/defaults.py to verify that the hierarchy, default key values, and types match. They match the config files provided in configs/. I am not using the original maskrcnn_benchmark config file; I am only using the configs provided in this repo under configs/, and I have modified the dataset path to the appropriate location.

I noticed that you updated the file configs/e2e_faster_rcnn_R_50_C4_1x.yaml. I executed the following command:

python tools/train_first_step.py --config-file ./configs/e2e_faster_rcnn_R_50_C4_1x.yaml

It throws the following error:

Traceback (most recent call last):
  File "tools/train_first_step.py", line 231, in <module>
    main()
  File "tools/train_first_step.py", line 199, in main
    cfg.merge_from_file(args.config_file)
  File "/home/dksingh/anaconda3/envs/maskrcnn_benchmark/lib/python3.7/site-packages/yacs/config.py", line 213, in merge_from_file
    self.merge_from_other_cfg(cfg)
  File "/home/dksingh/anaconda3/envs/maskrcnn_benchmark/lib/python3.7/site-packages/yacs/config.py", line 217, in merge_from_other_cfg
    _merge_a_into_b(cfg_other, self, self, [])
  File "/home/dksingh/anaconda3/envs/maskrcnn_benchmark/lib/python3.7/site-packages/yacs/config.py", line 478, in _merge_a_into_b
    _merge_a_into_b(v, b[k], root, key_list + [k])
  File "/home/dksingh/anaconda3/envs/maskrcnn_benchmark/lib/python3.7/site-packages/yacs/config.py", line 478, in _merge_a_into_b
    _merge_a_into_b(v, b[k], root, key_list + [k])
  File "/home/dksingh/anaconda3/envs/maskrcnn_benchmark/lib/python3.7/site-packages/yacs/config.py", line 491, in _merge_a_into_b
    raise KeyError("Non-existent config key: {}".format(full_key))
KeyError: 'Non-existent config key: MODEL.RPN.EXTERNAL_PROPOSAL'

I think there have been package updates. Could you kindly share your environment's requirements file so that I can re-install it? I actually followed the steps from INSTALL.md 6 times.

CanPeng123 commented 3 years ago

Hi

The environment I am using is:

CUDA: 10.1.243
gcc: 7.5.0
python: 3.6.10
yacs: 0.1.7

deepaksinghcv commented 3 years ago

What about pytorch, pytorch-nightly, and torchvision? When I install as per the instructions, there are some CUDA errors, which are resolved by upgrading to torchvision==0.4.0. But when I try to install gcc==7.5.0, it is incompatible with torchvision.

CanPeng123 commented 3 years ago

torch: 1.3.1
torchvision: 0.2.2
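When comparing environments like this, it can help to print the installed versions from inside the interpreter that actually runs the training script. The following is a minimal sketch; the helper name `installed_version` is hypothetical, and the package list is simply the set of packages mentioned in this thread.

```python
def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
    except ImportError:
        # Fallback for older interpreters such as the Python 3.6 used in this thread.
        import pkg_resources
        try:
            return pkg_resources.get_distribution(package).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Package names taken from the versions discussed in this thread.
for name in ("torch", "torchvision", "yacs", "fvcore"):
    print(f"{name}: {installed_version(name)}")
```

A value of None means the package is not installed in that interpreter, which is itself a useful hint when several conda environments are in play.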

deepaksinghcv commented 3 years ago

I created a fresh environment with said packages:

gcc: 7.5.0
yacs: 0.1.7
torch: 1.3.1
torchvision: 0.2.2
python: 3.6.10
cuda: 10.1

The issue is still present.

[Two screenshots attached: "WhatsApp Image 2021-03-26 at 15 03 49" and "WhatsApp Image 2021-03-26 at 15 00 17"]

CanPeng123 commented 3 years ago

It seems your path is wrong. Your search path shows packages/yacs/config.py, but the config file should be at Faster-ILOD/maskrcnn_benchmark/config/defaults.py.

Could you please double-check your code path?
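One way to double-check the code path is to ask Python where it would import each package from. This is a minimal sketch using only the standard library; the helper name `module_origin` is hypothetical. If maskrcnn_benchmark resolves to a directory outside the Faster-ILOD checkout (for example, an earlier build of the original repository), the defaults.py being loaded would be that other copy, which is one possible explanation for the missing keys.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


# Run this with the same interpreter that runs tools/train_first_step.py.
for name in ("maskrcnn_benchmark", "yacs"):
    print(name, "->", module_origin(name))
```

A result of None means the module is not importable from that interpreter at all.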

CanPeng123 commented 3 years ago

Maybe you could try to run the original maskrcnn code for the first learning step (normal training) to double-check your environment setting. After that, replace the files inside the maskrcnn_benchmark and configs folders with my code and use train_incremental.py to run the subsequent incremental steps.

deepaksinghcv commented 3 years ago

Thanks, I will try them and get back.

deepaksinghcv commented 3 years ago

I tried training using the original maskrcnn code. I did not face any issue with parsing the config file. Could you clarify as to what should be replaced?

CanPeng123 commented 3 years ago

Hi,

I cannot remember exactly which files I have modified. Sorry for the inconvenience. Could you please try to replace the files in the following folders:

Faster-ILOD/maskrcnn_benchmark/config
Faster-ILOD/maskrcnn_benchmark/data
Faster-ILOD/maskrcnn_benchmark/distillation
Faster-ILOD/maskrcnn_benchmark/modeling
Faster-ILOD/maskrcnn_benchmark/solver

deepaksinghcv commented 3 years ago

I somehow got it to train by moving and modifying some files. In the logs there are no metrics printed during inference. So I tried to perform inference using the following command:

python -m torch.distributed.launch --nproc_per_node=$NGPUS tools/test_net.py --config-file=./configs/e2e_faster_rcnn_R_50_C4_1x.yaml MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN 4000

and yet I don't see any metrics in the log. There are just lines like:

...
voc.py | incremental train | object category belongs to exclude categoires: tvmonitor
voc.py | incremental train | object category belongs to exclude categoires: tvmonitor
voc.py | incremental train | object category belongs to exclude categoires: sofa
voc.py | incremental train | object category belongs to exclude categoires: tvmonitor
voc.py | incremental train | object category belongs to exclude categoires: sofa
voc.py | incremental train | object category belongs to exclude categoires: sofa
voc.py | incremental train | object category belongs to exclude categoires: pottedplant
voc.py | incremental train | object category belongs to exclude categoires: pottedplant
voc.py | incremental train | object category belongs to exclude categoires: pottedplant
voc.py | incremental train | object category belongs to exclude categoires: pottedplant
voc.py | incremental train | object category belongs to exclude categoires: sofa
voc.py | incremental train | object category belongs to exclude categoires: train
voc.py | incremental train | object category belongs to exclude categoires: train
...

Could you kindly share the procedure to evaluate the model or check for metrics?

CanPeng123 commented 3 years ago

Hi,

I use test_net.py to evaluate the model. After the whole test dataset is evaluated, the mAP results are printed, the same as in the original maskrcnn_benchmark. I did not modify the code for model performance evaluation on the VOC dataset. If the printed notes bother you, you can comment them out in the maskrcnn_benchmark/data/datasets/voc.py file.

Hope this could help you.
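As an alternative to editing voc.py, the noisy lines can be filtered out of a saved log so that the mAP output is easier to spot. This is a minimal sketch; the function name `drop_noisy_lines` is hypothetical, and the prefix is copied from the log lines quoted above.

```python
NOISY_PREFIX = "voc.py | incremental train |"


def drop_noisy_lines(lines, prefix=NOISY_PREFIX):
    """Return only the lines that do not start with the noisy prefix."""
    return [line for line in lines if not line.startswith(prefix)]


# Example input: one noisy line from this thread plus a placeholder line.
example_log = [
    "voc.py | incremental train | object category belongs to exclude categoires: sofa",
    "some other log line",
]
print(drop_noisy_lines(example_log))
```

The same function can be applied to the lines of a log file read with open(), leaving the repository code untouched.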

onepiece010938 commented 2 years ago

@deepakksingh I also faced the same problem during the training phase:

  File "/miniconda/envs/py36/lib/python3.6/site-packages/yacs/config.py", line 478, in _merge_a_into_b
    _merge_a_into_b(v, b[k], root, key_list + [k])
  File "/miniconda/envs/py36/lib/python3.6/site-packages/yacs/config.py", line 491, in _merge_a_into_b
    raise KeyError("Non-existent config key: {}".format(full_key))
KeyError: 'Non-existent config key: MODEL.RPN.EXTERNAL_PROPOSAL'

Could you kindly share how you solved it?

deepaksinghcv commented 2 years ago

Hey @onepiece010938, there was some conflict between the yacs and fvcore packages for me. I used this: https://github.com/Majiker/BalancedMetaSoftmax-InstanceSeg/issues/3#issuecomment-778738124

onepiece010938 commented 2 years ago

Hi @deepakksingh,
Thanks for the quick response. After running pip install fvcore==0.1.1.dev200512, I still face the same problem. Have you adjusted or moved any other folders?

root@d97721de4d1b:~/Faster-ILOD# cd /root/Faster-ILOD ; /usr/bin/env /miniconda/envs/py36/bin/python /root/.vscode-server/extensions/ms-python.python-2021.9.1246542782/pythonFiles/lib/python/debugpy/launcher 37235 -- /root/Faster-ILOD/tools/train_first_step.py
Traceback (most recent call last):
  File "/root/Faster-ILOD/tools/train_first_step.py", line 236, in <module>
    main()
  File "/root/Faster-ILOD/tools/train_first_step.py", line 204, in main
    cfg.merge_from_file(args.config_file)
  File "/miniconda/envs/py36/lib/python3.6/site-packages/yacs/config.py", line 213, in merge_from_file
    self.merge_from_other_cfg(cfg)
  File "/miniconda/envs/py36/lib/python3.6/site-packages/yacs/config.py", line 217, in merge_from_other_cfg
    _merge_a_into_b(cfg_other, self, self, [])
  File "/miniconda/envs/py36/lib/python3.6/site-packages/yacs/config.py", line 478, in _merge_a_into_b
    _merge_a_into_b(v, b[k], root, key_list + [k])
  File "/miniconda/envs/py36/lib/python3.6/site-packages/yacs/config.py", line 478, in _merge_a_into_b
    _merge_a_into_b(v, b[k], root, key_list + [k])
  File "/miniconda/envs/py36/lib/python3.6/site-packages/yacs/config.py", line 491, in _merge_a_into_b
    raise KeyError("Non-existent config key: {}".format(full_key))
KeyError: 'Non-existent config key: MODEL.RPN.EXTERNAL_PROPOSAL'

onepiece010938 commented 2 years ago

I solved it, thanks a lot. I used this: https://github.com/facebookresearch/vilbert-multi-task/issues/25#issuecomment-627877265

$ git clone https://gitlab.com/vedanuj/vqa-maskrcnn-benchmark.git
$ cd vqa-maskrcnn-benchmark/
$ python setup.py build develop

After rebuilding "vqa-maskrcnn-benchmark", I replaced some missing libraries and folders with the ones from the "maskrcnn-benchmark" provided by @CanPeng123.