KeyError for custom data training from scratch

Hi, thank you for your good work. I am trying to use your trident framework on my own dataset. I followed your "setup script" to install simpledet and modified the configuration file "tridentnet_r101v2c4_c5_2x.py": I changed the number of classes, the number of GPUs, and the dataset. Then I started training by running: $ python3 detection_train.py --config config/tridentnet_r101v2c4_c5_2x.py But I get the following error:

    File "detection_train.py", line 139, in train_net
      sym, arg_params, aux_params = merge_bn(sym, arg_params, aux_params)
    File "/home/simpledet/utils/graph_optimize.py", line 77, in merge_bn
      gamma = mx.sym.var(node_name + "_gamma", shape=args[node_name + "_gamma"].shape)
    KeyError: 'bn0_gamma'

However, after reading https://github.com/TuSimple/simpledet/issues/186 and commenting out the same two lines as in Issue 186, I can start training.

But I do not understand why I need to do that. I followed the setup script, so I believe I installed the newest version of simpledet. Also, are there any side effects if I comment out these two lines during training? Thank you again!

The checkpoint was trained with an early version of simpledet, which did not employ any computation graph optimization.

Thank you for your reply. In "tridentnet_r101v2c4_c5_2x.py" I set "from_scratch = True", so I do not load the trained checkpoint (i.e., I am not fine-tuning from an existing checkpoint). Is there any other reason for this KeyError? Thanks.
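
For readers following along, the lookup in the traceback can be reproduced in isolation. This is only a minimal sketch: it assumes the parameter dictionary passed to merge_bn holds no "bn0_gamma" entry when no checkpoint is loaded, which the thread does not confirm. The helper names `gamma_shape` and `safe_gamma_shape` are hypothetical and are not part of simpledet.

```python
import numpy as np

def gamma_shape(args, node_name):
    # Same access pattern as the traceback line:
    # args[node_name + "_gamma"].shape
    return args[node_name + "_gamma"].shape

def safe_gamma_shape(args, node_name):
    # Hypothetical guard: return None instead of raising when the
    # parameter is absent from the dictionary.
    key = node_name + "_gamma"
    return args[key].shape if key in args else None

# Stands in for parameters loaded from a checkpoint.
loaded_params = {"bn0_gamma": np.ones(64, dtype=np.float32)}
# Assumption: nothing has been loaded when from_scratch = True.
scratch_params = {}

print(gamma_shape(loaded_params, "bn0"))
print(safe_gamma_shape(scratch_params, "bn0"))

try:
    gamma_shape(scratch_params, "bn0")
except KeyError as err:
    print("KeyError:", err)
```

Under that assumption, the unguarded lookup raises the same kind of KeyError as in the report, while the guarded version simply reports that no parameter is available.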

I see. merge_bn is currently designed for the fixbn setting only. BN folding does not make sense if you are training from scratch. We will update the train script later.
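
To make the point about BN folding concrete, here is a small NumPy sketch of the general idea. It is not simpledet's merge_bn implementation: it uses a plain linear layer in place of a convolution, and the names `fold_bn` and `linear_bn` are hypothetical. With fixed gamma, beta, mean, and variance, the normalization collapses into a rescaled weight and a shifted bias.

```python
import numpy as np

def linear_bn(x, weight, bias, gamma, beta, mean, var, eps=1e-5):
    # Linear layer followed by batch norm using fixed statistics.
    y = x @ weight.T + bias
    return gamma * (y - mean) / np.sqrt(var + eps) + beta

def fold_bn(weight, bias, gamma, beta, mean, var, eps=1e-5):
    # Fold the fixed batch norm into the preceding layer's parameters.
    scale = gamma / np.sqrt(var + eps)
    folded_weight = weight * scale[:, None]   # scale each output row
    folded_bias = (bias - mean) * scale + beta
    return folded_weight, folded_bias

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))
weight = rng.normal(size=(3, 4))
bias = rng.normal(size=3)
gamma = rng.normal(size=3)
beta = rng.normal(size=3)
mean = rng.normal(size=3)
var = rng.uniform(0.5, 2.0, size=3)

reference = linear_bn(x, weight, bias, gamma, beta, mean, var)
folded_weight, folded_bias = fold_bn(weight, bias, gamma, beta, mean, var)
folded = x @ folded_weight.T + folded_bias

print(np.allclose(reference, folded))
```

The equivalence holds only because the four BN quantities are constants. When training from scratch, the batch statistics and the learned gamma and beta change at every step, so there is no fixed set of values to fold.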

Thank you, Yuntao!

tusen-ai / simpledet

KeyError for custom data training from scratch #235