STMicroelectronics / stm32ai-modelzoo

AI Model Zoo for STM32 devices

Error - Object Detection Training on 'ssd_mobilenet_v2_fpnlite' #48

Closed yoppy-tjhin closed 6 days ago

yoppy-tjhin commented 6 days ago

Hello,

Previously, I was able to run the training script and generate a 'st_ssd_mobilenet_v1' model, as in #47. Next, I am trying to train the 'ssd_mobilenet_v2_fpnlite' model. The only change from #47 is the model_type in the config file: my_config.txt

I got this error:

(stm32ai) G:\Kantor\Research\SLIFA\STMicro\ST_Model_Training\stm32ai-modelzoo\object_detection\src>python stm32ai_main.py --config-name my_user_config.yaml
[INFO] : Setting upper limit of usable GPU memory to 10GBytes.
[INFO] : Running training operation mode
[INFO] : The random seed for this simulation is 127
[INFO] : To ensure the model is not overfitting or underfitting, it's crucial to evaluate its performance on a validation dataset during training. As no separate validation dataset was provided, we will split the dataset into an 80.0% training set and a 20.0% validation set.
Loading 80.0% of the provided dataset as the Training dataset ...
0%| | 0/14 [00:00<?, ?it/s]
images_path:../pascal_dataset/train/2007_000504_jpg.rf.d0a67292c8850ecf8305402f95c7daa3.jpg
images_path:../pascal_dataset/train/2007_000392_jpg.rf.2371d0fde59eeb9b6a8d3f1e381ea3db.jpg
images_path:../pascal_dataset/train/2007_000480_jpg.rf.adf26b6f20506b2b114ecb3c849101a3.jpg
images_path:../pascal_dataset/train/2007_000346_jpg.rf.3b12409d64d6c12cc88542df962c9e26.jpg
images_path:../pascal_dataset/train/2007_000464_jpg.rf.8295fa4cbcc2782802cac7dfc82b8233.jpg
images_path:../pascal_dataset/train/2007_000243_jpg.rf.637dab3bc3adf92ba84fcbe8f37805ee.jpg
images_path:../pascal_dataset/train/2007_000363_jpg.rf.48cb82e0cbe27efc69ebe1b4b3c46b4d.jpg
images_path:../pascal_dataset/train/2007_000515_jpg.rf.d3db68425d96e88b8f8811585b0715f6.jpg
images_path:../pascal_dataset/train/2007_000256_jpg.rf.02cb96ee20db070c880b4b909ccf0874.jpg
images_path:../pascal_dataset/train/2007_000333_jpg.rf.10dc496466e7bc62531098819bcb5650.jpg
images_path:../pascal_dataset/train/2007_000364_jpg.rf.935cacdb0f22aed817049c51b5da1e6b.jpg
images_path:../pascal_dataset/train/2007_000452_jpg.rf.743b20d564817ed050001950de651abd.jpg
86%| | 12/14 [00:00<00:00, 109.70it/s]
images_path:../pascal_dataset/train/2007_000250_jpg.rf.f0fe3dea5cb311aa77a48bdb159ceba3.jpg
images_path:../pascal_dataset/train/2007_000332_jpg.rf.fcb9e3219402ff64f7784a12ec575f97.jpg
100%| | 14/14 [00:00<00:00, 112.00it/s]
Loading 20.0% of the provided dataset as the Validation dataset ...
0%| | 0/3 [00:00<?, ?it/s]
images_path:../pascal_dataset/train/2007_000187_jpg.rf.2ab27aab673e7d40a76a53440f177c98.jpg
images_path:../pascal_dataset/train/2007_000241_jpg.rf.e14a4e70f555b90ac0e90134b6021bf2.jpg
images_path:../pascal_dataset/train/2007_000323_jpg.rf.2a52cc87871ca584b7495c7c0c92730c.jpg
100%| | 3/3 [00:00<00:00, 95.99it/s]
Loading Test dataset...
0%| | 0/7 [00:00<?, ?it/s]
images_path:../pascal_dataset/test/2007_000027_jpg.rf.f198388eb768f32a89be553841611d3a.jpg
images_path:../pascal_dataset/test/2007_000032_jpg.rf.453cf71521fb73718369a7f07a41433c.jpg
images_path:../pascal_dataset/test/2007_000033_jpg.rf.83ab0d65cbcc0be92b649082c8a21ffb.jpg
images_path:../pascal_dataset/test/2007_000061_jpg.rf.46ef80849da87d8113c0330be6ea5beb.jpg
images_path:../pascal_dataset/test/2007_000068_jpg.rf.59f7be9df94d9d96fbe68b8d25df36f2.jpg
images_path:../pascal_dataset/test/2007_000121_jpg.rf.807519c5d839ae8e10504ffb0c132f39.jpg
images_path:../pascal_dataset/test/2007_000175_jpg.rf.89dd3b1705cf32b3b67f325b8b782a36.jpg
100%| | 7/7 [00:00<00:00, 112.00it/s]
Dataset stats:
classes: 20
training samples: 14
validation samples: 3
Test samples: 7
Error executing job with overrides: []
Traceback (most recent call last):
  File "G:\Kantor\Research\SLIFA\STMicro\ST_Model_Training\stm32ai-modelzoo\object_detection\src\stm32ai_main.py", line 254, in <module>
    main()
  File "C:\Users\user\anaconda3\envs\stm32ai\lib\site-packages\hydra\main.py", line 94, in decorated_main
    _run_hydra(
  File "C:\Users\user\anaconda3\envs\stm32ai\lib\site-packages\hydra\_internal\utils.py", line 394, in _run_hydra
    _run_app(
  File "C:\Users\user\anaconda3\envs\stm32ai\lib\site-packages\hydra\_internal\utils.py", line 457, in _run_app
    run_and_report(
  File "C:\Users\user\anaconda3\envs\stm32ai\lib\site-packages\hydra\_internal\utils.py", line 223, in run_and_report
    raise ex
  File "C:\Users\user\anaconda3\envs\stm32ai\lib\site-packages\hydra\_internal\utils.py", line 220, in run_and_report
    return func()
  File "C:\Users\user\anaconda3\envs\stm32ai\lib\site-packages\hydra\_internal\utils.py", line 458, in <lambda>
    lambda: hydra.run(
  File "C:\Users\user\anaconda3\envs\stm32ai\lib\site-packages\hydra\_internal\hydra.py", line 132, in run
    _ = ret.return_value
  File "C:\Users\user\anaconda3\envs\stm32ai\lib\site-packages\hydra\core\utils.py", line 260, in return_value
    raise self._return_value
  File "C:\Users\user\anaconda3\envs\stm32ai\lib\site-packages\hydra\core\utils.py", line 186, in run_job
    ret.return_value = task_function(task_cfg)
  File "G:\Kantor\Research\SLIFA\STMicro\ST_Model_Training\stm32ai-modelzoo\object_detection\src\stm32ai_main.py", line 234, in main
    process_mode(cfg,
  File "G:\Kantor\Research\SLIFA\STMicro\ST_Model_Training\stm32ai-modelzoo\object_detection\src\stm32ai_main.py", line 98, in process_mode
    train_glob(cfg, train_ds=train_ds, valid_ds=valid_ds, test_ds=test_ds, train_gen=train_gen, valid_gen=valid_gen)
  File "G:\Kantor\Research\SLIFA\STMicro\ST_Model_Training\stm32ai-modelzoo\object_detection\src./training\train.py", line 93, in train
    inference_model, trainingmodel, = get_model(cfg, class_names=cfg.dataset.class_names)
  File "G:\Kantor\Research\SLIFA\STMicro\ST_Model_Training\stm32ai-modelzoo\object_detection\src./utils\models_mgt.py", line 51, in get_model
    return ssd_mobilenet_v2_fpnlite(input_shape=cfg.training.model.input_shape, class_names=class_names,
  File "G:\Kantor\Research\SLIFA\STMicro\ST_Model_Training\stm32ai-modelzoo\object_detection\src./models\ssd_mobilenet_v2_fpnlite.py", line 289, in ssd_mobilenet_v2_fpnlite
    anchors_i, cls_preds_i, bbox_preds_i = fmap_forward(fmap_channels, version, layers_list[i], img_width, img_height, sizes[i], ratios[i],clip_boxes=clip_boxes, normalize=normalize, n_classes=n_classes,kernel=(3,3), l2_reg=l2_reg, bn=bn_pred, dw=True)
  File "G:\Kantor\Research\SLIFA\STMicro\ST_Model_Training\stm32ai-modelzoo\object_detection\src./models\ssd_mobilenet_v2_fpnlite.py", line 118, in fmap_forward
    anchorss = tf.tile(tf.expand_dims(anchors,0),(tf.shape(bbox_preds)[0],1,1,1,1))
  File "C:\Users\user\anaconda3\envs\stm32ai\lib\site-packages\tensorflow\python\util\traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "C:\Users\user\anaconda3\envs\stm32ai\lib\site-packages\keras\layers\core\tf_op_layer.py", line 107, in handle
    return TFOpLambda(op)(*args, **kwargs)

.... ......

    outputs = self.call(cast_inputs, *args, **kwargs)
  File "C:\Users\user\anaconda3\envs\stm32ai\lib\site-packages\keras\layers\core\tf_op_layer.py", line 226, in _call_wrapper
    return self._call_wrapper(*args, **kwargs)
  File "C:\Users\user\anaconda3\envs\stm32ai\lib\site-packages\keras\layers\core\tf_op_layer.py", line 261, in _call_wrapper
    result = self.function(*args, **kwargs)
  File "C:\Users\user\anaconda3\envs\stm32ai\lib\site-packages\keras\engine\keras_tensor.py", line 255, in __array__
    f'You are passing {self}, an intermediate Keras symbolic input/output, '
  File "C:\Users\user\anaconda3\envs\stm32ai\lib\site-packages\keras\engine\keras_tensor.py", line 297, in __str__
    return 'KerasTensor(type_spec=%s%s%s%s)' % (
RecursionError: maximum recursion depth exceeded

yoppy-tjhin commented 6 days ago

Sorry, I had unintentionally modified ssd_mobilenet_v2_fpnlite.py. It is okay now. I am waiting for the training, quantizing, and nb file conversion to finish.
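
For anyone who hits a similar traceback, it may be worth ruling out accidental local edits first. Below is a minimal sketch, assuming the model zoo was obtained with `git clone` and that `git` is on the PATH. The helper names `parse_porcelain` and `locally_changed` are hypothetical and are not part of stm32ai-modelzoo.

```python
import subprocess
from typing import List


def parse_porcelain(output: str) -> List[str]:
    """Return the paths listed in `git status --porcelain` output.

    Each line is two status characters, a space, then the path.
    Renames and quoted paths are returned as git printed them.
    """
    paths = []
    for line in output.splitlines():
        if len(line) > 3:
            paths.append(line[3:])
    return paths


def locally_changed(repo_dir: str, subdir: str = ".") -> List[str]:
    """List files under `subdir` that git reports as modified, added,
    deleted, or untracked relative to the last commit."""
    result = subprocess.run(
        ["git", "status", "--porcelain", "--", subdir],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_porcelain(result.stdout)


if __name__ == "__main__":
    # Run from the root of the cloned repository; the subdirectory is
    # the one that appears in the traceback above.
    for path in locally_changed(".", "object_detection/src"):
        print(path)
```

If the file named in the traceback shows up in that list, `git diff -- <path>` shows what changed and `git checkout -- <path>` restores the committed version.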