Seeed-Studio / ModelAssistant

Seeed SenseCraft Model Assistant is an open-source project focused on embedded AI. 🔥🔥🔥
https://sensecraftma.seeed.cc/
Apache License 2.0

ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (2,) + inhomogeneous part. #208

Closed kyleaitgroup closed 5 months ago

kyleaitgroup commented 5 months ago

Describe the bug A clear and concise description of what the bug is.

Environment Environment you use when bug appears:

  1. Python version: 3.8
  2. PyTorch Version: 2.0.0+cu118
  3. MMCV Version: 2.0.1
  4. EdgeLab Version
  5. Code you run: python tools/train.py configs/fomo/fomo_person.py --cfg-options work_dir=work_dirs/fomo_300 num_classes=1 epochs=300 height=416 width=416 data_root=datasets/coco_person/
  6. The detailed error

Traceback (most recent call last):
  File "tools/train.py", line 226, in <module>
    main()
  File "tools/train.py", line 221, in main
    runner.train()
  File "/home/ubuntu/.local/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1777, in train
    model = self.train_loop.run()  # type: ignore
  File "/home/ubuntu/.local/lib/python3.8/site-packages/mmengine/runner/loops.py", line 96, in run
    self.run_epoch()
  File "/home/ubuntu/.local/lib/python3.8/site-packages/mmengine/runner/loops.py", line 112, in run_epoch
    self.run_iter(idx, data_batch)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/mmengine/runner/loops.py", line 128, in run_iter
    outputs = self.runner.model.train_step(
  File "/home/ubuntu/Documents/ModelAssistant/sscma/models/detectors/fomo.py", line 99, in train_step
    losses = self._run_forward(data, mode='loss')  # type: ignore
  File "/home/ubuntu/.local/lib/python3.8/site-packages/mmengine/model/base_model/base_model.py", line 361, in _run_forward
    results = self(data, mode=mode)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/ubuntu/Documents/ModelAssistant/sscma/models/detectors/fomo.py", line 70, in forward
    return self.loss(inputs, data_samples)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/mmdet/models/detectors/single_stage.py", line 78, in loss
    losses = self.bbox_head.loss(x, batch_data_samples)
  File "/home/ubuntu/Documents/ModelAssistant/sscma/models/heads/fomo_head.py", line 115, in loss
    loss = self.loss_by_feat(pred, batch_gt_instances, batch_img_metas, batch_gt_instances_ignore)
  File "/home/ubuntu/Documents/ModelAssistant/sscma/models/heads/fomo_head.py", line 150, in loss_by_feat
    loss, cls_loss, bg_loss, P, R, F1 = multi_apply(self.lossFunction, preds, target)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/mmdet/models/utils/misc.py", line 219, in multi_apply
    return tuple(map(list, zip(*map_results)))
  File "/home/ubuntu/Documents/ModelAssistant/sscma/models/heads/fomo_head.py", line 188, in lossFunction
    P, R, F1 = self.get_pricsion_recall_f1(preds, data)
  File "/home/ubuntu/Documents/ModelAssistant/sscma/models/heads/fomo_head.py", line 229, in get_pricsion_recall_f1
    site = np.sum([ti, po], axis=0)
  File "<__array_function__ internals>", line 200, in sum
  File "/home/ubuntu/anaconda3/envs/model2/lib/python3.8/site-packages/numpy/core/fromnumeric.py", line 2324, in sum
    return _wrapreduction(a, np.add, 'sum', axis, dtype, out, keepdims=keepdims,
  File "/home/ubuntu/anaconda3/envs/model2/lib/python3.8/site-packages/numpy/core/fromnumeric.py", line 86, in _wrapreduction
    return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (2,) + inhomogeneous part.

Additional context: I'm using the latest version of ModelAssistant.

After I changed site = np.sum([ti, po], axis=0) in ModelAssistant/sscma/models/heads/fomo_head.py to site = np.concatenate([ti, po], axis=0), training started, but I then got another error during training:

Traceback (most recent call last):
  File "tools/train.py", line 226, in <module>
    main()
  File "tools/train.py", line 221, in main
    runner.train()
  File "/home/ubuntu/.local/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1777, in train
    model = self.train_loop.run()  # type: ignore
  File "/home/ubuntu/.local/lib/python3.8/site-packages/mmengine/runner/loops.py", line 102, in run
    self.runner.val_loop.run()
  File "/home/ubuntu/.local/lib/python3.8/site-packages/mmengine/runner/loops.py", line 371, in run
    self.run_iter(idx, data_batch)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/mmengine/runner/loops.py", line 392, in run_iter
    self.evaluator.process(data_samples=outputs, data_batch=data_batch)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/mmengine/evaluator/evaluator.py", line 60, in process
    metric.process(data_batch, _data_samples)
  File "/home/ubuntu/Documents/ModelAssistant/sscma/evaluation/fomo_metric.py", line 74, in process
    tp, fp, fn = multi_apply(self.compute_ftp, preds, target)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/mmdet/models/utils/misc.py", line 219, in multi_apply
    return tuple(map(list, zip(*map_results)))
  File "/home/ubuntu/Documents/ModelAssistant/sscma/evaluation/fomo_metric.py", line 48, in compute_ftp
    confusion = confusion_matrix(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/sklearn/utils/_param_validation.py", line 214, in wrapper
    return func(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/sklearn/metrics/_classification.py", line 326, in confusion_matrix
    y_type, y_true, y_pred = _check_targets(y_true, y_pred)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/sklearn/metrics/_classification.py", line 84, in _check_targets
    check_consistent_length(y_true, y_pred)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/sklearn/utils/validation.py", line 407, in check_consistent_length
    raise ValueError(
ValueError: Found input variables with inconsistent numbers of samples: [144, 24]

Additional context: if I set tp = fp = fn = 0, it works. But when I convert the model to binary and deploy it on the ESP32-S3 Eye, I encounter the following errors:

Didn't find op for builtin opcode 'TRANSPOSE'
Failed to get registration from op code TRANSPOSE
AllocateTensors() failed

and

W (4927) cam_hal: Failed to get the frame on time!

ERROR A stack overflow in task app_camera has been detected.

Backtrace: 0x40375b5e:0x3fca8680 0x4037cc1d:0x3fca86a0 0x4037f64a:0x3fca86c0 0x4037e2bf:0x3fca8740 0x4037f758:0x3fca8760 0x4037f74e:0xa5a5a5a5 |<-CORRUPTED
0x40375b5e: panic_abort at /home/ubuntu/esp/v5.1.2/esp-idf/components/esp_system/panic.c:452
0x4037cc1d: esp_system_abort at /home/ubuntu/esp/v5.1.2/esp-idf/components/esp_system/port/esp_system_chip.c:84
0x4037f64a: vApplicationStackOverflowHook at /home/ubuntu/esp/v5.1.2/esp-idf/components/freertos/FreeRTOS-Kernel/portable/xtensa/port.c:581
0x4037e2bf: vTaskSwitchContext at /home/ubuntu/esp/v5.1.2/esp-idf/components/freertos/FreeRTOS-Kernel/tasks.c:3729
0x4037f758: _frxt_dispatch at /home/ubuntu/esp/v5.1.2/esp-idf/components/freertos/FreeRTOS-Kernel/portable/xtensa/portasm.S:450
0x4037f74e: _frxt_int_exit at /home/ubuntu/esp/v5.1.2/esp-idf/components/freertos/FreeRTOS-Kernel/portable/xtensa/portasm.S:245

LynnL4 commented 5 months ago

Hi, which deployment code are you using? It looks like the registration of the TRANSPOSE operator is missing: https://github.com/Seeed-Studio/SSCMA-Micro/blob/dd57e1cdd0317b7ae97838ae53fbd18f92acb020/porting/espressif/el_config_porting.h#L69 Also, thanks for trying it out. We will fix the training issue you mentioned as soon as possible.

kyleaitgroup commented 5 months ago

Hi, Thanks for the quick response!

I'm currently using https://github.com/Seeed-Studio/sscma-example-esp32

I've checked that the line #define CONFIG_EL_TFLITE_OP_TRANSPOSE does exist in the code, but the error persists.

LynnL4 commented 5 months ago

Are you using this example? I will check it: https://github.com/Seeed-Studio/sscma-example-esp32/blob/main/examples/fomo_detection_demo/main/app_main.cpp

kyleaitgroup commented 5 months ago

Yes, but I hadn't noticed that I'm not using the latest version of sscma-example-esp32. I'll try that first.

LynnL4 commented 5 months ago

Hi, if you're using an older version of the repository, the problem is perhaps here: https://github.com/Seeed-Studio/sscma-example-esp32/blob/d926450d530216340c46a0daf0f9435294abce27/components/modules/algorithm/algo_fomo.cpp#L223

kyleaitgroup commented 5 months ago

Thanks. I've tested the latest version, but unfortunately, I encountered a new error.

Failed to resize buffer. Requested: 2092448, available 1044760, missing: 1047688
[ASSERT] Failed assertion 'interpreter != nullptr'
E (5922) task_wdt: Task watchdog got triggered. The following tasks/users did not reset the watchdog in time:
E (5922) task_wdt: - IDLE (CPU 0)
E (5922) task_wdt: Tasks currently running:
E (5922) task_wdt: CPU 0: main
E (5922) task_wdt: CPU 1: IDLE
E (5922) task_wdt: Print CPU 0 (current core) backtrace

Could it be caused by setting the number of epochs too high? I set epochs to 300.

LynnL4 commented 5 months ago

I see that your configuration is height=416 width=416. This makes the model require a lot of memory at runtime; I suggest changing it to 192x192.
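As a rough back-of-the-envelope check of why input resolution matters here, the sketch below uses the figures from the "Failed to resize buffer" log above. It assumes an int8 RGB input and that the memory the interpreter requests scales roughly with pixel count; both are assumptions for illustration, not measurements of this model, and input_bytes is a hypothetical helper.

```python
# Figures copied from the "Failed to resize buffer" log in this thread.
REQUESTED = 2_092_448
AVAILABLE = 1_044_760


def input_bytes(height, width, channels=3, bytes_per_value=1):
    """Size of one input tensor in bytes (int8 RGB is an assumption)."""
    return height * width * channels * bytes_per_value


# Should match the "missing" value reported in the log.
shortfall = REQUESTED - AVAILABLE

# Pixel-count ratio between the 416x416 and 192x192 configurations.
ratio = input_bytes(416, 416) / input_bytes(192, 192)

# Assumption: the requested buffer shrinks in proportion to the pixel count.
estimated_192 = REQUESTED / ratio

print(shortfall, round(ratio, 2), estimated_192 < AVAILABLE)
```

Under these assumptions the 192x192 configuration would fit in the available memory, while the 416x416 one needs about twice what the board offers. The number of training epochs does not change tensor shapes, so it does not affect this figure.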

MILK-BIOS commented 5 months ago

Hi @kyleaitgroup, the problem in training is caused by different types of ti and po. We fixed it in the latest version. Thank you!
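For readers who hit the same ValueError, here is a minimal sketch of the general NumPy behaviour involved. The arrays are hypothetical stand-ins for ti and po (the real ones are built in fomo_head.py and are not shown in this thread), and describe is an illustrative helper, not part of ModelAssistant. It shows that np.sum([a, b], axis=0) needs both inputs to have the same shape, and that np.concatenate accepts mismatched row counts but computes something different.

```python
import numpy as np

# Hypothetical stand-ins for ti and po; shapes chosen only for illustration.
ti = np.array([[0, 1], [2, 3], [4, 5]])  # shape (3, 2)
po = np.array([[0, 1]])                  # shape (1, 2)


def describe(a, b):
    """Report whether the two arrays can be stacked and what concatenate yields."""
    a, b = np.asarray(a), np.asarray(b)
    return {
        # np.sum([a, b], axis=0) first builds one array from the list,
        # which requires a and b to have identical shapes.
        "stackable": a.shape == b.shape,
        # np.concatenate only needs the non-joined axes to match.
        "concat_shape": np.concatenate([a, b], axis=0).shape,
    }


info = describe(ti, po)

# With equal shapes, np.sum over the list is an element-wise addition.
elementwise = np.sum([ti, ti], axis=0)
```

Because concatenation joins rows instead of adding element-wise, swapping it in for np.sum silences the error but changes what site contains, so updating to the fixed version mentioned above is the safer route.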