mit-han-lab / tinyengine

[NeurIPS 2020] MCUNet: Tiny Deep Learning on IoT Devices; [NeurIPS 2021] MCUNetV2: Memory-Efficient Patch-based Inference for Tiny Deep Learning; [NeurIPS 2022] MCUNetV3: On-Device Training Under 256KB Memory
https://mcunet.mit.edu
MIT License

Trying to implement pose estimation through tinyengine #91

Open zjwfufu opened 1 year ago

zjwfufu commented 1 year ago

Hi, thank you for your great work!

I wanted to reach out because I've been facing some difficulties while attempting single-person pose estimation on the OPENMV4P. I have been following your person_detection_demo, which has been quite helpful.

I trained a tiny MoveNet on the COCO2017 dataset, keeping only its heatmap head. It works well enough for me in the tflite format, but when I run it through the tinyengine framework, the results are bad and differ from the tflite output.

I debugged with the same RGB picture (pixel values in -128~127), and it turns out that the outputs are very different.
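For reference, this is roughly how I compare the two int8 output dumps on the host (the array values here are made up for illustration; in practice one comes from the tflite interpreter and one is read back from the board):

```python
import numpy as np

def compare_int8_outputs(tflite_out, engine_out):
    """Compare two int8 output tensors element-wise.

    Returns the maximum absolute difference and the number of elements
    that differ by more than one LSB (off-by-one can be mere rounding).
    """
    a = np.asarray(tflite_out, dtype=np.int32)
    b = np.asarray(engine_out, dtype=np.int32)
    diff = np.abs(a - b)
    return int(diff.max()), int((diff > 1).sum())

# Made-up heatmap values: identical except for one element.
ref = np.array([[-10, 5], [120, -128]], dtype=np.int8)
mcu = np.array([[-10, 5], [90, -128]], dtype=np.int8)
max_err, num_bad = compare_int8_outputs(ref, mcu)
```

With a correct port I would expect `max_err` of at most 1 from rounding; what I actually see is far larger than that.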

The generated code `genModel.h` shows:

```c
#define NNoutput &buffer0[98304]; /* sram:189584, flash:29544 */
static signed char buffer[189584];
static signed char *buffer0 = &buffer[0];
static signed char *buffer1 = &buffer[180224];
static int16_t *sbuf = (int16_t *)&buffer[180224];
static int32_t *kbuf = (int32_t *)&buffer[188936];
const int SBuffer_size = 8712;
const int KBuffer_size = 648;
```

Could the output be large enough to cause memory overlap with the other regions? Since I have only one output to manage, I didn't register detectionUtils-like handlers in the code generator, so NNoutput is captured by `_findtheinferenceOutput`. I think that part is okay...
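To sanity-check the overlap question, I treat each region in `genModel.h` as a half-open byte range inside `buffer` and test the ranges for intersection. A minimal sketch, where the offsets come from the generated header but the output size is an assumption (the header doesn't state it):

```python
def overlaps(a, b):
    """True if two half-open [start, end) byte ranges intersect."""
    return a[0] < b[1] and b[0] < a[1]

# Offsets copied from the generated genModel.h above.
BUFFER_SIZE = 189584
sbuf = (180224, 180224 + 8712)   # SBuffer_size = 8712
kbuf = (188936, 188936 + 648)    # KBuffer_size = 648

# Hypothetical output size, e.g. a 48x64x17 int8 heatmap (assumed).
out_size = 48 * 64 * 17
nn_output = (98304, 98304 + out_size)

tail_ok = kbuf[1] == BUFFER_SIZE            # sbuf/kbuf tile the buffer tail exactly
output_clear = not overlaps(nn_output, sbuf) and not overlaps(nn_output, kbuf)
```

Note that `sbuf` appears to intentionally alias `buffer1` (both start at offset 180224), so some aliasing is by design; the question is only whether the NNoutput range reaches into the scratch regions while they are still live.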

I also tried setting fp_requantize to False, but it did not help. At this point, I'm unsure what could be causing the discrepancies, and I would greatly appreciate any guidance or suggestions that could help me move forward.
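As I understand it, fp_requantize only toggles between a floating-point and a fixed-point requantization path, which should agree to within one LSB per element, so it makes sense that flipping it can't fix a grossly wrong heatmap. A simplified sketch of the two paths (not the exact TinyEngine kernels; the multiplier, shift, and zero point below are made up):

```python
def requantize_float(acc, scale, zero_point):
    """Float path: scale the int32 accumulator with one float multiply."""
    v = int(round(acc * scale)) + zero_point
    return max(-128, min(127, v))

def rounding_rshift(x, n):
    """Arithmetic right shift by n with rounding (assumes n >= 1)."""
    return (x + (1 << (n - 1))) >> n

def requantize_fixed(acc, multiplier, shift, zero_point):
    """Fixed-point path: Q31 multiplier plus rounding right shift
    (simplified sketch; assumes shift <= 0)."""
    v = rounding_rshift(acc * multiplier, 31 - shift) + zero_point
    return max(-128, min(127, v))

# Made-up per-channel parameters; the real scale satisfies
# scale ~= multiplier * 2**shift / 2**31.
MULT, SHIFT, ZP = 1518500250, -1, -5
SCALE = MULT * 2.0**SHIFT / 2**31
```

Both paths approximate the same real-valued scale, so any discrepancy between the two modes is at most rounding noise, nothing like the completely different output I'm seeing.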

pose_tflite_model.zip