HimaxWiseEyePlus / himax_tflm

Apache License 2.0

About the input image #8

Closed gitE0Z9 closed 3 years ago

gitE0Z9 commented 3 years ago

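Hello, is the image captured by hx_drv_sensor_capture the raw sensor values? Does hx_drv_image_rescale only resize the raw values, or does it also process color? If we need RGB, how can we handle that? Do we replace raw_address with jpeg_address? That doesn't seem to affect color, though.

Thanks!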

CKHSUinHimax commented 3 years ago

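Is the image captured by hx_drv_sensor_capture the raw sensor values? ==> The camera on this board is a mono camera.

Does hx_drv_image_rescale only resize the raw values, or does it also process color? ==> It only resizes the mono image.

If we need RGB, how can we handle that? ==> Only a mono camera is provided for now.

Do we replace raw_address with jpeg_address? That doesn't seem to affect color. ==> There is no color camera now, so all images are mono, and the JPEG compression is applied to mono images.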

gitE0Z9 commented 3 years ago

Thanks for your reply! It is somewhat disappointing, though, since most models are pretrained on RGB data.

CKHSUinHimax commented 3 years ago

May I know what you want to do with this EVB, and what kind of model you want to deploy on it?

gitE0Z9 commented 3 years ago

We are going to deploy shufflenetv2 on this EVB and found that the predictions made on the board are quite different from what tflite predicted on a PC.

Although the preprocessing has already been added in main_functions.cc (input normalized by 255 and quantized with scale and zero point), there is still no improvement at all, so I wonder whether the image array the camera reads in is different from what opencv or PIL produces.

Would you kindly share how we should preprocess the image data so that we can mimic the camera data? I observed that the order of grayscale conversion, resizing, and normalization can affect the accuracy, which I had never considered before.
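For reference, the preprocessing described above (normalize by 255, then quantize with scale and zero point) can be sketched as follows. This is only a minimal illustration, not the code in main_functions.cc: the function name `quantize_pixel` and the example values `scale = 1/255` and `zero_point = -128` are assumptions, and the real values must be read from the model's input tensor.

```python
def quantize_pixel(pixel, scale, zero_point):
    """Map a uint8 pixel (0..255) to an int8 quantized input value."""
    real = pixel / 255.0                  # normalize to [0, 1]
    q = round(real / scale) + zero_point  # affine quantization
    return max(-128, min(127, q))         # clamp to the int8 range


# Hypothetical quantization parameters, chosen only for illustration.
scale, zero_point = 1.0 / 255.0, -128

print([quantize_pixel(p, scale, zero_point) for p in (0, 128, 255)])
```

Note that Python's `round()` rounds exact ties to the nearest even integer, so a C++ implementation may differ by one step on values that land exactly on .5.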

gitE0Z9 commented 3 years ago

We have found that it might have nothing to do with preprocessing such as normalization and quantization, but with the image itself. Here is the model output in different languages:

tflite in python : -8  32  -8  58  32  42  99  14   8 -13
tflite in cpp : -10 32  -5  58  30  41  97  15   8 -11

They are pretty close, and both tests used image data preprocessed with opencv (read in grayscale, then resized to 224 by 224). So I guess the problem comes from how the image data is captured and resized on the EVB?

Could Himax share some ideas with us?
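To put a number on "pretty close", the two output vectors quoted above can be compared directly. This is just a quick check on the values posted in this comment; the variable names are made up for illustration.

```python
# Model outputs copied from the two lines above.
py_out = [-8, 32, -8, 58, 32, 42, 99, 14, 8, -13]
cpp_out = [-10, 32, -5, 58, 30, 41, 97, 15, 8, -11]

# Element-wise absolute difference between the two runs.
diffs = [abs(a - b) for a, b in zip(py_out, cpp_out)]
max_diff = max(diffs)

# Index of the highest score (the predicted class) in each run.
top_py = py_out.index(max(py_out))
top_cpp = cpp_out.index(max(cpp_out))

print(max_diff, top_py, top_cpp)
```

The largest difference is 3 quantized steps and both runs pick the same class (index 6), which supports the guess that the PC-side pipelines agree with each other.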

kris-himax commented 3 years ago

Raw image: size (640, 480), type uint8_t. Layout: starts from the upper-left corner, (w0,h0) (w1,h0) ... (w639,h0) (w0,h1) (w1,h1) ... (w639,h479), where w denotes the width position and h denotes the height position.

Rescaled image: size (96, 96), type int8_t. If you want to convert it to uint8_t, just add 128 to each pixel. The first row, (w0,h0) (w1,h0) ... (w95,h0), is:

-46, -72, -73, -76, -74, -79, -72, -69, -68, -70, -77, -75, -78, -74, -73, -73, -75, -74, -78, -81, -82, -85, -86, -85, -83, -88, -85, -85, -87, -88, -86, -97, -101, -100, -98, -95, -92, -93, -88, -84, -76, -65, -50, -40, -37, -34, -35, -33, -29, -46, -40, -37, -34, -37, -44, -49, -57, -67, -79, -85, -89, -92, -93, -91, -96, -97, -99, -94, -93, -93, -97, -95, -95, -97, -96, -97, -96, -95, -96, -95, -97, -96, -97, -95, -97, -97, -98, -94, -92, -90, -74, -74, -90, -93, -97, -94
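The layout and the int8_t to uint8_t conversion described in this comment can be sketched as below. The helper names `raw_offset` and `int8_to_uint8` are hypothetical and not part of the SDK; the sketch only restates the row-major indexing and the "add 128" rule on the first few pixels of the row above.

```python
RAW_WIDTH, RAW_HEIGHT = 640, 480  # raw image size given in this comment


def raw_offset(w, h):
    """Index of pixel (w, h) in the row-major raw buffer."""
    return h * RAW_WIDTH + w


def int8_to_uint8(values):
    """Shift int8 pixels (-128..127) to uint8 (0..255) by adding 128."""
    return [v + 128 for v in values]


# First four pixels of the rescaled row quoted above.
first_pixels = [-46, -72, -73, -76]

print(int8_to_uint8(first_pixels))  # [82, 56, 55, 52]
print(raw_offset(639, 479))         # 307199, the last pixel of the raw image
```

Converting the board's buffer this way gives values on the same 0..255 scale as a grayscale image loaded on a PC, which makes a side-by-side comparison easier.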

gitE0Z9 commented 3 years ago

Hi, thanks for your kind help!

We have found out where the problem lies: the quality of the pictures captured by the camera is pretty low. Most of the image is blurry and sensitive to lighting conditions, and the resolution is not high to begin with. Another drawback is that the board overheats after operating for a while, which causes the serial connection to drop.

Conclusion: hardware problem. The SDK should be fine.

CKHSUinHimax commented 3 years ago

close