bitbank2 / JPEGDEC

An optimized JPEG decoder for Arduino
Apache License 2.0

Add SIMD support for ESP32S3 #56

Closed: modi12jin closed this issue 5 months ago

modi12jin commented 1 year ago

This passage was generated by ChatGPT

ESP32-S3 is a high-performance, low-power microcontroller that supports SSE (SIMD) instruction set, which can complete multiple operations in one instruction, improving code efficiency and running speed. Here is a sample code using SIMD instructions on ESP32-S3:

 #include "xtensa/hal.h"
 #include "esp_simd.h"
 #include <stdio.h>

 void simd_example(void)
 {
     SIMDDATA s1 = {1.0f, 2.0f, 3.0f, 4.0f};
     SIMDDATA s2 = {5.0f, 6.0f, 7.0f, 8.0f};
     SIMDDATA s3 = {0.0f, 0.0f, 0.0f, 0.0f};

     // Add the numbers in each element of s1 and s2, and store the result in s3
     s3 = SIMD_ADD(s1, s2);

     // output each element in s3
     printf("%f, %f, %f, %f\r\n", s3.f32[0], s3.f32[1], s3.f32[2], s3.f32[3]);

     // Multiply the numbers in each element of s1 and s2, and store the result in s3
     s3 = SIMDMUL(s1, s2);

     // output each element in s3
     printf("%f, %f, %f, %f\r\n", s3.f32[0], s3.f32[1], s3.f32[2], s3.f32[3]);
 }

In this sample code, the SIMD_ADD() and SIMDMUL() functions are functions that use SIMD instructions to complete addition and multiplication operations, and the SIMDDATA type is a pointer, which is used to point to a vector array containing 4 floating-point number elements. Using these functions can greatly improve code efficiency and execution speed.

It should be noted that to use SIMD instructions on ESP32-S3, you need to include the <xtensa/hal.h> and <esp_simd.h> header files, and use the -msimd option to enable SIMD instruction set support when compiling.

modi12jin commented 1 year ago

Maybe this passage will help you

https://github.com/espressif/idf-extra-components/issues/106

bitbank2 commented 1 year ago

I'm quite familiar with SIMD coding and would be happy to optimize my code for the S3, but I can't find the include files you referenced above. Do you have a working Github link to them?

modi12jin commented 1 year ago

@bitbank2 Thank you for your reply. Maybe the directory or file name has been changed, making the address he gave invalid.

https://github.com/espressif/esp-adf-libs/tree/master/esp_codec/include/codec

I can't find the header file esp_simd.h either. Maybe this issue helps:

https://github.com/espressif/esp-idf/issues/7745

https://github.com/espressif/esp-dsp

modi12jin commented 1 year ago

I saw on twitter that they have introduced SIMD instructions in the technical reference manual

https://mobile.twitter.com/eMbeddedHome/status/1570520252123062274

https://mobile.twitter.com/lovyan03/status/1622846385438720002

bitbank2 commented 1 year ago

I saw these references months ago, but no concrete examples. I thought you had new information. I'll keep searching for this info and when it actually becomes available, I'll implement it. For now, writing in ESP32 assembly language is not going to happen.

modi12jin commented 1 year ago

Many thanks! Looking forward to your work.

modi12jin commented 1 year ago

@bitbank2 I contacted one of Espressif's official staff, who said that there seems to be no fully open version of the SIMD documentation. There may be some clues hidden in esp-dsp.

bitbank2 commented 1 year ago

I would honestly like to work on this, but I have very little free time. It will need to be painless and well documented.

modi12jin commented 1 year ago

@bitbank2 I got some replies

https://github.com/espressif/esp-bsp/issues/154

modi12jin commented 1 year ago

@bitbank2 It should be possible to call the DSP like this from the Arduino.

https://github.com/espressif/esp-dsp/issues/11

https://github.com/espressif/arduino-esp32/issues/7710

#include <Arduino.h>
#include "dsps_biquad_gen.h"

void setup() {
  Serial.begin(115200);
  float coeffs[15] = {0}, f = 0.4, qFactor = 4;
  dsps_biquad_gen_lpf_f32(coeffs, f, qFactor);

  for (int i = 0; i < 15; i++) {
    Serial.printf("%f \n", coeffs[i]);
  }
}

void loop() {
}

bitbank2 commented 1 year ago

This DSP API library has been around for several years. It MAY be optimized for SIMD, but still doesn't really help any of my work.

modi12jin commented 9 months ago

@bitbank2 Sorry to bother you again! I have news: this component supports SIMD. Espressif staff told me that it only supports whole-frame decoding and cannot decode in blocks, and that the buffer used for decoding apparently needs to be 16-byte aligned.

https://github.com/espressif/esp-dev-kits/tree/master/esp32-s3-lcd-ev-board%2Fexamples%2Fusb_camera_lcd%2Fcomponents%2Fesp_jpeg

Components ported from ESP_ADF

bitbank2 commented 9 months ago

Unfortunately not helpful because they didn't release the source code.

modi12jin commented 9 months ago

@bitbank2 Sorry to bother you again, this may not be helpful, but I wanted to tell you the test results.

SIMD JPEG decoding currently works only on the whole frame, not on part of it; more is planned for the future.

The only thing that needs attention is that the buffer must be 16-byte aligned. I tested a 320x240 image on a box, and decoding to RGB565 took 42 ms on average. The performance on Arduino is really not good; I remember that decoding 800x480 under IDF took less than 50 ms.

JPEGDEC seems to take 68 ms.

https://github.com/esp-arduino-libs/ESP32_JPEG/blob/master/examples/DecodeTest/DecodeTest.ino
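
Editor's sketch of the 16-byte alignment requirement mentioned above, in portable C11. The helper names `round_up_16` and `alloc_rgb565_frame` are hypothetical and not part of ESP32_JPEG or JPEGDEC. The 2-bytes-per-pixel size assumes RGB565 output, as in the test described above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Round n up to the next multiple of 16; aligned_alloc expects the size
 * to be a multiple of the alignment. (Hypothetical helper.) */
static size_t round_up_16(size_t n)
{
    return (n + 15u) & ~(size_t)15u;
}

/* Allocate a 16-byte-aligned RGB565 frame buffer (2 bytes per pixel).
 * Returns NULL on failure; release with free(). (Hypothetical helper.) */
static uint16_t *alloc_rgb565_frame(size_t width, size_t height)
{
    size_t bytes = round_up_16(width * height * sizeof(uint16_t));
    return (uint16_t *)aligned_alloc(16, bytes);
}
```

A caller would request `alloc_rgb565_frame(320, 240)` and can verify the result with `((uintptr_t)ptr % 16) == 0`. On ESP-IDF, `heap_caps_aligned_alloc` from `esp_heap_caps.h` should serve the same purpose, though that is an assumption that has not been checked against the component in question.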

bitbank2 commented 5 months ago

I worked on this over the weekend and got some good results optimizing my JPEG decoder. I'll publish the code soon. What I find strange is your first comment on this issue: you show instructions, include files, and things that don't actually exist in the ESP32-S3 instruction set. According to Espressif's own documentation, the SIMD instructions support only integer operations and are somewhat limited. I'm continuing my search for more info, but so far the SIMD of the S3 is mostly disappointing.
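
Editor's sketch: because the SIMD instructions are described above as integer-only, a kernel such as JPEG color conversion would have to be written in fixed-point arithmetic before it could be vectorized. The scalar C below shows that arithmetic only. It is not JPEGDEC's code and uses no S3 SIMD instructions. The function names are hypothetical, and the constants are the standard JFIF YCbCr-to-RGB coefficients scaled by 2^16.

```c
#include <stdint.h>

/* Clamp a 32-bit intermediate to the 0..255 range of an 8-bit channel. */
static uint8_t clamp_u8(int32_t v)
{
    if (v < 0) return 0;
    if (v > 255) return 255;
    return (uint8_t)v;
}

/* Convert one YCbCr pixel to RGB565 using only integer math.
 * 91881, 22554, 46802 and 116130 are 1.402, 0.344136, 0.714136 and
 * 1.772 multiplied by 65536 and rounded. */
static uint16_t ycbcr_to_rgb565(uint8_t y, uint8_t cb, uint8_t cr)
{
    int32_t c_b = (int32_t)cb - 128;
    int32_t c_r = (int32_t)cr - 128;

    int32_t r = y + (91881 * c_r) / 65536;
    int32_t g = y - (22554 * c_b + 46802 * c_r) / 65536;
    int32_t b = y + (116130 * c_b) / 65536;

    uint8_t r8 = clamp_u8(r);
    uint8_t g8 = clamp_u8(g);
    uint8_t b8 = clamp_u8(b);

    /* Pack 5 bits of red, 6 of green, 5 of blue. */
    return (uint16_t)(((r8 >> 3) << 11) | ((g8 >> 2) << 5) | (b8 >> 3));
}
```

For example, a mid-gray input `(128, 128, 128)` packs to `0x8410`. A vectorized version would apply the same multiply, add, and clamp steps to several pixels per instruction.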

modi12jin commented 5 months ago

@bitbank2 This is JPEG SIMD decoding, which is now partially supported. You can try it and see how it works.

7_20231124_jpeg_block_decoder_esp32s3.zip

bitbank2 commented 5 months ago

I'm not interested in someone else's closed source code.