STMicroelectronics / STM32CubeG4

STM32Cube MCU Full Package for the STM32G4 series - (HAL + LL Drivers, CMSIS Core, CMSIS Device, MW libraries plus a set of Projects running on all boards provided by ST (Nucleo, Evaluation and Discovery Kits))
Other
182 stars 98 forks source link

Update __PKHBT and __PKHTB to improve intrinsics performance #39

Closed etheory closed 1 year ago

etheory commented 1 year ago

By commenting out PKHBT and PKHTB assembly versions, and instead using native C versions, you have absolutely crippled the performance of this header when using these instructions. In real world high performance code, I got a 6.58% speed improvement for the entire codebase JUST by making this change.

IMPORTANT INFORMATION

Contributor License Agreement (CLA)

etheory commented 1 year ago

Why was this change compared to the standard CMSIS headers even made in the first place? Was there a legitimate reason for it?

TOUNSTM commented 1 year ago

Hello @etheory,

Thank you for this report. We will get back to you as soon as we analyze it further. This may take some time. Thank you for your comprehension.

With regards,

etheory commented 1 year ago

@TOUNSTM what I am requesting is that you simply revert the changes you made to the mainline CMSIS library, since your changes make it perform very poorly. Hopefully it doesn't take very long to do that, thanks.

TOUNSTM commented 1 year ago

Hello @etheory,

Thank you for this report. Unfortunately we don't treat aspect related to cmsis_gcc.h in our GitHub repositories. These files come from Arm and we use them as is. So, these files are out of the scope of STMicroelectronics what about reporting this issue to the ARM GitHub?

Please allow me to close this issue now. Thank you again for your contribution.

With regards,

bryceschober commented 1 year ago

@TOUNSTM Sorry, you're just incorrect. If you actually look in the ARM Github repository to which you directed @etheory, you will see that its master version of cmsis_gcc.h has the assembler versions only, starting on line 2177: https://github.com/ARM-software/CMSIS_5/blob/master/CMSIS/Core/Include/cmsis_gcc.h#L2177

In fact, it looks like you are both right, and it's just that ST hasn't updated their CMSIS headers in approximately forever. I dug through the ARM repo history, and the change in commit 8a1fd913 looks like what @etheory is expecting to see here.

So the real question is: Why are the ST CMSIS headers so out-of-date, when ARM's versions have been stable/unchanged for a couple years now?

etheory commented 1 year ago

@TOUNSTM tagging you for a response. As demonstrated, what you say just isn't true. You need to update the CMSIS version to a newer one to get the performance improvement I outlined above. Thank you.