Closed LMESTM closed 3 years ago
What would be suggestion here? From the link in the #12116, this loos like it should be known issue with microlib.
It might be worth reaching out internal team who is in charge for microlib ?
For division, microlib is doing what's being advertised, which is being smaller and slower. And I'm inclined to think the trade-off is okay. It's 40 bytes instead of 340 bytes. 8 times slower (in this case), 8 times smaller. 8 times slower isn't enough to be that much of a problem, except when running at like 4MHz.
I'm more concerned about the 64-bit multiply (which the div1000
was using there). That is tens of times slower, AND many times bigger. It would be better in all regards to just use the standard library routine. We should raise this with the tools team.
That only seems to be a problem for M0/M0+/M23 - the microlib library routine for M3 is also catastrophically slow, but the compiler seems to not use it - it just inlines optimal code instead.
In the interim, we could paste in the standard library routines with #if MICROLIB
somewhere - that should override the currently-poor microlib definitions.
Note we're waiting from Arm to comment and confirm how to proceed.
Note this affects all Cortex-M0/M0+. @kjbracey-arm can you please review?
@kjbracey-arm any update on this?
This issue seems to also affect cortex M4 (STM32F303 for example). See #12880
A fix for this is now expected in a future ARM Compiler 6 update.
Current expectation is that future microlib will have the same smaller+faster 64-bit multiply as the standard library. No Mbed OS change required there.
It will default to the the current division operations, but the faster+bigger versions from the standard library will be selectable by an IMPORT
directive, which we'll have to add. That in turn could be put under a Mbed config option. So there will need to be a PR for that when the toolchain is available.
@LMESTM thank you for raising this issue.Please take a look at the following comments:
We cannot automatically identify a release based on the version of Mbed OS that you have provided. Please provide either a single valid sha of the form #abcde12 or #3b8265d70af32261311a06e423ca33434d8d80de or a single valid release tag of the form mbed-os-x.y.z . E.g. '2020.' has not been matched as a valid tag or sha. NOTE: If there are fields which are not applicable then please just add 'n/a' or 'None'.This indicates to us that at least all the fields have been considered. Please update the issue header with the missing information, the issue will not be mirroredto our internal defect tracking system or investigated until this has been fully resolved.
@LMESTM as this appears to be more of a compiler issue can we close this now ?
@LMESTM it has been 5 days since the last reminder. Could you please update the issue header as previously requested?
The latest release of Arm Compiler is 6.15 (this versin contains the fix). I reviewed the internal tickets, it should be in the release.
I'll close this now as resolved.
AFAIK, you will need to actually activate the fast division routines, by adding some sort of import directive somewhere. The libraries will default to slow.
(And there might be toolchain-backward-compatibility issues with that. Maybe you can make it a weak reference to not break old toolchains?)
From my comment above:
It will default to the the current division operations, but the faster+bigger versions from the standard library will be selectable by an
IMPORT
directive, which we'll have to add. That in turn could be put under a Mbed config option. So there will need to be a PR for that when the toolchain is available.
@LMESTM it has been 5 days since the last reminder. Could you please update the issue header as previously requested?
We will need to review the latest release and how to make them enabled.
@LMESTM it has been 5 days since the last reminder. Could you please update the issue header as previously requested?
@LMESTM it has been 5 days since the last reminder. Could you please update the issue header as previously requested?
@LMESTM it has been 5 days since the last reminder. Could you please update the issue header as previously requested?
@LMESTM it has been 5 days since the last reminder. Could you please update the issue header as previously requested?
@LMESTM it has been 5 days since the last reminder. Could you please update the issue header as previously requested?
@LMESTM it has been 5 days since the last reminder. Could you please update the issue header as previously requested?
@LMESTM it has been 5 days since the last reminder. Could you please update the issue header as previously requested?
@LMESTM it has been 5 days since the last reminder. Could you please update the issue header as previously requested?
Thank you for raising this detailed GitHub issue. I am now notifying our internal issue triagers. Internal Jira reference: https://jira.arm.com/browse/IOTOSM-2945
We closed this issue because it has been inactive for quite some time and we believe it to be low priority. If you think that the priority should be higher, then please reopen with your justification for increasing the priority.
Description of defect
Following investigations in #12116 , it turns out that division with microlib seems much slower than ARM lib (and possibly also other libs) on cores like cortex-M0.
Few measurement on a "divided by 1000" operation were measured and are reported here for background information:
Target(s) affected by this defect ?
ALL
Toolchain(s) (name and version) displaying this defect ?
microlib toolchain (µARM)
What version of Mbed-os are you using (tag or sha) ?
mbed-os master of Jan 2020.
What version(s) of tools are you using. List all that apply (E.g. mbed-cli)
Not relevant here.
How is this defect reproduced ?
See detailed history in #12116