Open doegox opened 5 years ago
thumb for saving space, ok.
BTW slurdge is on it, he did a first test moving everything to thumb and gained something like 5 to 10%. Not much, but this needs more testing to make sure nothing breaks and to see if there is better tuning to do.
5-10% is 12.5kb to 25kb. Not too shabby. I do wonder about some attack paths with time-critical components, though, in case thumb is slower.
thumb is not necessarily slower if the instructions fit nicely in the thumb encoding. it may even be faster (because the decode step is faster). The thing to watch for is the size of generated code which can be larger in specific sections of functions. With thumb I go down to ~81% on a 256k board, and basic testing doesn't show any differences. Of course it would be nice to have some benchmark :)
Yes, we have two issues.
btw, I am very happy to see you involved @slurdge !
@slurdge Did you do some benchmark?
We would need to make a decision about this one. Either thumb or keep as it is.
I just tried regular commands and they seem to work. I would be happy to try a benchmark but I'm not aware of any time measurement methods. My intuition would be to move almost everything (i.e. whole groups) to thumb.
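On the time measurement question, here is a minimal sketch of a timing harness, not something taken from the firmware. It uses the standard `clock()` so it runs on a host; on the device one would presumably read a hardware timer instead. `sample_workload` and `time_workload` are hypothetical names standing in for a real routine and its harness.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical stand-in for a firmware routine under test. */
static volatile unsigned int g_sink;
static int g_calls;

void sample_workload(void) {
    unsigned int acc = 0;
    for (unsigned int i = 0; i < 1000; i++) {
        acc = (acc << 1) ^ i;   /* cheap arithmetic to keep the CPU busy */
    }
    g_sink = acc;               /* volatile store so the loop is not optimized away */
    g_calls++;
}

/* Runs fn `iterations` times and returns the elapsed clock() ticks. */
clock_t time_workload(void (*fn)(void), int iterations) {
    assert(fn != NULL && iterations > 0);
    clock_t start = clock();
    for (int i = 0; i < iterations; i++) {
        fn();
    }
    return clock() - start;
}
```

Building the same workload once with `-mthumb` and once without, then comparing the two tick counts over enough iterations, would give a first rough ARM vs thumb number.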
we could move all LF stuff to thumb. Not much high tech stuff going on there.. well besides hitag2/s code.
Just to add a thought: we're talking about thumb, but pure thumb is not what we actually use, since we build with thumb-interwork to make it work alongside arm. How much might this break our assumptions here?
BTW We could try to enforce thumb as much as possible and rely on the NetBSD way of finding thumb-incompatibilities :
In a large codebase like NetBSD it becomes difficult to manually check if any one object file can be compiled to thumb mode. Luckily brute force works with the help of make option -k, as in keep going even one object file does not compile. By compiling whole tree with CPUFLAGS=-mthumb and MAKEFLAGS=-k, all of the build time failing machine dependent object files can be found, and marked with the help of Per file build options override to be compiled to ARM mode with thumb interworking.
@slurdge can you provide a case where thumb is actually faster than arm? In every benchmark I've tried, thumb2 delivers from 85% (worst case) to 125% (the famous faster-than-arm cases) of arm performance, for an average of 95%. But whatever the case, thumb (not thumb2) never did better than 83% of arm performance, with the worst case being 70% and the average (11EEBMCs) score being 72% of arm performance.
Now that we have 512kb to play with, the size isn't super important.
not everybody has 512, and RRG is open to everybody ;)
Doesn't mean we have to have a working 256kb fullimage.... although that would be a nice thing to offer.
@cjbrigato This is from memory, from when I was working on a similar processor inside the Nintendo DS. On my personal repo, I moved almost everything to thumb and nothing broke (in my rather limited test cases). And I would very much like to push for a 256K basic image :-) We can use the 512K for more advanced cases.
Exactly. Let's try staying <256k for PLATFORM!=PM3RDV4 (so without flash, spiffs, smartcard and usart)
@slurdge I'm quite sure these processors came with Thumb-2, as you've made reference to post-2003 architectures. But here we are on arm7tdmi, and thumb is unfortunately absolutely not Thumb-2; thumb is what we get when we -mthumb everything.
I found some slides with the benchmark I was sure I remembered correctly: https://elinux.org/images/8/8a/Experiment_with_Linux_and_ARM_Thumb-2_ISA.pdf Check slide 14, "Thumb-2 Performance", for the performance comparison with original thumb.
So I think benchmarks need to be done here.
And about the thumb compatibility: everything absolutely compiles and runs in full -mthumb without the interwork. In that case, a HF_COLIN + BT_ADDON Rdv4 fullimage is reduced to 240k, down from 278k. In comparison, a full-arm -mno-thumb-interwork full image is 319k.
At this stage, we are not able to run such an image.
We would still need the interwork code, drastically reducing the interest.
If we are to permit for some reason such an image, then the whole firmware has to be arm, including bootrom and any ASM part which would have been made THUMB-only.
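To put the three sizes above next to the 256k target, a small sketch of the arithmetic. It assumes the whole 256k is available to the image (it ignores whatever the bootrom takes) and that all figures use the same "k" unit; `fits_in_flash` and `saving_percent` are hypothetical helper names.

```c
#include <assert.h>
#include <stdbool.h>

/* True if an image of image_k fits in flash_k of flash (same unit for both). */
bool fits_in_flash(unsigned image_k, unsigned flash_k) {
    return image_k <= flash_k;
}

/* Size reduction going from before_k to after_k, in percent. */
double saving_percent(unsigned before_k, unsigned after_k) {
    assert(before_k > 0 && after_k <= before_k);
    return 100.0 * (double)(before_k - after_k) / (double)before_k;
}
```

With the numbers quoted here, only the 240k full-thumb image passes `fits_in_flash(..., 256)`; going from 278k to 240k is a saving of roughly 14%, and from 319k to 240k roughly 25%.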
Confirmed :
arm-none-eabi-readelf -a obj/fullimage.stage1.elf|grep Thumb
Tag_THUMB_ISA_use: Thumb-1
So let's have in mind we are talking about Thumb-1 here, so not anything close to what @slurdge was talking about.
Brace yourself, I'm ready to flash a bootrom in arm without thumb interwork (and as Thumb-enabled arm CPUs indeed boot in ARM mode, this implies I will be able to actually benchmark true full thumb vs full arm mode).
I can smell the bricking around.
It did not break. But not having start.c in thumb mode breaks the jump. I spotted where the bootrom enforces the thumb jump:
__asm("bx %0\n" : : "r"(((int)&_osimage_entry) | 0x1));
So I guess I have another run of bootrom flashing to make :'(
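For anyone wondering why that `| 0x1` matters: `bx` uses bit 0 of the target register to pick the instruction set, 1 for Thumb state and 0 for ARM state, and the bit itself is not part of the branch address. A minimal host-side sketch of that convention follows; the helper names and the example address are hypothetical, not taken from the bootrom.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Target value that makes bx enter Thumb state at `entry`. */
uintptr_t thumb_target(uintptr_t entry) {
    return entry | (uintptr_t)0x1;
}

/* Target value that makes bx stay in / enter ARM state at `entry`. */
uintptr_t arm_target(uintptr_t entry) {
    return entry & ~(uintptr_t)0x1;
}

/* True if bx on this value would switch the core to Thumb state. */
bool is_thumb_target(uintptr_t target) {
    return (target & 0x1) != 0;
}
```

So with an entry point at, say, 0x102000, the bootrom line above effectively jumps to `thumb_target(0x102000)`, i.e. 0x102001, which is wrong as soon as the code at the entry point is built as ARM.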
It works. Will now bench everything like my life depends on answering the ARM vs Thumb-1 question.
:-) I wasn't as extreme as you, just moved stuff from the ARMSRC to THUMBSRC. But I'm glad it works!
This took a turn... almost 10 months later we pushed some of this suggestion.
So we moved a lot... it's down to @cjbrigato to state what we need to do ;)
Thumb ISA is more limited, instructions are 2x smaller but you sometimes need more instructions, so, rule of thumb:
Thumb = more compact code, expect like 30% gain
ARM = faster code
=> keep ARM for speed-critical or time-critical code
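The rule of thumb above can be turned into a quick back-of-the-envelope model. This is only a sketch: it assumes every ARM instruction is 4 bytes and every Thumb instruction is 2 bytes, and it ignores literal pools, alignment padding and interwork veneers. The function names are hypothetical.

```c
#include <assert.h>

/* instr_ratio = (number of Thumb instructions) / (number of ARM instructions)
 * needed for the same code. Returns the fractional size gain of Thumb,
 * e.g. 0.30 for a 30% smaller image. */
double thumb_size_gain(double instr_ratio) {
    assert(instr_ratio > 0.0);
    /* thumb_bytes / arm_bytes = (2 * thumb_instrs) / (4 * arm_instrs) */
    return 1.0 - instr_ratio / 2.0;
}

/* Inverse: how many Thumb instructions per ARM instruction a given gain implies. */
double instr_ratio_for_gain(double gain) {
    assert(gain < 1.0);
    return 2.0 * (1.0 - gain);
}
```

Under these assumptions a 30% gain corresponds to about 1.4 Thumb instructions per ARM instruction, which is also a hint at where the speed loss comes from in hot loops.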