RfidResearchGroup / proxmark3

Iceman Fork - Proxmark3
http://www.icedev.se
GNU General Public License v3.0

More ARM -> Thumb ? #863

Open doegox opened 5 years ago

doegox commented 5 years ago

The Thumb ISA is more limited: instructions are half the size, but you sometimes need more of them. So, rule of thumb 🤭:

Thumb = more compact code, expect something like a 30% gain

ARM = faster code

=> keep ARM for speed-critical or time-critical code

iceman1001 commented 5 years ago

thumb for saving space, ok.

doegox commented 5 years ago

BTW slurdge is on it. He did a first test moving everything to thumb and gained something like 5 to 10%. Not much, but this needs more testing to make sure nothing breaks and to see if there is better tuning to do.

iceman1001 commented 5 years ago

5-10% is 12.5kb to 25kb. Not too shabby. I doubt some attack paths with time-critical components would like it if thumb is slower, though.

slurdge commented 5 years ago

thumb is not necessarily slower if the instructions fit nicely in the thumb encoding. it may even be faster (because the decode step is faster). The thing to watch for is the size of generated code which can be larger in specific sections of functions. With thumb I go down to ~81% on a 256k board, and basic testing doesn't show any differences. Of course it would be nice to have some benchmark :)

iceman1001 commented 5 years ago

Yes, we have two issues.

  1. flashing above the 256kb limit doesn't work. The current flasher bricks the device, and you need JTAG to recover.
  2. if we don't lose speed and get smaller, that is good for all 256kb devices, i.e. non-rdv4.

btw, I am very happy to see you involved @slurdge !

iceman1001 commented 5 years ago

@slurdge Did you run any benchmarks?

We would need to make a decision about this one. Either thumb or keep as it is.

slurdge commented 5 years ago

I just tried regular commands and they seem to work. I would be happy to try a benchmark, but I'm not aware of any time measurement methods. My intuition would be to move almost everything (i.e. groups) to thumb.
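One generic way to get a number for the benchmark question above is to read a free-running tick counter before and after the routine under test. The following is only a minimal sketch: `tick_fn`, `bench_ticks`, `fake_ticks` and `demo_work` are hypothetical names made up for illustration, not existing firmware functions, and the fake counter is a host-side stand-in for whatever hardware timer is actually available.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback: returns a free-running 32-bit tick counter. */
typedef uint32_t (*tick_fn)(void);
/* Routine under test. */
typedef void (*bench_fn)(void);

/* Run `fn` `iters` times and return the elapsed ticks.
 * Unsigned subtraction still yields the right delta if the counter
 * wraps around once during the measurement. */
static uint32_t bench_ticks(bench_fn fn, uint32_t iters, tick_fn now) {
    assert(fn != NULL && now != NULL && iters > 0);
    uint32_t start = now();
    for (uint32_t i = 0; i < iters; i++) {
        fn();
    }
    return now() - start;
}

/* Host-side stand-ins (assumptions) so the sketch can run without hardware:
 * the "work" simply advances a simulated counter by 3 ticks per call. */
static uint32_t fake_tick_value;
static uint32_t fake_ticks(void) { return fake_tick_value; }
static void demo_work(void) { fake_tick_value += 3u; }
```

Building the same harness once in ARM mode and once with `-mthumb`, then comparing the two tick counts for the same routine, would give the kind of comparison asked for here.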

iceman1001 commented 5 years ago

we could move all LF stuff to thumb. Not much high tech stuff going on there.. well besides hitag2/s code.

cjbrigato commented 5 years ago

Just to add my thought: we're talking about thumb, but we don't actually use pure thumb, since we use thumb-interwork to make it work alongside arm. How much might this break our assumptions here?

cjbrigato commented 5 years ago

BTW We could try to enforce thumb as much as possible and rely on the NetBSD way of finding thumb-incompatibilities :

In a large codebase like NetBSD it becomes difficult to manually check if any one object file can be compiled to thumb mode. Luckily brute force works with the help of make option -k, as in keep going even one object file does not compile. By compiling whole tree with CPUFLAGS=-mthumb and MAKEFLAGS=-k, all of the build time failing machine dependent object files can be found, and marked with the help of Per file build options override to be compiled to ARM mode with thumb interworking.

cjbrigato commented 5 years ago

@slurdge can you provide a case where thumb is actually faster than arm? In every benchmark I've tried, thumb2 can produce 85% (worst case) to 125% (the famous faster-than-arm cases) of the performance of arm, for an average of 95%. But whatever the case, thumb (not thumb2) never did better than 83% of arm performance, with the worst case being 70% and the average (11EEBMCs) score being 72% of arm performance.

iceman1001 commented 5 years ago

Now that we have 512kb to play with, the size isn't super important.

doegox commented 5 years ago

not everybody has 512, and RRG is open to everybody ;)

iceman1001 commented 5 years ago

Doesn't mean we have to have a working 256kb fullimage.... although that would be a nice thing to offer.

slurdge commented 5 years ago

@cjbrigato This is from memory, from when I was working on a similar processor inside the Nintendo DS. On my personal repo, I moved almost everything to thumb and nothing broke (in my rather limited test cases). And I would very much like to push for a 256K basic image :-) We can use the 512K for more advanced cases.

doegox commented 5 years ago

Exactly. Let's try staying <256k for PLATFORM!=PM3RDV4 (so without flash, spiffs, smartcard and usart)

cjbrigato commented 5 years ago

@slurdge I'm quite sure these processors came with Thumb2, as you've made reference to post-2003 architectures. But here we are on arm7tdmi, where thumb is absolutely not Thumb2, unfortunately, and thumb is what we get when we -mthumb everything.

I found some slides with the benchmark I was sure I was remembering correctly: https://elinux.org/images/8/8a/Experiment_with_Linux_and_ARM_Thumb-2_ISA.pdf (check slide 14, "Thumb-2 Performance", for a performance comparison against original thumb).

So here I think benchmarks need to be done.

And about thumb compatibility: everything compiles and runs in full -mthumb without the interwork. In that case, a HF_COLIN + BT_ADDON Rdv4 fullimage is reduced to 240k, down from 278k.

In comparison, a full-arm -mno-thumb-interwork full image is 319k. At this stage, we are not able to run such an image. We would still need the interwork code, drastically reducing the interest. If we are to permit for some reason such an image, then the whole firmware has to be arm, including bootrom and any ASM part which would have been made THUMB-only.
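To put the three image sizes quoted above side by side, here is a small arithmetic sketch. It assumes "k" means KiB and compares the image size directly against a 256 KiB budget, ignoring any space reserved for the bootrom; the helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the "256k" boards have a 256 KiB budget for the image. */
#define FLASH_BUDGET_KIB 256u

/* Size reduction from `before` to `after`, in tenths of a percent,
 * rounded to nearest (e.g. 137 means 13.7%). */
static uint32_t reduction_permille(uint32_t before, uint32_t after) {
    assert(before > 0 && after <= before);
    return ((before - after) * 1000u + before / 2u) / before;
}

/* Nonzero if an image of `image_kib` KiB fits within the assumed budget. */
static int fits_budget(uint32_t image_kib) {
    return image_kib <= FLASH_BUDGET_KIB;
}
```

With the figures from this comment, `reduction_permille(278, 240)` gives 137 (about 13.7% smaller than the interworking build) and `reduction_permille(319, 240)` gives 248 (about 24.8% smaller than the full-arm build). Of the three, only the 240k image passes `fits_budget`.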

cjbrigato commented 5 years ago

Confirmed :

 arm-none-eabi-readelf -a obj/fullimage.stage1.elf|grep Thumb
  Tag_THUMB_ISA_use: Thumb-1

So let's keep in mind that we are talking about Thumb-1 here, which is not anything close to what @slurdge was talking about.

cjbrigato commented 5 years ago

Brace yourself, I'm ready to flash a bootrom in arm without thumb interwork (and as Thumb-enabled arm cpus indeed boot in ARM mode, this implies I will be able to actually benchmark true full thumb vs full arm mode).

I can smell the bricking around.

cjbrigato commented 5 years ago

It did not break. But not having start.c in thumb mode breaks the jump. I spotted where the bootrom enforces the jump:

        __asm("bx %0\n" : : "r"(((int)&_osimage_entry) | 0x1));

So I guess I have another run of bootrom flashing to make :'(
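For context on the `| 0x1` in that snippet: on ARM cores with Thumb support, `bx` uses bit 0 of the target register to select the instruction set (1 for Thumb, 0 for ARM), and that bit is not part of the branch address. Below is a host-side sketch of the encoding only. The helper names are illustrative, and any address passed in is a stand-in, not the real `_osimage_entry`.

```c
#include <assert.h>
#include <stdint.h>

/* Value to hand to `bx` so the core continues in Thumb state. */
static uint32_t bx_target_thumb(uint32_t entry) {
    return entry | 0x1u;
}

/* Value to hand to `bx` so the core stays in ARM state.
 * ARM instructions are word aligned, so the low two bits must be clear. */
static uint32_t bx_target_arm(uint32_t entry) {
    assert((entry & 0x3u) == 0u);
    return entry;
}

/* Nonzero if `bx` to this value would switch to (or stay in) Thumb state. */
static int bx_selects_thumb(uint32_t target) {
    return (target & 0x1u) != 0u;
}

/* Address actually branched to: the state bit is stripped. */
static uint32_t bx_branch_address(uint32_t target) {
    return target & ~UINT32_C(1);
}
```

An OS image whose entry code is built as ARM would presumably need the bootrom to jump without the low bit set, which is consistent with the extra bootrom flashing round mentioned above.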

cjbrigato commented 5 years ago

It works. Will now bench everything like my life depends on answering the ARM vs Thumb-1 question.

slurdge commented 5 years ago

:-) I wasn't as extreme as you, just moved stuff from the ARMSRC to THUMBSRC. But I'm glad it works!

iceman1001 commented 4 years ago

This took a turn... almost 10 months later we pushed some of these suggestions.

iceman1001 commented 4 years ago

So we moved a lot... it's down to @cjbrigato to state what we need to do ;)