Emill / P256-cortex-ecdh

P256 ECDH for Cortex-M0 and Cortex-M4
BSD 2-Clause "Simplified" License

RISC-V SUPPORT #3

Open smartmx opened 2 years ago

smartmx commented 2 years ago

I'm trying to adapt this project to the RISC-V platform, but RISC-V has no carry or overflow flags for arithmetic. This means the Cortex assembly cannot simply be translated instruction by instruction and run on RISC-V. My current idea is to keep the overflow flag in a global variable, but that would greatly reduce the calculation speed. Do you have a better solution? Thanks!

Emill commented 2 years ago

The RISC-V platform is quite terrible when it comes to efficient big integer multiplication due to the lack of a carry flag and instructions like umaal.
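To illustrate what the missing carry flag costs, here is a minimal C sketch of an add-with-carry that recovers the carry by comparing the wrapped sum with an operand. On RISC-V such a comparison typically maps to an sltu instruction, so no global flag variable is needed. The names add_carry and add_limbs are hypothetical and are not part of this project.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns the low 32 bits of a + b + carry_in and
 * stores the carry-out (0 or 1) in *carry_out. */
static uint32_t add_carry(uint32_t a, uint32_t b, uint32_t carry_in,
                          uint32_t *carry_out)
{
    assert(carry_in <= 1u);
    uint32_t s  = a + b;        /* wraps modulo 2^32 */
    uint32_t c1 = s < a;        /* 1 if a + b overflowed */
    uint32_t r  = s + carry_in;
    uint32_t c2 = r < s;        /* 1 if adding the carry overflowed */
    *carry_out = c1 | c2;       /* c1 and c2 can never both be 1 */
    return r;
}

/* Hypothetical helper: adds two n-limb little-endian integers
 * (index 0 is the least significant limb) and returns the final carry. */
static uint32_t add_limbs(uint32_t *r, const uint32_t *a,
                          const uint32_t *b, size_t n)
{
    uint32_t carry = 0;
    for (size_t i = 0; i < n; i++)
        r[i] = add_carry(a[i], b[i], carry, &carry);
    return carry;
}
```

Each limb addition costs several instructions here instead of a single adcs on Cortex-M, which is the slowdown described above.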

There is a very long thread discussing this at https://groups.google.com/a/groups.riscv.org/g/isa-dev/c/Prak1SLbys8. You will find one post explaining how to rewrite one umaal instruction using eight instructions. If you really want to port this project to RISC-V, that could be one way to make that possible.
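As a rough illustration, the semantics of umaal (a 64-bit result equal to a*b + lo + hi, which can never overflow 64 bits) can be emulated with only 32-bit multiplies, adds and unsigned comparisons. The sketch below is one possible sequence and is not taken from the linked thread; mulhu32 and umaal_emul are hypothetical names, and mulhu32 only models the RISC-V mulhu instruction in portable C.

```c
#include <stdint.h>

/* Models RISC-V mulhu: the high 32 bits of an unsigned 32x32 product. */
static uint32_t mulhu32(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a * b) >> 32);
}

/* Hypothetical emulation of umaal: (*hi:*lo) = a * b + *lo + *hi.
 * The comments name the RISC-V instruction each step would roughly use. */
static void umaal_emul(uint32_t *lo, uint32_t *hi, uint32_t a, uint32_t b)
{
    uint32_t l = a * b;           /* mul   */
    uint32_t h = mulhu32(a, b);   /* mulhu */
    l += *lo;                     /* add   */
    h += (l < *lo);               /* sltu, add */
    l += *hi;                     /* add   */
    h += (l < *hi);               /* sltu, add */
    *lo = l;
    *hi = h;
}
```

The high word cannot wrap because the largest possible total, (2^32-1)^2 + 2*(2^32-1), equals exactly 2^64-1.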

The "solution" some vendors offer is a hardware accelerator, either for big integer multiplication or directly for ECC calculations. Espressif does this for their RISC-V devices, for example.

In my opinion both MIPS and RISC-V instruction sets are inferior compared to ARM instruction set. I guess RISC-V is used only to avoid the license fee of ARM.

smartmx commented 2 years ago

Thanks for your help! I will try it.

smartmx commented 2 years ago

Hi, Emill.

I have some questions.

In function P256_mul64, your comment is: /* in: (t0,t1) = a[0..1], (a2,a3) = b[0..1] out: a0-a3 */

Is t0 or t1 the high uint32 of the data? That is, t0 = (data>>32) or t0 = data&0xffffffff?

Is a2 or a3 the high uint32 of the data? That is, a2 = (data>>32) or a2 = data&0xffffffff?

For the output a0-a3, which register holds the high uint32 of the data?

Is the 128-bit output a0<<96 | a1<<64 | a2<<32 | a3, or a3<<96 | a2<<64 | a1<<32 | a0?

Thanks!

smartmx commented 2 years ago

Sorry, the comment above has some errors. Those register names are from my RISC-V port. Here is the question again with the original Cortex registers.

In function P256_mul64, your comment is: /* // in: (r4,r5) = a[0..1], (r2,r3) = b[0..1] // out: r0-r3 */

Is r4 or r5 the high uint32 of the data? That is, r4 = (data>>32) or r4 = data&0xffffffff?

Is r2 or r3 the high uint32 of the data? That is, r2 = (data>>32) or r2 = data&0xffffffff?

For the output r0-r3, which register holds the high uint32 of the data?

Is the 128-bit output r0<<96 | r1<<64 | r2<<32 | r3, or r3<<96 | r2<<64 | r1<<32 | r0?

Thanks!

Emill commented 2 years ago

the out128 data = r3<<96 | r2 <<64 | r1 << 32 | r0

Everything is little endian, so lower named registers contain lower bits.
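A small C sketch of that convention, with an array r[0..3] standing in for the registers r0-r3 (limbs_to_u64_pair is a hypothetical name used only for illustration):

```c
#include <stdint.h>

/* Hypothetical helper: r[0] is the least significant 32-bit limb and
 * r[3] the most significant, so the 128-bit value is
 * r[3]<<96 | r[2]<<64 | r[1]<<32 | r[0].
 * It is returned here as two 64-bit halves. */
static void limbs_to_u64_pair(const uint32_t r[4],
                              uint64_t *low, uint64_t *high)
{
    *low  = ((uint64_t)r[1] << 32) | r[0];   /* bits 0..63   */
    *high = ((uint64_t)r[3] << 32) | r[2];   /* bits 64..127 */
}
```

The same rule applies to the inputs: a[0] (r4) and b[0] (r2) hold the low 32 bits, a[1] (r5) and b[1] (r3) the high 32 bits.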

Note that RISC-V has the mulhu instruction, which you should use to get the high 32 bits of a 32x32-bit unsigned multiplication. Cortex-M0 lacks such an instruction, which is why it needs the quite large workaround you can see.
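For reference, a 64x64 to 128-bit multiplication built from mul and mulhu style operations might look like the following C sketch. It is not the project's P256_mul64 and not the actual RISC-V port. The names mulhu32, mac and mul64 are hypothetical, and the arrays a, b and r stand in for the register pairs and the output registers, least significant limb first.

```c
#include <stdint.h>

/* Models RISC-V mulhu: the high 32 bits of an unsigned 32x32 product. */
static uint32_t mulhu32(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a * b) >> 32);
}

/* Hypothetical helper: (*hi:*lo) = a * b + c + d, which always fits in 64 bits. */
static void mac(uint32_t a, uint32_t b, uint32_t c, uint32_t d,
                uint32_t *lo, uint32_t *hi)
{
    uint32_t l = a * b;
    uint32_t h = mulhu32(a, b);
    l += c; h += (l < c);   /* carry recovered with an unsigned compare */
    l += d; h += (l < d);
    *lo = l;
    *hi = h;
}

/* Schoolbook product of two 64-bit values given as little-endian limbs. */
static void mul64(const uint32_t a[2], const uint32_t b[2], uint32_t r[4])
{
    uint32_t lo, t1, t2;
    r[0] = a[0] * b[0];                      /* low word of a0*b0 */
    t1   = mulhu32(a[0], b[0]);              /* high word of a0*b0 */
    mac(a[0], b[1], t1, 0, &lo, &t1);        /* a0*b1 + t1 */
    mac(a[1], b[0], lo, 0, &r[1], &t2);      /* a1*b0 + lo */
    mac(a[1], b[1], t1, t2, &r[2], &r[3]);   /* a1*b1 + both carries */
}
```

Four multiplications of each kind are the minimum for the schoolbook method, so most of the remaining cost is the carry handling.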

smartmx commented 2 years ago

Yes, using mulhu makes P256_mul64 much easier: it is now about 25 lines of code. Thank you.