zcash / pasta_curves

Rust implementation for zcash/pasta
Other
80 stars 49 forks source link

improve double and refactoring #44

Open ashWhiteHat opened 2 years ago

ashWhiteHat commented 2 years ago

Abstract

Hi there. I improve the double method with bit shift and commonalize the field operation.

What I did

I would appreciate it if you could confirm. Thank you!

str4d commented 1 year ago

I want to take the first commit from this branch but not (yet) the second. I don't really want to maintain a second macro-based field implementation (the other being the proc-macro-based ff_derive). We should think about how we want to make the field impls more maintainable going forward, whether that be this declarative macro, ff_derive, something based on crypto-bigint, or another approach. But that should be separated from the doubling improvement.

str4d commented 1 year ago

I've opened #49 for discussing code de-duplication approaches. In the meantime, I will open another PR with the first commit from this branch, so we can get it reviewed.

str4d commented 1 year ago

Per https://github.com/zcash/pasta_curves/pull/50#issuecomment-1276980492, the manual bitshift-based doubling does not appear to be more efficient than the self.add(self) doubling.

ashWhiteHat commented 1 year ago

Hi @str4d Thank you for the review.

I think we can speed up if we implement as following. https://github.com/zero-network/zero/blob/master/primitive/crypto/src/arithmetic/limbs/bits_256/normal.rs#L29

fn dbc do 1 right shift, and return high and low. It can optimize double and square as well. https://github.com/zero-network/zero/blob/master/primitive/crypto/src/arithmetic/limbs/bits_256/normal.rs#L64

I am going to implement these if you think it's good idea.

ashWhiteHat commented 1 year ago
double Before ```asm ➜ pasta_curves git:(main) rustc +stable --version rustc 1.66.1 (90743e729 2023-01-10) ➜ pasta_curves git:(main) cargo +stable asm --lib "::double" ::double: push rbp mov rbp, rsp push rbx mov rax, qword, ptr, [rsi] mov r11, qword, ptr, [rsi, +, 8] lea r10, [rax, +, rax] mov r9, qword, ptr, [rsi, +, 16] mov rsi, qword, ptr, [rsi, +, 24] xor edx, edx movabs r8, 7409212017489215487 add r8, r10 adc rdx, -1 shrd rax, r11, 63 sar rdx, 63 add rax, rdx adc rdx, 0 movabs r10, -2469829653914515739 add r10, rax adc rdx, -1 shrd r11, r9, 63 sar rdx, 63 add r11, rdx adc rdx, 0 shld rsi, r9, 1 sar rdx, 63 add rsi, rdx adc rdx, 0 movabs r9, -4611686018427387904 add r9, rsi adc rdx, -1 movabs rsi, -7409212017489215487 and rsi, rdx movabs rcx, 2469829653914515739 and rcx, rdx movabs rbx, 4611686018427387904 and rbx, rdx add rsi, r8 adc rcx, r10 adc r11, 0 mov rax, rdi adc rbx, r9 mov qword, ptr, [rdi], rsi mov qword, ptr, [rdi, +, 8], rcx mov qword, ptr, [rdi, +, 16], r11 mov qword, ptr, [rdi, +, 24], rbx pop rbx pop rbp ret ```
double After ```asm ➜ pasta_curves git:(feat/improve-double-and-refactoring) rustc +stable --version rustc 1.66.1 (90743e729 2023-01-10) ➜ pasta_curves git:(feat/improve-double-and-refactoring) cargo +stable asm --lib "::double" ::double: push rbp mov rbp, rsp mov rax, qword, ptr, [rsi] mov r9, qword, ptr, [rsi, +, 8] mov rcx, r9 shld rcx, rax, 1 add rax, rax mov rdx, qword, ptr, [rsi, +, 16] shrd r9, rdx, 63 mov rsi, qword, ptr, [rsi, +, 24] shld rsi, rdx, 1 xor edx, edx movabs r8, 7409212017489215487 add r8, rax adc rdx, -1 sar rdx, 63 add rcx, rdx adc rdx, 0 movabs rax, -2469829653914515739 add rax, rcx adc rdx, -1 sar rdx, 63 add r9, rdx adc rdx, 0 sar rdx, 63 add rsi, rdx adc rdx, 0 movabs r10, -4611686018427387904 add r10, rsi adc rdx, -1 movabs r11, -7409212017489215487 and r11, rdx movabs rcx, 2469829653914515739 and rcx, rdx movabs rsi, 4611686018427387904 and rsi, rdx add r11, r8 adc rcx, rax adc r9, 0 mov rax, rdi adc rsi, r10 mov qword, ptr, [rdi], r11 mov qword, ptr, [rdi, +, 8], rcx mov qword, ptr, [rdi, +, 16], r9 mov qword, ptr, [rdi, +, 24], rsi pop rbp ret ```
_ Before After
push 2 1
mov 10 11
add 7 8
adc 9 9
xor 1 1
movabs 6 6
shrd 2 1
shld 1 2
lea 1 0
sar 3 3
and 3 3
pop 2 1
ret 1 1
SUM 48 47

soooo slightly improved 😅 It seems near to refactoring.