Open zkbitcoin opened 4 months ago
Adding the following to the assembly, right before the data is pushed to rdi and the routine returns, fixes it. I will open a pull request later.
;comparison
    cmp r15,[modulus + 24]
    jc done
    jnz sq
    cmp r14,[modulus + 16]
    jc done
    jnz sq
    cmp r13,[modulus + 8]
    jc done
    jnz sq
    cmp r12,[modulus + 0]
    jc done
    jnz sq
sq:
    sub r12,[modulus + 0]
    sbb r13,[modulus + 8]
    sbb r14,[modulus + 16]
    sbb r15,[modulus + 24]
done:
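For readers who prefer C, here is a minimal sketch of what the assembly above does, assuming r12..r15 hold the result limbs from least to most significant. The name cond_sub and the array layout are hypothetical and not taken from the project.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C model of the assembly fix: r and m each hold four 64-bit
   limbs, least significant first. If r >= m, subtract m once; otherwise
   leave r unchanged. */
static void cond_sub(uint64_t r[4], const uint64_t m[4])
{
    assert(r != 0 && m != 0);

    /* Compare from the most significant limb down (the cmp/jc/jnz chain). */
    for (int i = 3; i >= 0; i--) {
        if (r[i] < m[i]) return;  /* jc done: r < m, nothing to subtract */
        if (r[i] > m[i]) break;   /* jnz sq: r > m, go subtract */
        /* equal limbs: keep comparing; all equal falls through to subtract */
    }

    /* Subtract limb by limb, propagating the borrow (the sub/sbb chain). */
    uint64_t borrow = 0;
    for (int i = 0; i < 4; i++) {
        uint64_t d  = r[i] - m[i];
        uint64_t b1 = r[i] < m[i];   /* borrow from r[i] - m[i] */
        uint64_t b2 = d < borrow;    /* borrow from subtracting the carry-in */
        r[i] = d - borrow;
        borrow = b1 | b2;
    }
}
```

Calling cond_sub(r, modulus) on a value in the range [0, 2*modulus) leaves it in [0, modulus), which is the range the caller expects.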
While testing with large limbs, I found that the output data is wrong (see the simple test bed at https://github.com/zkbitcoin/nasm-adx).
Running the example produces the output below (the MUL in asm(MUL comes from the project's asm_macros.hpp). This is most likely a case of overflow; see limbs_a[0] = 9293073166814171452ULL.
input limbs:
uint64_t limbs_r[4] = {};
uint64_t limbs_a[4] = {9293073166814171452ULL,4158907695144192454,2644031866505052884,3024693275553353487};
uint64_t limbs_b[4] = {2812702673390851119,5479905877917956870,1104182671213310543,818574998703379345};
generates:
asm(MUL
limbs_r[0] is 12178871726809496723
limbs_r[1] is 13840435079915171493
limbs_r[2] is 16771051252808782701
limbs_r[3] is 3578015002697288320
Calculating the Montgomery multiplication by hand should give the following (this is the correct output):
limbs_r[0] is 7846254855529840460
limbs_r[1] is 2923310935437288472
limbs_r[2] is 3489859301534087952
limbs_r[3] is 91016735894317655
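A quick way to relate the two outputs above is to subtract the expected limbs from the asm limbs. The sketch below does that with a plain borrow chain and compares the difference against a modulus. The modulus is an assumption: the issue does not name it, and the constants here are what I believe to be the BN254 base field modulus, least significant limb first. The helper names sub4 and off_by_one_modulus are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Output produced by the asm path, copied from the issue. */
static const uint64_t got[4] = {
    12178871726809496723ULL, 13840435079915171493ULL,
    16771051252808782701ULL, 3578015002697288320ULL
};

/* Expected output computed by hand, copied from the issue. */
static const uint64_t expected[4] = {
    7846254855529840460ULL, 2923310935437288472ULL,
    3489859301534087952ULL, 91016735894317655ULL
};

/* ASSUMPTION: BN254 base field modulus, least significant limb first.
   The issue text does not state which modulus is in use. */
static const uint64_t assumed_modulus[4] = {
    0x3C208C16D87CFD47ULL, 0x97816A916871CA8DULL,
    0xB85045B68181585DULL, 0x30644E72E131A029ULL
};

/* out = a - b over four limbs; returns the final borrow (0 or 1). */
static uint64_t sub4(uint64_t out[4], const uint64_t a[4], const uint64_t b[4])
{
    uint64_t borrow = 0;
    for (int i = 0; i < 4; i++) {
        uint64_t d  = a[i] - b[i];
        uint64_t b1 = a[i] < b[i];
        uint64_t b2 = d < borrow;
        out[i] = d - borrow;
        borrow = b1 | b2;
    }
    return borrow;
}

/* Returns 1 if got - expected equals the assumed modulus exactly. */
static int off_by_one_modulus(void)
{
    uint64_t diff[4];
    if (sub4(diff, got, expected) != 0) return 0;
    for (int i = 0; i < 4; i++) {
        if (diff[i] != assumed_modulus[i]) return 0;
    }
    return 1;
}
```

If the check holds, the asm output is the expected result plus exactly one modulus. That would be consistent with a missing final conditional subtraction rather than a lost carry in the limbs, though this rests on the modulus assumption stated above.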