I reviewed the BoringSSL codebase and compared it with our 25519 implementation. They have:
[ ] More tables: our curve25519_tables.h contains whatever BoringSSL uses when "OPENSSL_SMALL" is defined, but BoringSSL also has a const uint8_t k25519Precomp[32][8][3][32]. If we're keen on performance, we may want to measure both speed and size - see #196 and https://github.com/hannesm/mirage-crypto/tree/25519-big-table (it does not look like an improvement on all CPUs: a slowdown by a factor of 0.8 on AMD Ryzen 7 3700X (3.6 GHz) and AMD Ryzen 9 7950X (4.5 GHz), a speedup by a factor of 1.7 on i7-5600U (2.60 GHz))
[ ] ADX for base-point multiplication (according to commit 9d4f833eec8205e7ad257fb7e7cb321270d3e3cb, this gives around +25% in Ed25519 key generation and signing operations)
[ ] NEON (ARM) if someone cares about ARM processors
[x] Simpler square-root computation for Ed25519 (commit 0fc57bef1821c163ac023a0aa96e4fb2a67c0d82) - see #196
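
To put a number on the "size" side of the big-table trade-off, here is a minimal sketch in C that computes the footprint of an array with the same shape as k25519Precomp. The names precomp_table_t and precomp_table_bytes are hypothetical and exist only for this illustration. It measures the table data alone, not the final binary size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same shape as BoringSSL's k25519Precomp[32][8][3][32].
 * The typedef name is hypothetical, used only for this sketch. */
typedef uint8_t precomp_table_t[32][8][3][32];

/* Compile-time check: 32 * 8 * 3 * 32 = 24576 bytes (24 KiB). */
static_assert(sizeof(precomp_table_t) == 24576,
              "unexpected size for the large precomputed table");

/* Returns the size in bytes of the large table. */
size_t precomp_table_bytes(void) {
    return sizeof(precomp_table_t);
}
```

So enabling the large table adds 24 KiB of read-only data, which is the cost to weigh against the per-CPU timings quoted above.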