Cryptographic Addition Chain Generation in Go
addchain
generates short addition chains for exponents of cryptographic
interest with results rivaling the best hand-optimized chains.
Intended as a building block in elliptic curve or other cryptographic code
generators.
An addition chain for a target integer n is a sequence of numbers starting at 1 and ending at n such that every term is a sum of two numbers appearing earlier in the sequence. For example, an addition chain for 29 is
1, 2, 4, 8, 9, 17, 25, 29
Addition chains arise in the optimization of exponentiation algorithms with
fixed exponents. For example, the addition chain above corresponds to the
following sequence of multiplications to compute x29
x2 = x1 * x1 x4 = x2 * x2 x8 = x4 * x4 x9 = x1 * x8 x17 = x8 * x9 x25 = x8 * x17 x29 = x4 * x25
An exponentiation algorithm for a fixed exponent n reduces to finding a
minimal length addition chain for n. This is especially relevent in
cryptography where exponentiation by huge fixed exponents forms a
performance-critical component of finite-field arithmetic. In particular,
constant-time inversion modulo a prime p is performed by computing
xp-2 (mod p)
, thanks to Fermat's Little
Theorem. Square root
also reduces to exponentiation for some prime moduli. Finding short addition
chains for these exponents is one important part of high-performance finite
field implementations required for elliptic curve cryptography or RSA.
Minimal addition chain search is famously hard. No practical optimal
algorithm is known, especially for cryptographic exponents of size 256-bits
and up. Given its importance for the performance of cryptographic
implementations, implementers devote significant effort to hand-tune addition
chains. The goal of the addchain
project is to match or exceed the best
hand-optimized addition chains using entirely automated approaches, building
on extensive academic research and applying new tweaks that exploit the
unique nature of cryptographic exponents.
The following table shows the results of the addchain
library on popular
cryptographic exponents. For each one we also show the length of the best
known hand-optimized addition chain, and the
delta from the library result.
Name | This Library | Best Known | Delta |
---|---|---|---|
Curve25519 Field Inversion | 266 | 265 | +1 |
NIST P-256 Field Inversion | 266 | 266 | +0 |
NIST P-384 Field Inversion | 397 | 396 | +1 |
secp256k1 (Bitcoin) Field Inversion | 269 | 269 | +0 |
Curve25519 Scalar Inversion | 283 | 284 | -1 |
NIST P-256 Scalar Inversion | 294 | 292 | +2 |
NIST P-384 Scalar Inversion | 434 | 433 | +1 |
secp256k1 (Bitcoin) Scalar Inversion | 293 | 290 | +3 |
See full results listing for more detail and results for less common exponents.
These results demonstrate that addchain
is competitive with hand-optimized
chains, often with equivalent or better performance. Even when addchain
is
slightly sub-optimal, it can still be considered valuable since it fully
automates a laborious manual process. As such, addchain
can be trusted to
produce high quality results in an automated code generation tool.
Install a pre-compiled release binary:
curl -sSfL https://git.io/addchain | sh -s -- -b /usr/local/bin
Alternatively build from source:
go install github.com/mmcloughlin/addchain/cmd/addchain@latest
Search for a curve25519 field inversion addition chain with:
addchain search '2^255 - 19 - 2'
Output:
addchain: expr: "2^255 - 19 - 2"
addchain: hex: 7fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffeb
addchain: dec: 57896044618658097711785492504343953926634992332820282019728792003956564819947
addchain: best: opt(runs(continued_fractions(dichotomic)))
addchain: cost: 266
_10 = 2*1
_11 = 1 + _10
_1100 = _11 << 2
_1111 = _11 + _1100
_11110000 = _1111 << 4
_11111111 = _1111 + _11110000
x10 = _11111111 << 2 + _11
x20 = x10 << 10 + x10
x30 = x20 << 10 + x10
x60 = x30 << 30 + x30
x120 = x60 << 60 + x60
x240 = x120 << 120 + x120
x250 = x240 << 10 + x10
return (x250 << 2 + 1) << 3 + _11
Next, you can generate code from this addition chain.
Install:
go get -u github.com/mmcloughlin/addchain
Algorithms all conform to the alg.ChainAlgorithm
or
alg.SequenceAlgorithm
interfaces and can be used directly. However the
most user-friendly method uses the alg/ensemble
package to
instantiate a sensible default set of algorithms and the alg/exec
helper to execute them in parallel. The following code uses this method to
find an addition chain for curve25519 field inversion:
func Example() {
// Target number: 2²⁵⁵ - 21.
n := new(big.Int)
n.SetString("7fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffeb", 16)
// Default ensemble of algorithms.
algorithms := ensemble.Ensemble()
// Use parallel executor.
ex := exec.NewParallel()
results := ex.Execute(n, algorithms)
// Output best result.
best := 0
for i, r := range results {
if r.Err != nil {
log.Fatal(r.Err)
}
if len(results[i].Program) < len(results[best].Program) {
best = i
}
}
r := results[best]
fmt.Printf("best: %d\n", len(r.Program))
fmt.Printf("algorithm: %s\n", r.Algorithm)
// Output:
// best: 266
// algorithm: opt(runs(continued_fractions(dichotomic)))
}
This section summarizes the algorithms implemented by addchain
along with
references to primary literature. See the bibliography
for the complete references list.
The alg/binary
package implements the addition chain equivalent
of the basic square-and-multiply exponentiation
method. It is
included for completeness, but is almost always outperformed by more advanced
algorithms below.
The alg/contfrac
package implements the continued fractions
methods for addition sequence search introduced by
Bergeron-Berstel-Brlek-Duboc in 1989 and later extended. This approach
utilizes a decomposition of an addition chain akin to continued fractions,
namely
(1,..., k,..., n) = (1,...,n mod k,..., k) ⊗ (1,..., n/k) ⊕ (n mod k).
for certain special operators ⊗ and ⊕. This
decomposition lends itself to a recursive algorithm for efficient addition
sequence search, with results dependent on the strategy for choosing the
auxillary integer k. The alg/contfrac
package provides a
laundry list of strategies from the literature: binary, co-binary,
dichotomic, dyadic, fermat, square-root and total.
Bos and Coster described an iterative algorithm for efficient addition
sequence generation in which at each step a heuristic proposes new numbers
for the sequence in such a way that the maximum number always decreases.
The original Bos-Coster paper defined four
heuristics: Approximation, Divison, Halving and Lucas. Package
alg/heuristic
implements a variation on these heuristics:
Divison and Lucas are not implemented due to disparities in the literature about their precise definition and poor results from early experiments. Furthermore, this library does not apply weights to the heuristics as suggested in the paper, rather it simply uses the first that applies. However both of these remain possible avenues for improvement.
Dictionary methods decompose the binary representation of a target integer n into a set of dictionary terms, such that n may be written as a sum
n = ∑ 2ei di
for exponents e and elements d from a dictionary D. Given such a decomposition we can construct an addition chain for n by
The efficiency of this approach boils down to the decomposition method. The alg/dict
package provides:
The runs algorithm is a custom variant of the dictionary approach that
decomposes a target into runs of ones. It leverages the observation that
building a dictionary consisting of runs of 1s of lengths
l1, l2, ..., lk
can itself be
reduced to:
li
. As with dictionary approaches we can use
Bos-Coster heuristics and continued fractions here. However here we have the
advantage that the li
are typically very small,
meaning that a wider range of algorithms can be brought to bear.li
to build an addition sequence for the runs themselves
r(li)
where r(e) = 2e-1
. See
dict.RunsChain
.This approach has proved highly effective against cryptographic exponents which frequently exhibit binary structure, such as those derived from Solinas primes.
I have not seen this method discussed in the literature. Please help me find references to prior art if you know any.
Close inspection of addition chains produced by other algorithms revealed
cases of redundant computation. This motivated a final optimization pass over
addition chains to remove unecessary steps. The alg/opt
package
implements the following optimization:
These micro-optimizations were vital in closing the gap between addchain
's
automated approaches and hand-optimized chains. This technique is reminiscent
of basic passes in optimizing compilers, raising the question of whether
other compiler optimizations could apply to addition
chains?
I have not seen this method discussed in the literature. Please help me find references to prior art if you know any.
If you use addchain
in your research a citation would be appreciated.
Citing a specific release is preferred, since they are archived on
Zenodo and assigned a DOI. Please use the
following BibTeX to cite the most recent 0.4.0
release.
@misc{addchain,
title = {addchain: Cryptographic Addition Chain Generation in Go},
author = {Michael B. McLoughlin},
year = 2021,
month = oct,
howpublished = {Repository \url{https://github.com/mmcloughlin/addchain}},
version = {0.4.0},
license = {BSD 3-Clause License},
doi = {10.5281/zenodo.5622943},
url = {https://doi.org/10.5281/zenodo.5622943},
}
If you need to cite a currently unreleased version please consider filing an issue to request a new release, or to discuss an appropriate format for the citation.
Thank you to Tom Dean, Riad Wahby, Brian Smith and str4d for advice and encouragement. Thanks also to Damian Gryski and Martin Glancy for review.
Contributions to addchain
are welcome:
addchain
is available under the BSD 3-Clause License.