sipa / bech32

Code snippets and analysis of the Bech32 format
191 stars 107 forks source link

Canonically encode powers of primitive element e #64

Closed meshcollider closed 2 years ago

meshcollider commented 2 years ago

There is no behaviour change in this PR.

Currently, the primitive element e generating the multiplicative group of GF(1024) is encoded as an integer in GF1024_EXP[] as ({9}, {15}), or 303. Because of this choice, the table is generated in the following way: For an element (v1, v0), we compute e*(v1, v0) = (v1', v0') as:

    v1' = {6}*v1 + {9}*v0
    v0' = {27}*v1 + {15}*v0

(from https://github.com/sipa/ezbase32/blob/master/bech32.cpp#L90-L122)

After asking sipa about this choice of encoding, he couldn't recall why the ({9},{15}) was chosen. It has no effect on the checksum, as we are not modifying the choice of primitive element itself, only the encoding of it in the GF1024_EXP[] table. For clarity, this PR instead uses the clearer choice of ({1},{0}) to encode e. This encoding is simpler because elements a*e + b in GF(1024) are then directly encoded as (a, b). The multiplication by e then becomes:

    v1' = {9}*v1 + v0
    v0' = {23}*v1

Where the {9} and {23} come directly from the choice of modulus for the extension field x^2 + 9*x + 23.

We then update the syndrome computation values accordingly, which just encode powers of two * powers of e.

meshcollider commented 2 years ago

For review, the tables can be generated with the following sage code:

GF1024_LOG = [-1] + [0] * 1023
GF1024_EXP = [1] * 1024
v = 1
for i in range(1, 1023):
    v0 = v & 31
    v1 = v >> 5
    v0n = (F.fetch_int(23)*F.fetch_int(v1)).integer_representation()
    v1n = (F.fetch_int(9)*F.fetch_int(v1) + F.fetch_int(v0)).integer_representation()
    v = v1n << 5 | v0n
    GF1024_EXP[i] = v
    GF1024_LOG[v] = i

And the syndrome constants can be generated using:

for k in range(1, 6):
    for b in [1,2,4,8,16]:
        c0 = GF1024_EXP[(997*k + GF1024_LOG[b]) % 1023]
        c1 = GF1024_EXP[(998*k + GF1024_LOG[b]) % 1023]
        c2 = GF1024_EXP[(999*k + GF1024_LOG[b]) % 1023]
        c = c2 << 20 | c1 << 10 | c0
        print("0x%x" % c)

The canonical encoding of every unit of GF(1024) can be checked with the following code:

for i in range(1023):
    v = GF1024_EXP[i]
    v0 = F.fetch_int(v & 31)
    v1 = F.fetch_int(v >> 5)
    if (e^i).list() != [v0, v1]:
        print("Error:", i)
NeedsAdjustment commented 2 years ago

looks good to me :)

sipa commented 2 years ago

I've recreated the same tables using the following Sage code (which makes use of Sage's ability to do extension field arithmetic):

# Binary field
B.<b> = GF(2)
# Polynomials over the binary field
BP.<bp> = B[]
# Encoder field GF(32)
F.<f> = GF(32, modulus=bp^5 + bp^3 + 1, repr='int')
# Polynomials over the encoder field
FP.<fp> = F[]
# Decoder field GF(1024)
E.<e> = F.extension(modulus=fp^2 + F.fetch_int(9)*fp + F.fetch_int(23))

# Convert an E field element to an integer 0..1023.
def gf_to_int(v):
    return v[0].integer_representation() + 32*v[1].integer_representation()

# Convert an integer 0..1023 to an E field element.
def int_to_gf(v):
    return F.fetch_int(v & 31) + e*F.fetch_int(v >> 5)

GF1024_EXP = [gf_to_int(e**i) for i in range(1024)]
GF1024_LOG = [-1] + [discrete_log(int_to_gf(i), e, ord=1023, operation="*") for i in range(1, 1024)]

print(GF1024_EXP)
print(GF1024_LOG)

a1, a2, a3 = e**997, e**998, e**999
for k in range(1, 6):
    for b in [1,2,4,8,16]:
        c = (gf_to_int(a1**k * int_to_gf(b)) |
             gf_to_int(a2**k * int_to_gf(b)) << 10 |
             gf_to_int(a3**k * int_to_gf(b)) << 20)
        print("0x%08x" % c)
sipa commented 2 years ago

ACK 42b570dbd99c02424157596c45d69cc2a3d54da9