Convert-Group / pairing-functions

A collection of pairing functions.
https://pypi.org/project/pairing-functions
MIT License
30 stars 4 forks source link

Doesn't work for large integers #3

Open kuco23 opened 2 years ago

kuco23 commented 2 years ago

This is probably an issue with the the numerical calculation of the square root:

>>> from pairing_functions.cantor import pair, unpair
>>> unpair(pair(129315199267255490, 392198719615119))
>>> (129707397986870608, 0)

I'm wondering if the integer-root can be efficiently calculated non-numerically.

kuco23 commented 2 years ago

I've read a bit about this here and came across Newton's method. Implementation for Cantor pairing seems to work

# from math import isqrt
def isqrt(n):
    x = n
    y = (x + 1) // 2
    while y < x:
        x = y
        y = (x + n // x) // 2
    return x

def pair(k1, k2):
    return (k1 + k2) * (k1 + k2 + 1) // 2 + k2

def unpair(z):
    a = 8 * z + 1
    # rewrite the pairing thing as sqrt((8z + 1)/4) - 1/2
    # floor(sqrt(a/4) - 1/2) is in [floor(a/4), floor(a/4)-1]
    # depending on whether sqrt(a/4) has the first decimal number < 5
    m = isqrt(100 * a // 4)
    w = m // 10

    if m % 10 < 5:
        w -= 1

    t = (w * w + w) // 2
    y = z - t
    x = w - y
    return x, y

The pairing thing refers to w inside the calculation of unpair from here

ilias-ant commented 2 years ago

Hello @kuco23!

Thank you for raising this (interesting) behavior! We will try to find some time to review it more thoroughly in the next couple of weeks.

But, since you've already done the majority of the mental work, it would be even better if you would like to submit a Pull Request on this and discuss it further there =)

p.s. If you have any other feedback as a package user, always glad to hear it!

kuco23 commented 2 years ago

Maybe it's not that relevant, but there exists Gödel numbering, which pairs any sequence, while also encoding its length (so, when unpairing you don't need to specify it). In a way, it's a generalization of pairing. It's pretty useless, though somewhat mathematically relevant.

But the same functionality can be implemented efficiently by e.g. pairing (1,2,3) to 1|2|3 and then interpreting | as an extra symbol in the base-11 system. Or interpreting each 1,2 and 3 in binary, then separating those by the symbol 2 (and maybe then interpreting that in trinary system, as only 3 symbols are ever used).

Just brainstorming here. I needed some of those things in relation to a cryptography project.