quinnj / JSON3.jl

Other
214 stars 47 forks source link

Unable to write negative big integers #242

Closed Teo-ShaoWei closed 1 year ago

Teo-ShaoWei commented 1 year ago

issue faced

I was trying to save my current random state when I reach an error, e.g. with JSON3.write(Random.MersenneTwister(3)).

I narrowed it down to the negative big integers, e.g.

julia> JSON3.write(BigInt(-4))
ERROR: MethodError: no method matching Unsigned(::BigInt)

In this part of the code: https://github.com/quinnj/JSON3.jl/blob/45e5494295076e044f402fed6c2f8b9edf7cdbd9/src/write.jl#L198-L211

we use Base.split_sign, which fails for negative integers.

possible resolutions

I think there's 4 places in the code where we can do things. However, it seems to me that option 4 seems the most viable to me. I'll list them out for discussion with @quinnj and the rest of us first in case there's something better. Will raise a PR in a while if nothing comes up.

1. add unsigned definitions for BigInt

Add both the following:

unsigned(x::BigInt) = x
Unsigned(x::BigInt) = x

into gmp.jl.

My guess is this is likely a no go with the Julia maintainers as there are invariance assumptions that Julia devs will make, e.g. unsigned(x) <: Unsigned. Breaking this for BigInt within the core might potentially break some other code somewhere else, opening up a big can of worms.

2. (dis)patch split_sign

Seems like this is a private local function used to support string(n::Integer) inside intfuncs.jl. Each of the specialized converter, for our case being dec, seems to be using bit ops, so they need the unsigned assumption.

While patching it doesn't seem viable, it is possible to fix the issue by dispatching

Base.split_sign(x::BigInt) = abs(x), x < 0

nothing seems to fail now, but similarly might open up the code to future regression...

3. make use of the defined string to write integers

BigInt do have its string definition inside gmp.jl already. Previously I recall there might be some issue with the writing of unsigned etc, but the string of any built-in integer seems to print JSON-friendly formats now. Since JSON3.jl supports Julia v1.6 onwards, changing JSON3.write to use Base.string might be the cleanest in removing all the customization.

The big con is the potential slowdown in JSON writing speed, as we will create intermediate Base.StringVectors before writing them into JSON3.jl's buf. Not sure whether any experiment has been done on this? Intuitively the creation of intermediates when computing each string should add up significantly...

4. do away with split_sign

Since Base.split_sign look like it is meant for that narrow private usage inside Julia core, we can change our JSON3.write to:

function write(::NumberType, buf, pos, len, y::Integer; kw...)
    x = abs(y)
    if y < 0
        @writechar UInt8('-')
    end
    n = i = ndigits(x, base=10, pad=1)
    @check i
    while i > 0
        @inbounds buf[pos + i - 1] = 48 + rem(x, 10)
        x = oftype(x, div(x, 10))
        i -= 1
    end
    return buf, pos + n, len
end

Since this argument doesn't use bit ops and is correct for any positive integers regardless of type as long as they support the interface functions like abs, isless (which they should if split_sign is to work), I think it would be ok to keep typeof(x) === typeof(y) here.

quinnj commented 1 year ago

Thanks for the detailed report @Teo-ShaoWei! Sorry for the slow response; and I probably should have just let you do the fixing here! But I ended up going with your 2nd option here: https://github.com/quinnj/JSON3.jl/pull/244.

Teo-ShaoWei commented 1 year ago

Oooh nice thanks @quinnj! Haha no worries, as long as it got fixed 💪 I was trying to list the options also because I thought you might reach a different preference. Kudos much! 🎉