contribsys / faktory

Language-agnostic persistent background job server
https://contribsys.com/faktory/
Other
5.71k stars 228 forks source link

Int64 Job Args Lose Precision #395

Open TheRedPanda17 opened 2 years ago

TheRedPanda17 commented 2 years ago

I am experiencing an issue when passing large int64 arguments to a job. I try passing 3889108265204450030 for example, and end up receiving 3889108265204450000.

After looking into it, I think this is caused by using the default json decoder when unmarshaling a job (I believe it's here: https://github.com/contribsys/faktory/blob/main/client/client.go#L345). In order to keep precision on int64, it would have to be something like this instead:

var job Job
decoder := json.NewDecoder(bytes.NewBuffer(data))
decoder.UseNumber()
decoder.Decode(&job)

Is this a change you would be interested in? I plan to work around this by encoding and decoding it myself and passing faktory a string, but it might be a helpful thing to add.

I am using Faktory v1.6.0 with faktory_worker_go.

mperham commented 2 years ago

I’m pretty sure this would break a lot of apps. Maybe for Faktory 2.0.

mperham commented 2 years ago

JSON is JavaScript, with all of its limitations. Notably integers are limited to 53-bits because in JS all numbers are Float and a 64-bit IEEE float value can only store a 53-bit integer.

If you send args like { "args": [3889108265204450030] }, Faktory will pass your arg as a truncated float64. If I make this change, it will pass it as a json.Number and the app code will need to call Float64() or Int64() to get that same value. It uses a string value to send the actual value so there's no truncation.

But that's a breaking change for all Faktory workers. The easiest thing to do is have your app code send and receive string values directly if you know you are going to using large integer arguments.

TheRedPanda17 commented 4 months ago

I ran into this again. Would you be open to a changed wrapped in a config that would allow people to use the json.Number? It feels weird to continually turn all int64s into strings in every single client I write that has bigints. It would be nice to fix things at the source.

mperham commented 4 months ago

@TheRedPanda17 My POV is that Faktory is meant to be polyglot. json.Number is Go-specific afaik. How do you think we should handle that tension?

TheRedPanda17 commented 4 months ago

I understand that json.Number is Go specific. It's more a means to an end, the end being that we don't want int64s to lose precision. No language that supports int64 will fully work as it is currently implemented. Is there way to be language agnostic but still support this primitive data type? We've ran into this issue at our organization with two projects now, one in Python and one in Go. Many of our services use int64s for IDs, and it feels weird that we have to handle serialization ourselves in every service that uses Faktory when it could be fixed here.

mperham commented 4 months ago

More context: https://www.sobyte.net/post/2022-01/golang-json/

mperham commented 4 months ago

According to that blog post, Go does support 64-bit ints in its JSON output, but other parsers might truncate numbers to 53-bits to keep perfect compatibility with browser JS engines.

mperham commented 4 months ago

I've also verified that if you send a huge number, Go will unmarshal all 64-bits.

package main

import (
        "encoding/json"
        "fmt"
        "math"
)

func b2s(b []byte, e error) (string, error) {
        return string(b), e
}

type S struct {
        N int64
        Foo int
}

func main() {
        var i int64 = math.MaxInt64
        var j int = math.MaxInt
        fmt.Println(b2s(json.Marshal(i)))

        strct := S{N: i, Foo: j}
        fmt.Printf("%+v\n", strct)
        bytes, err := json.Marshal(strct)
        fmt.Println(b2s(bytes, err))

        var result S
        err = json.Unmarshal(bytes, &result)
        fmt.Printf("%+v\n", result)
        fmt.Println(err)
}

Roundtrip:

% go run testnum.go
9223372036854775807 <nil>
{N:9223372036854775807 Foo:9223372036854775807}
{"N":9223372036854775807,"Foo":9223372036854775807} <nil>
{N:9223372036854775807 Foo:9223372036854775807}
<nil>
mperham commented 4 months ago

I would look to see if your client-side JSON library has an option to support full-sized numbers. It seems like your library is truncating the numbers to be 53-bit compatible.

TheRedPanda17 commented 4 months ago

The Python client we are using is https://github.com/cdrx/faktory_worker_python.

It just uses the default json.dumps function which does not lose precision: https://github.com/cdrx/faktory_worker_python/blob/master/faktory/_proto.py#L320.

I put breakpoints in the code and ran faktory locally to verify that our payload is correct before we send it here: https://github.com/cdrx/faktory_worker_python/blob/master/faktory/_proto.py#L323

And that the payload has lost precision by the time it comes back from faktory here: https://github.com/cdrx/faktory_worker_python/blob/master/faktory/_proto.py#L305

Unless I'm missing some step in between, this would seem to indicate the faktory server is deserializing it and causing a loss of precision. I am almost certain this is the case because when I serialize it before passing it to the client so that Faktory treats the whole thing as a string, no precision is lost.

That's the issue we want to solve. We don't want to pass payloads to faktory and receive them back altered to be incorrect. I'd be happy to change what we're doing if we're doing something wrong, but it seems like this is a bug in Faktory, not an issue of being language agnostic, etc.

mperham commented 4 months ago

The issue is that Go's JSON parser unmarshals numbers as Float64 because it doesn't know or try to be smart. When it fills the Args array, the Args are typed as interface{} so Go doesn't know their proper type. When it marshals the Args again to Redis, it marshals the array elements as Floats.

When Go marshals 9223372036854775807 as a Float, it emits 9223372036854776000.

In-memory: {F:9.223372036854776e+18 N:9223372036854775807 Foo:9223372036854775807 Array:[9223372036854775807 9223372036854775807]} Marshalled: {"F":9223372036854776000,"N":9223372036854775807,"Foo":9223372036854775807,"Array":[9223372036854775807,9223372036854775807]}

I don't know how to fix this but I will think about it.

TheRedPanda17 commented 4 months ago

Thank you for looking into it! Can we re-open the issue for visibility?

mperham commented 4 months ago

I'm trying to verify that I've fixed the issue but having trouble even reproducing this. With both main and v1.6.0 I can push 1152921504606846976 and fetch gives me 1152921504606846976.000000 from the round trip in the Go client. Admittedly there's an unwanted int -> float conversion but that's well documented Go/JSON behavior. I'm not sure what to do if I can't reproduce the behavior in my Go test suite.

client.NewJob("MikeJob", 0.0000001, 1152921504606846976)