riverqueue / river

Fast and reliable background jobs in Go
https://riverqueue.com
Mozilla Public License 2.0

Elector interval field value out of range #545

Closed by gottafixthat 3 weeks ago

gottafixthat commented 3 weeks ago

I was evaluating riverqueue for use in one of my projects and after attempting to run it for the first time, I'm getting the following error:

time=2024-08-21T19:45:12.744Z level=ERROR msg="Elector: Error attempting to elect" client_id=5e589f722171_2024_08_21T19_45_11_805369 err="pq: interval field value out of range: \"15000000000\"" num_errors=2 sleep_duration=1.917471608s

Postgres logs:

2024-08-21 19:07:10.854 UTC [70] ERROR: interval field value out of range: "15000000000"
2024-08-21 19:07:10.854 UTC [70] CONTEXT: unnamed portal parameter $2 = '...'
2024-08-21 19:07:10.854 UTC [70] STATEMENT: -- name: LeaderAttemptElect :execrows
INSERT INTO river_leader(leader_id, elected_at, expires_at) VALUES ($1, now(), now() + $2::interval) ON CONFLICT (name) DO NOTHING

I have validated that my migrations worked correctly.

Postgres v14.6 (I'm locked to this version at the moment)
Go v1.23.0
River v0.11.4

Any suggestions?

bgentry commented 3 weeks ago

Seems likely you’re using the dbsql driver? This appears to be a bug with that driver not handling duration types as db interval types correctly.

gottafixthat commented 3 weeks ago

I tried both, actually. I thought it might have been the dbsql driver at first so I refactored to use riverpgxv5 and it was the same result.

bgentry commented 3 weeks ago

Hmm, you must have at least gotten a different error when using pgx? The message above is specific to pq which isn’t used at all by the pgx driver.

gottafixthat commented 3 weeks ago

I'll verify this tomorrow morning. It's entirely possible that I'm mixing things in the wrong way.

brandur commented 3 weeks ago

It occurred to me while looking at this that 15000000000 is the elector's default 15 second interval in nanoseconds, so 15 * 1_000_000_000.

You can see here that when pgx encodes a duration, it encodes it as microseconds, which is the maximum precision that Postgres will accept:

https://github.com/jackc/pgx/blob/4f7e19d67df4d411ac1c19373390021a2f23aa07/pgtype/builtin_wrappers.go#L498-L514

For some reason your pgx is trying to write nanoseconds instead of microseconds. Do you have a custom encoding override somewhere?

gottafixthat commented 3 weeks ago

So it turns out I hadn't had enough coffee when I was testing this. I had set things up twice in different packages, and had forgotten to update the consumer's connection. Once I updated the consumer to correctly use the riverpgxv5 connector, everything worked.

Thanks for your quick responses. RiverQueue is a very clean queue system to configure and use.

bgentry commented 3 weeks ago

@brandur hmm, wondering if there still might be a dbsql-specific issue here given that the "fix" was to use the pgx driver 🤔

brandur commented 2 weeks ago

It may be that River doesn't function when using lib/pq. I'm going to wait and see if we get any more bug reports about that before actioning it, though. It's going to be a yak shave, and lib/pq's been deprecated for so long that I suspect a lot of people have since dropped it.