timgit closed this issue 6 years ago
Thanks Tim. I just took a quick look over the changelog. At a high level it sounds great. I'll take a more detailed look through this week.
@timgit I just saw that you changed "Switched jsonb type in job table to json". Any particular reason for this? (It means casting with ::jsonb when querying the data column manually.) I'm sure you had a reason though, so keen to understand your perspective.

@timgit I've taken a look through and converted our app in dev to use pg-boss v3. It ticks a lot of boxes now, and the retryBackoff for failed jobs is a nice bonus! We'll ship it once the database migrations are written. Questions/thoughts below:

Questions

- Are someJobName__state__completed__state__completed records expected to appear in the jobs table after an already completed job has exceeded the expiration time? (I think this happens when the someJobName__state__completed job records themselves expire.) UPDATE: This was caused by my onComplete handlers not returning a resolved promise, although the same seems to happen if an onComplete handler fails.
- Only one someJobName__state__completed job is picked up and run every newJobCheckInterval. This means if newJobCheckInterval: 1 is set and I log 50 jobs that quickly complete, it will take at minimum 50 seconds to process the onComplete handlers. Is it possible to configure this?

Issues

- If retryBackoff: true and retryLimit: 0 is set, it will still retry once (as if retryLimit was set to 1).
- boss.start() won't resolve until one round of the archive/purge watchers has completed (a full table scan of archive would be required to check if there's anything to delete, especially if there isn't an index on archivedOn).

Hope the feedback helps!
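The onComplete pitfall mentioned in the UPDATE above can be sketched with a small wrapper. ensurePromise is a hypothetical helper (not part of the pg-boss API) that guarantees the handler always hands a promise back, so a synchronous throw becomes a rejection instead of an unhandled error:

```javascript
// Hypothetical sketch: wrap an onComplete callback so it always returns a
// promise. A handler that returns undefined (or throws synchronously) can
// leave the completion job unresolved, as described above.
function ensurePromise(handler) {
  return job => Promise.resolve().then(() => handler(job));
}
```

You would then pass ensurePromise(myHandler) to boss.onComplete(...) instead of myHandler directly.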
Tim, thanks for the feedback.
I had an issue opened (#53) questioning the usage of jsonb since pg-boss doesn't require it. An advantage is what you mention: you can run arbitrary queries against the job table. That may also be considered a disadvantage if the querying load interferes with queue operations. Although, if you're going to go through the trouble of casting... it really defeats that purpose. In summary, I don't have strong feelings about it. I don't personally take advantage of querying the data column via jsonb, but I originally built it that way just in case because "maybe someone would want to do that". So, you would be that someone, and I would argue that your needs would probably outweigh the needs of those who would like to milk every ounce of performance gain acquired by switching over to json instead.
Noted. That's not intended. Good catch.
That's not intended. You should be able to use the exact same config that subscribe() allows, batching and all. Are you using teamSize or batchSize?
Good idea. I think I'll add the timestamps as well.
retryBackoff default

What's your use case for setting retryBackoff to true but then setting retryLimit to 0? That combination of options isn't valid. I decided to set the retryLimit to 1 if the backoff option was set, just to simplify the config.
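The rule described here can be sketched as a small normalization step. normalizeRetryOptions is a hypothetical illustration of the behavior, not a function in pg-boss:

```javascript
// Hypothetical sketch of the rule above: if retryBackoff is enabled but
// retryLimit is 0 (or unset), bump retryLimit to 1 so the combination of
// options stays valid.
function normalizeRetryOptions(options) {
  const out = { ...options };
  if (out.retryBackoff && !(out.retryLimit > 0)) {
    out.retryLimit = 1;
  }
  return out;
}
```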
start() blocking on initial housekeeping

I think you're right that housekeeping operations should not block the initial promise resolution on start(). I'll switch this over to async. Good catch on the archive table and the lack of indexes as well.
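The change described here can be sketched roughly as follows. init and housekeeping are stand-ins for pg-boss internals; this is an illustration of the idea, not the actual implementation:

```javascript
// Hypothetical sketch: start() resolves once essential init work finishes,
// while the first archive/purge housekeeping pass runs in the background
// instead of blocking the returned promise.
function makeStart(init, housekeeping) {
  return async function start() {
    await init();                     // schema checks etc. must finish first
    housekeeping().catch(() => {});   // fire and forget; don't block start()
    return 'started';
  };
}
```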
Thanks Tim! (It continues to feel like I'm speaking to myself.)
...jsonb...

At this stage we don't query the queue tables within our app, but occasionally do so manually when looking into an issue. Looks like the performance cost of jsonb is minimal though.
That's not intended. You should be able to use the exact same config that subscribe() allows, batching and all. Are you using teamSize or batchSize?

Currently we're using teamSize, as it meant minimal changes at this stage to switch from v2 to v3. Eventually we'll switch to batch. We're passing teamSize into subscribe(...), however we weren't passing any configuration into onComplete, so it looks like it might be my mistake. I'll confirm this.
retryBackoff default

This came up because we exposed retry to be configured via an environment variable so we can adjust it; however, regardless of its value we wanted the retryBackoff option enforced. It's not likely we'll ever have 0 retries configured, so this isn't important. I thought I'd raise it just in case it wasn't intentional.
Some other questions/thoughts I've had since:
Consider the situation where something goes wrong and some jobs fail completely, exceeding the retry limit, and have since moved into the archive. What's the easiest way, once the larger problem is resolved, to re-run those jobs? It's also worth mentioning that all completed jobs are treated equally. If successful jobs were archived more frequently than failed ones, that could make it easier to re-run failed jobs manually by updating their state value.
Capturing job errors in response

Currently if a promise is rejected due to an error raised (e.g. throw new Error("message")), this data isn't properly stored in the queue's completion data. We're working around this in our handler wrapper (which also does some other things, such as New Relic instrumentation) by having our own .catch(...), serializing if instanceof Error, and then re-rejecting.
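The workaround described here can be sketched as below. The underlying problem is that JSON.stringify(new Error('x')) produces '{}' because an Error's properties aren't enumerable. serializeError and wrapHandler are our own hypothetical names, not part of the pg-boss API:

```javascript
// Hypothetical sketch: convert a thrown Error into a plain object before
// re-rejecting, so the message and stack survive serialization into the
// queue's completion data.
function serializeError(err) {
  return err instanceof Error
    ? { name: err.name, message: err.message, stack: err.stack }
    : err;
}

function wrapHandler(handler) {
  return async job => {
    try {
      return await handler(job);
    } catch (err) {
      throw serializeError(err); // re-reject with a JSON-serializable value
    }
  };
}
```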
Reporting queue performance into AWS CloudWatch

We aren't doing this yet, but intend on using their API + the monitor-states event to report the state of the queue for monitoring/alerting/dashboards etc. I thought I'd ask if you've done the same and if you have any tips or guidance here.
Re-running jobs that have failed

Perhaps a republish(id) that attempts to find a completed job in either the job or the archive table?
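A rough sketch of the republish(id) idea, using the boss.db.executeSql access mentioned elsewhere in this thread. This is not an existing pg-boss API, and the table names assume pg-boss's default pgboss schema:

```javascript
// Hypothetical sketch: look up the original job's name and data in the job
// table (falling back to the archive table) and publish a fresh copy of it.
async function republish(boss, id) {
  const sql = `
    select name, data from pgboss.job where id = $1
    union all
    select name, data from pgboss.archive where id = $1
    limit 1`;
  const { rows } = await boss.db.executeSql(sql, [id]);
  if (rows.length === 0) {
    throw new Error(`job ${id} not found in job or archive`);
  }
  return boss.publish(rows[0].name, rows[0].data);
}
```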
Capturing job errors in response

I'll take a look at this. You're wanting the stack as well, I assume.
Reporting queue performance into AWS CloudWatch

I think you're hunting for ideas around "how do I know when things aren't healthy", and I guess this would need to be a combination of "what queue is this" and some sort of trend analysis to determine whether things are improving vs. getting worse. I don't currently have any interesting metrics or heuristics to share in this regard, but I'm interested in what you come up with.
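One way to wire the monitor-states event into a metrics client can be sketched as below. This assumes the event payload is an object of per-state counts; attachQueueMetrics and putMetric are hypothetical names standing in for your own glue code and metrics client (CloudWatch, StatsD, etc.):

```javascript
// Hypothetical sketch: forward numeric state counts from the monitor-states
// event to a metrics backend, one metric per job state.
function attachQueueMetrics(boss, putMetric) {
  boss.on('monitor-states', states => {
    for (const [state, count] of Object.entries(states)) {
      if (typeof count === 'number') {
        putMetric(`pgboss.${state}`, count);
      }
    }
  });
}
```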
@timgit Just discovered pg-boss and I love it. I tried out version 3 and was wondering if it's possible to have monitor-states pass back the data? It's a great overall status, but I'd like to display on a web page what is in the queue.
This is one advantage of having the queue as a table. Feel free to issue arbitrary queries against both the job and archive tables. Use your best judgment to decide how many queries to run against it, however, as read activity will have some impact on performance.
@timgit Thanks! To do that, would it be best to create my own pool and pass that to the pg-boss instance, then use it to query what I need? If so, do you have an example of doing this?
Also, not sure if this is intended, but should the singletonKey example below enter 123 into the database or the whole object? Right now I noticed it enters {singltonKey:'123'} into the database.

boss.publish('my-job', {}, {singletonKey: '123'}) // resolves a jobId
@jr14marquez, you can just use the pg module directly. pg-boss doesn't have an arbitrary query api.
I'm not sure what you mean by singletonKey. There's a text column in the job queue table specifically for this value, if that helps.
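Following the suggestion to use the pg module directly, a query for per-queue state counts might look like the sketch below. The SQL assumes pg-boss's default pgboss schema, and queueCounts() accepts anything exposing query(sql), such as a pg Pool:

```javascript
// Hypothetical sketch: read per-queue, per-state job counts straight from
// the job table for display on a status page.
const QUEUE_COUNTS_SQL = `
  select name, state, count(*)::int as count
    from pgboss.job
   group by name, state
   order by name, state`;

async function queueCounts(pool) {
  const { rows } = await pool.query(QUEUE_COUNTS_SQL);
  return rows;
}

// usage with the pg module:
//   const { Pool } = require('pg');
//   const pool = new Pool({ connectionString: process.env.DATABASE_URL });
//   const counts = await queueCounts(pool);
```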
@timclipsham I just published 3.0.0-beta4, which should address most of the major issues with the last beta that you pointed out.
I also added a migration to this release, so it's kind of an RC in that regard.
@timgit My mistake on the singletonKey; I misunderstood. Also, instead of using the pg module I just used boss.db.executeSql(query) to get what I needed out of the database. That worked perfectly.
Hey there! I just published a beta for 3.0.0 to npm and I'd like your feedback here or even on the PR #78 if you have specific concerns.
I haven't written a migration for this version yet, so don't start this up against any existing instances, as things will likely not go very well.
See the change log for details.