timgit / pg-boss

Queueing jobs in Postgres from Node.js like a boss
MIT License

Feature request: bulkSend #310

Closed tcoats closed 2 weeks ago

tcoats commented 2 years ago

We're implementing boss in a high-throughput scenario where performance is essential. We use batchSize and teamSize to process multiple messages efficiently, but the send process handles only a single message at a time. We could execute manual SQL inserts; however, we'd then lose singleton functionality.

Can we implement createJob in such a way that it supports bulk operations? This may require more logic in SQL to handle features like nextSlot seamlessly.

timgit commented 2 years ago

Which singleton use case is the most important? If you're referring to direct table inserts in SQL, and the use case were time-based throttling, for example, you could squash these jobs pre-insert into time buckets via functions like date_trunc() in Postgres.
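To illustrate the suggestion above, here is a minimal sketch of squashing jobs into time buckets before insertion, emulating Postgres's date_trunc('minute', ...) in plain JavaScript. The job shape and the squashByTimeBucket name are illustrative, not part of the pg-boss API:

```javascript
// Truncate a timestamp to the start of its minute,
// like date_trunc('minute', ...) in Postgres.
function truncToMinute(date) {
  const d = new Date(date);
  d.setSeconds(0, 0); // drop seconds and milliseconds
  return d.toISOString();
}

// Keep only one job per (name, time bucket), mirroring time-based throttling.
// Later jobs in the same bucket replace earlier ones.
function squashByTimeBucket(jobs) {
  const buckets = new Map();
  for (const job of jobs) {
    buckets.set(`${job.name}:${truncToMinute(job.createdAt)}`, job);
  }
  return [...buckets.values()];
}

const jobs = [
  { name: 'sync', data: { v: 1 }, createdAt: '2023-01-01T12:00:05Z' },
  { name: 'sync', data: { v: 2 }, createdAt: '2023-01-01T12:00:45Z' },
  { name: 'sync', data: { v: 3 }, createdAt: '2023-01-01T12:01:10Z' },
];
console.log(squashByTimeBucket(jobs).length); // 2: the first two collapse into one bucket
```

The squashed array could then be fed to a single bulk insert, so the throttling happens client-side before the round trip.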

Also, what is your average batch size for insertion? Just thinking about how much of a benefit we'd get from optimizing writes.

tcoats commented 2 years ago

We're syncing product data to 37 separate systems. If 5 products change we're inserting 185 jobs. Sometimes 4000 products change.

Time-based singletons are not important for our use cases; we'd like to de-dupe repeat changes and execute only the last one. We could write an insert script for that style of bulk insert. It could also handle the upsert functionality we'd like, where the last job's data is the one that ends up on the queue.
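The last-one-wins de-dupe described above can be sketched in a few lines of JavaScript before the bulk insert. The (productId, system) key and field names are assumptions for illustration:

```javascript
// Keep only the last change per (product, target system), so repeated
// changes to the same product collapse into a single job.
function dedupeLastWins(changes) {
  const latest = new Map();
  for (const change of changes) {
    // Later entries overwrite earlier ones for the same key (last one wins).
    latest.set(`${change.productId}:${change.system}`, change);
  }
  return [...latest.values()];
}

const changes = [
  { productId: 'p1', system: 'erp', data: { price: 10 } },
  { productId: 'p1', system: 'erp', data: { price: 12 } }, // supersedes the first
  { productId: 'p2', system: 'erp', data: { price: 7 } },
];
const jobs = dedupeLastWins(changes);
console.log(jobs.length); // 2
console.log(jobs[0].data.price); // 12: the later change won
```

Because a Map preserves the insertion order of first-seen keys while overwriting their values, this keeps the newest payload per key in one pass.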

We can certainly operate outside the library.

timgit commented 2 years ago

Have you tried boss.insert([jobs])? I realize it won't be able to offer all the features that boss.send() offers, such as debouncing, but it may resolve at least some of your concerns with batching.
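As a sketch of that suggestion: map the change records into the { name, data } job shape that boss.insert() accepts, so the whole batch goes in with one call instead of one send() per job. The 'product-sync' queue name and the change fields are illustrative:

```javascript
// Build an array of jobs in the shape boss.insert() accepts.
function toInsertJobs(changes) {
  return changes.map(c => ({ name: 'product-sync', data: c }));
}

const jobs = toInsertJobs([
  { productId: 'p1', system: 'erp' },
  { productId: 'p1', system: 'crm' },
]);
console.log(jobs.length); // 2

// Then, with a started PgBoss instance (requires a Postgres connection):
// await boss.insert(jobs);
```

For the 5-products-times-37-systems case above, this turns 185 individual send() round trips into a single insert call.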