neume-network / extraction-worker

Async worker_thread component for maximizing concurrent data retrieval and processing.
GNU General Public License v3.0

It's difficult to understand if a very long timeout in `fetch` could stall the extraction worker's progress #23

Closed TimDaub closed 2 years ago

TimDaub commented 2 years ago

/cc @il3ven

TimDaub commented 2 years ago

Debugging journal:

2022-06-29T09:44:40.703Z neume-network-extraction-worker:worker {"successRate":0.9954193093727978,"peak":29703,"average":45504.534531360114,"total":5676}
TimDaub commented 2 years ago

Found that there's potentially a problem with better-queue's maxTimeout: https://github.com/diamondio/better-queue/issues/81

TimDaub commented 2 years ago
il3ven commented 2 years ago

I ran an experiment. I pushed 6 tasks to the queue, with the second task set up to take a very long time. I found that the second task did not stall the queue as long as the concurrency was greater than 1.

This makes sense if we picture it as follows: with a concurrency of two, we have two workers that can execute our tasks in parallel. If one of the workers gets blocked by a long task, the other worker can keep executing the remaining tasks.

https://user-images.githubusercontent.com/4337699/177052097-f60d7970-a323-42f1-a8ec-89e651b297e2.mov
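
The experiment above can be approximated without timers or better-queue at all. The following is only a rough sketch: `simulateQueue` is a hypothetical helper, the durations are made up, and it assumes that each task goes to whichever worker becomes free first. It returns the time at which each task finishes.

```javascript
// Simulate a FIFO queue with a fixed number of workers.
// Each task is handed to whichever worker becomes free first.
// Returns the finish time (in ms) of every task, in submission order.
function simulateQueue(durations, concurrency) {
  const freeAt = new Array(concurrency).fill(0);
  return durations.map((duration) => {
    const worker = freeAt.indexOf(Math.min(...freeAt));
    freeAt[worker] += duration;
    return freeAt[worker];
  });
}

// Six tasks; the second one is very slow (all durations are assumptions, in ms).
const durations = [100, 60000, 100, 100, 100, 100];

console.log(simulateQueue(durations, 1));
// [ 100, 60100, 60200, 60300, 60400, 60500 ]
console.log(simulateQueue(durations, 2));
// [ 100, 60000, 200, 300, 400, 500 ]
```

With a concurrency of 1, every task after the slow one waits for it. With a concurrency of 2, only the worker that picked up the slow task is held back.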

TimDaub commented 2 years ago

> If one of the worker gets blocked due to a long task the other worker can keep on executing the tasks.

Yes, but I'm outlining a different problem: we potentially have a concurrency of e.g. 200 parallel workers, and while the non-problematic tasks never block the queue, over time more than 200 slow tasks can accumulate and clog it up. Think about it this way: we have 20000 tasks to execute, and only 200 of them take e.g. 5 minutes to clear. If those 200 bad tasks are spread among the 20000 good ones, there is a good chance that the queue gets clogged and does not run at full concurrency all the time. Hence, allowing timeouts to be configured so that uneconomic tasks are ended sooner can be a good thing.
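
To put rough numbers on this scenario, here is a small simulation sketch. It does not use better-queue. `makespan` is a hypothetical helper, and the 1 second duration of a good task, the 10 second timeout, and the reading that the 200 slow tasks are part of the 20000 are all assumptions chosen for illustration. It compares how long the whole batch takes with and without a per-task timeout.

```javascript
// Time (in ms) until the last task finishes, given per-task durations,
// a number of parallel workers and an optional per-task timeout.
function makespan(durations, concurrency, timeout = Infinity) {
  const freeAt = new Array(concurrency).fill(0);
  for (const duration of durations) {
    // The next task goes to whichever worker becomes free first.
    const worker = freeAt.indexOf(Math.min(...freeAt));
    // A task that exceeds the timeout is abandoned when the timeout fires.
    freeAt[worker] += Math.min(duration, timeout);
  }
  return Math.max(...freeAt);
}

const FAST = 1000; // assumed duration of a good task
const SLOW = 5 * 60 * 1000; // 5 minutes, as in the comment above

// 20000 tasks, every 100th one is slow, which gives 200 slow tasks in total.
const durations = Array.from({ length: 20000 }, (_, i) =>
  i % 100 === 0 ? SLOW : FAST
);

const withoutTimeout = makespan(durations, 200);
const withTimeout = makespan(durations, 200, 10000);
console.log({ withoutTimeout, withTimeout });
```

Under these assumptions the slow tasks make up about three quarters of the total work, even though they are only 1% of the tasks. That is why capping them shortens the run so much.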