Podcastindex-org / aggregator

Code, docs and discussion related to the Aggregator.
MIT License

Optimization of parallel http requests #4

Open dellagustin opened 3 years ago

dellagustin commented 3 years ago

Hi @daveajones, this is a follow-up to the Mastodon thread https://podcastindex.social/@dave/105212132065637683

As a software developer by trade, I get a bit nervous when there is no traceability from code to the discussions about the code, so I created this issue to link the repo with the discussion thread.

Some pointers from the code as it currently is:

Without a deep(er) analysis, I suspect that having too many parallel threads (used for the I/O on the HTTP requests) may cause some sockets to be inactive for more than the timeout of 30s, lowering .

I found this recommendation to set the thread pool size to the number of logical cores available to the machine: https://dev.to/johnjardincodes/increase-node-js-performance-with-libuv-thread-pool-5h10#:~:text=The%20recommendation%20is%20to%20set,actually%20result%20in%20poorer%20performance.

The recommendation is to set the UV_THREADPOOL_SIZE to the number of logical cores your machine is running. In my case I will set the thread pool size to 12.

It makes no sense setting the size to anything more than the logical cores your hardware is running and could actually result in poorer performance.
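
As a rough sketch of the quoted recommendation, the pool size can be derived from the logical core count and passed through the environment when the aggregator process is launched. The helper name `threadPoolEnv` is hypothetical and not part of the aggregator code. Note that libuv's pool (default size 4) serves file system calls, `dns.lookup`, and some crypto/zlib work, while socket I/O itself does not run on it, so whether this setting helps here would need to be measured.

```javascript
const os = require('os');

// Hypothetical helper: returns a copy of `baseEnv` with UV_THREADPOOL_SIZE
// set to the number of logical cores reported by the OS (at least 1).
function threadPoolEnv(baseEnv = process.env) {
  const logicalCores = Math.max(1, os.cpus().length);
  // Environment variable values are strings.
  return { ...baseEnv, UV_THREADPOOL_SIZE: String(logicalCores) };
}
```

The returned object could be passed as the `env` option of `child_process.spawn`, or the same value could be exported in the shell before starting Node. The variable is read when the pool is first used, so it should be in place before the process does any pooled work.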

Here is an interesting blog about handling asynchronous operations in parallel - https://itnext.io/node-js-handling-asynchronous-operations-in-parallel-69679dfae3fc

In my podcast aggregator, which runs client-side (in the browser), I limit the number of parallel open HTTP requests - https://github.com/podStation/podStation/blob/2bc006af81c61dac69e20f2d83b6c9b3986b9ad7/extension/background/entities/podcastManager.js#L129
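
To illustrate the idea of capping the number of requests in flight, here is a minimal sketch of a generic limiter. It is not the podStation implementation and not the aggregator's current code. The name `runWithConcurrencyLimit` and its result shape are assumptions made for this example.

```javascript
// Run async task functions with at most `limit` of them in flight at once.
// `tasks` is an array of functions that each return a promise.
// Resolves to one { status, value | reason } entry per task, in input order.
async function runWithConcurrencyLimit(tasks, limit) {
  const results = new Array(tasks.length);
  let next = 0; // index of the next task to start

  // Each worker repeatedly claims the next unstarted task until none remain.
  async function worker() {
    while (next < tasks.length) {
      const index = next++;
      try {
        results[index] = { status: 'fulfilled', value: await tasks[index]() };
      } catch (reason) {
        // A failed task is recorded and does not stop the other tasks.
        results[index] = { status: 'rejected', reason };
      }
    }
  }

  // Never start more workers than there are tasks, and always at least one.
  const workerCount = Math.max(1, Math.min(limit, tasks.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

Each task would wrap one feed request, for example `() => fetchFeed(url)`, where `fetchFeed` stands in for whatever HTTP call the aggregator uses. With a limit well below the number of feeds, no request is started until a slot is free, so a socket should not sit idle waiting for its turn while the 30s timeout runs.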