Hi, I would like to know how I can configure the news-please options to optimize the crawling and extraction process. For example, assuming we have a machine with 4 CPUs (2 threads per CPU) and 20 websites to crawl, what are the optimal values for `number_of_parallel_daemons`, `number_of_parallel_crawlers`, and `CONCURRENT_REQUESTS_PER_DOMAIN`?
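For context, these three settings live in news-please's `config.cfg` (the first two under `[Crawler]`, the last under `[Scrapy]`, which passes settings through to Scrapy). The values below are only an illustrative starting point for the 4-CPU / 20-site scenario described above, not official recommendations:

```ini
# config.cfg -- illustrative values only, assuming 4 CPUs (8 hardware
# threads) and 20 target sites; tune against actual throughput.
[Crawler]
# Daemons re-crawl sites on a schedule; crawling is I/O-bound, so a
# small number is usually sufficient.
number_of_parallel_daemons = 2
# Roughly one crawler per hardware thread (4 CPUs x 2 threads = 8).
number_of_parallel_crawlers = 8

[Scrapy]
# Keep per-domain concurrency low to stay polite and avoid bans.
CONCURRENT_REQUESTS_PER_DOMAIN = 2
```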
Versions (please complete the following information):
- OS: Debian 20.04
- Python Version: Python 3.8
- news-please Version: latest
Intent (optional; we'll use this info to prioritize upcoming tasks to work on)