Closed. martin-huber closed this 5 years ago.
Merged, and also now part of the latest snapshot of Collector Core as well as HTTP and Filesystem Collectors.
Many thanks for your contribution!
Many thanks to you! Another change request (concerning SSL) will follow in the next few days ...
Cheers, Martin
------ Original message ------ From: "Pascal Essiembre" notifications@github.com To: "Norconex/collector-core" collector-core@noreply.github.com Cc: "Martin Huber" martin.huber@gmx.de; "Author" author@noreply.github.com Sent: 12.03.2019 05:37:37 Subject: Re: [Norconex/collector-core] Introduce "maxParallelCrawlers" option in collector-config (#25)
… in order to configure the size of the thread pool used for parallel crawler jobs: pass it from AbstractCollector to the already existing second AsyncJobGroup constructor.
We are using one collector with over 300 crawlers and are observing OutOfMemoryErrors when all of them run in parallel (which is currently the only option).
This configuration option allows finer control over the resources consumed. The behaviour is backward compatible: if the option is omitted, all crawlers are still executed in parallel.
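A minimal configuration sketch of the idea: the element name comes from this PR's title, but its exact placement within the collector configuration is an assumption, and the collector id and crawler entries are placeholders.

```xml
<!-- Hypothetical sketch: caps how many crawlers run concurrently.
     Element name taken from the PR title; placement is assumed. -->
<collector id="my-collector">

  <!-- With over 300 crawlers defined, limit parallel execution to 10
       to avoid OutOfMemoryErrors. Omitting this element keeps the
       previous behaviour: all crawlers run in parallel. -->
  <maxParallelCrawlers>10</maxParallelCrawlers>

  <crawlers>
    <!-- crawler definitions go here -->
  </crawlers>
</collector>
```

Internally, the value would simply bound the thread pool that executes the crawler jobs, so at most 10 of the 300 crawlers are active at any moment while the rest wait in the queue.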