This fixes #101. Loading text and indexing reports are now split off into a worker process, which is managed using the worker-farm module. If the worker process crashes during GC, a new process is started and the job is retried. The worker process is also stopped and restarted every 1000 documents to limit overall memory usage.
Since the worker processes don't inherit command line arguments, I also changed them so they no longer load the config file themselves. Instead, the parent process passes the config object to the worker process.
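For reviewers unfamiliar with worker-farm, here is a rough sketch of how this kind of setup can be expressed. It is not the code in this PR: the worker module path `./index-worker`, the exported method name `indexDocument`, the single-worker limit, and the retry count of 3 are all assumptions made for illustration. Only the 1000-document recycle interval comes from the description above.

```javascript
// Sketch only: the module path, method name, and retry count are hypothetical.
const farmOptions = {
  maxConcurrentWorkers: 1, // assumption: a single worker process at a time
  maxCallsPerWorker: 1000, // worker-farm replaces a worker after this many calls
  maxRetries: 3,           // assumed value: requeue a call if its worker dies
  autoStart: true          // start the worker immediately, not on first call
};

// Start the farm. worker-farm is only required when the farm is started.
function startFarm() {
  const workerFarm = require('worker-farm');
  const workers = workerFarm(
    farmOptions,
    require.resolve('./index-worker'), // hypothetical worker module
    ['indexDocument']                  // hypothetical exported method
  );
  return { workers, stop: () => workerFarm.end(workers) };
}

// The parent sends the already-loaded config along with every job, so the
// worker never has to read command line arguments or the config file.
// The config must be serializable, because it crosses a process boundary.
function indexDocument(workers, config, docPath) {
  return new Promise((resolve, reject) => {
    workers.indexDocument({ config, docPath }, (err, result) => {
      if (err) reject(err);
      else resolve(result);
    });
  });
}
```

In this sketch the parent would call `startFarm()` once, call `indexDocument(farm.workers, config, path)` for each document, and call `farm.stop()` when the run is finished.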