In split_reach/extracter/extract_refs_task.py we set pool_map = map for use in yield_structured_references. However, if we used Pool from multiprocessing instead, i.e.
pool = Pool(num_workers)
pool_map = pool.map
we could speed up this task.
However, in the past we found that using num_workers > 1 actually slowed things down.
So it is worth investigating how this behaves now, to see whether it's worth implementing.
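
A minimal sketch of how the comparison could be timed. It assumes the per-document work can be represented by a picklable top-level function; fake_extract, run_with and the chunksize values are hypothetical and not taken from extract_refs_task.py. One plausible cause of the earlier slowdown is the cost of pickling arguments and results between processes, which a larger chunksize can reduce, so the sketch exposes that parameter too.

```python
import time
from multiprocessing import Pool


def fake_extract(doc):
    # Hypothetical stand-in for the real per-document extraction work.
    return sum(ord(c) for c in doc) % 97


def run_with(num_workers, docs, chunksize=1):
    """Return (results, elapsed_seconds) for the given worker count."""
    start = time.perf_counter()
    if num_workers <= 1:
        # Current behaviour: pool_map = map (serial, no pickling overhead).
        results = list(map(fake_extract, docs))
    else:
        # Proposed behaviour: pool_map = pool.map across worker processes.
        with Pool(num_workers) as pool:
            results = pool.map(fake_extract, docs, chunksize)
    return results, time.perf_counter() - start


if __name__ == "__main__":
    docs = ["document %d" % i for i in range(10_000)]
    for n in (1, 2, 4):
        _, elapsed = run_with(n, docs, chunksize=256)
        print(f"num_workers={n}: {elapsed:.3f}s")
```

If the real per-document work is cheap, the serial path may well still win; running this against actual documents with the real extraction function in place of fake_extract would show whether that is the case.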