In this approach, each purl URL is handled in a separate task that retrieves the Cocina and extracts the title (label), image filename, and fund name. With the number of active tasks initially throttled to 5, retrieving and extracting 817 druids took 22:25 minutes. I will do a second run with active DAG runs set to 10 to see if that improves performance.
The second run, with 10 active tasks and running on Stanford's network, brought the time down to 6:41 minutes. Because both the concurrency limit and the network changed between runs, the speedup cannot be attributed to the higher limit alone.
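
For reviewers who want to see the shape of the throttled, one-task-per-URL pattern outside the DAG, here is a minimal standard-library sketch. It is not the code in this PR: a thread pool stands in for the scheduler's active-task limit, and `run_throttled`, `extract_fields`, `fetch_cocina`, and the `image_filename` and `fund_name` keys are hypothetical names chosen for illustration. Only `label` comes from the description above.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def extract_fields(cocina: dict) -> dict:
    """Pull the three values of interest out of one Cocina document.

    Only `label` is named in the PR description; the other two keys are
    hypothetical placeholders for wherever those values actually live.
    """
    return {
        "title": cocina.get("label"),
        "image_filename": cocina.get("image_filename"),  # hypothetical key
        "fund_name": cocina.get("fund_name"),  # hypothetical key
    }


def run_throttled(
    purl_urls: Iterable[str],
    fetch_cocina: Callable[[str], dict],
    max_active: int = 5,
) -> list:
    """Handle each purl URL as its own unit of work, at most `max_active` at once."""

    def process_one(url: str) -> dict:
        # One unit of work: retrieve the Cocina, then extract the fields.
        return extract_fields(fetch_cocina(url))

    # max_workers plays the role of the active-task throttle (5, then 10).
    with ThreadPoolExecutor(max_workers=max_active) as pool:
        # pool.map returns results in the same order as the input URLs.
        return list(pool.map(process_one, purl_urls))
```

Passing the fetcher in as a callable keeps the sketch free of network code, so the effect of changing `max_active` can be timed against a stub before trying it against real purl URLs.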
Fixes #1177