NationalMuseumAustralia / Collection-API

The public web API of the National Museum of Australia

Refactoring of the data ingestion pipeline #148

Closed (f27wood closed this issue 10 months ago)

f27wood commented 4 years ago

Logging this in GitHub so we can track it.

In summary, this is to optimise the ETL pipeline so it does not take as long to process. Agreed to spend 15-20 hours on this.

Conal-Tuohy commented 4 years ago

I think you meant to write "... so that it does not consume as much memory during the process"? I'm honestly not sure what difference this will make to the total elapsed time (it may speed things up, but I'm not counting on it).