ECMWFCode4Earth / vAirify

code repository for 2024 Code for Earth project #16
MIT License

Forecast ETL performance improvement - parallel interpolation #41

Closed amehta-scottlogic closed 4 weeks ago

amehta-scottlogic commented 1 month ago

Seb's findings:

> While looking at https://github.com/ECMWFCode4Earth/vAirify/issues/27 I found a potential way to reduce the processing time of the forecast "transformation". Currently, this is done by using ThreadPoolExecutor to loop over all cities, which means a separate interpolation is run for each city. But xarray provides a way to interpolate all cities in parallel in a single call (see https://docs.xarray.dev/en/stable/user-guide/interpolation.html#advanced-interpolation). Implementing this would require quite a few code changes, so I first made a small proof-of-concept script comparing both methods.
>
> On my computer, this reduces processing time from 19 to 0.6 seconds. It would probably be good to have a quick chat about how this could best be implemented, but I think it is worth it.

PoC here: https://github.com/ECMWFCode4Earth/vAirify/blob/parallel-int/air-quality-backend/scripts/test-parallel-interp.py
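
For readers without access to the branch, here is a minimal sketch of the two approaches described above. It is not the PoC script itself: the synthetic grid, the variable name `pm2p5`, the coordinate names `latitude`/`longitude`, the city list and both helper functions are assumptions made for illustration. It relies on xarray's `interp` (which needs scipy installed) and on passing `DataArray` indexers that share a new `city` dimension, as described in the advanced interpolation section of the xarray docs.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for a forecast grid (assumed names, not the real dataset).
lat = np.linspace(-90.0, 90.0, 46)
lon = np.linspace(0.0, 356.0, 90)
rng = np.random.default_rng(0)
grid = xr.DataArray(
    rng.random((lat.size, lon.size)),
    coords={"latitude": lat, "longitude": lon},
    dims=("latitude", "longitude"),
    name="pm2p5",
)

# Hypothetical cities: (name, latitude, longitude in the 0..360 convention).
cities = [
    ("Paris", 48.86, 2.35),
    ("Nairobi", -1.29, 36.82),
    ("Tokyo", 35.68, 139.69),
]


def interp_per_city(data, city_list):
    """One interp call per city, mirroring the loop-based approach."""
    return np.array(
        [float(data.interp(latitude=la, longitude=lo)) for _, la, lo in city_list]
    )


def interp_all_cities(data, city_list):
    """Single interp call: indexers share a 'city' dimension, so xarray
    interpolates pointwise at each (lat, lon) pair instead of on a grid."""
    names = [c[0] for c in city_list]
    lats = xr.DataArray([c[1] for c in city_list], dims="city", coords={"city": names})
    lons = xr.DataArray([c[2] for c in city_list], dims="city", coords={"city": names})
    return data.interp(latitude=lats, longitude=lons)


looped = interp_per_city(grid, cities)
pointwise = interp_all_cities(grid, cities)
```

The result of the single call has one value per city along the `city` dimension, so it can be selected by name with `pointwise.sel(city="Tokyo")`. Any speed-up on the real forecast data would need to be measured there; this sketch only shows that both routes give the same values.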

Acceptance Criteria

Test Checklist:

mwalker-scottlogic commented 1 month ago

Test Analysis

Test Case required:

Test Charter:

Regression candidate:

mwalker-scottlogic commented 1 month ago

Previous exploratory testing notes from 28/05

Image

Test notes from 04/06

Image