NOAA-OWP / t-route

Tree based hydrologic and hydraulic routing

Fix file-read performance with parallel call #713

Open jameshalgren opened 6 months ago

jameshalgren commented 6 months ago

Running T-Route from Ngen inputs takes far too long due to the back-and-forth reading and writing of qlateral inputs.

There are reasonably sane explanations for how things got that way.

This line seems like a ripe option for improvement with a relatively focused fix: https://github.com/NOAA-OWP/t-route/blame/07d511bb4f93aed23b9031d0b464b17bdf8f2060/src/troute-network/troute/AbstractNetwork.py#L870C9-L870C9
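For context, the "parallel call" in the title could be as simple as fanning the serial per-file reads out over a thread pool. A minimal sketch, assuming a hypothetical `read_one_file` helper and `qlat_files` list (not the actual AbstractNetwork.py code):

```python
# Hedged sketch of a focused fix: fan the serial per-file reads out over a
# thread pool. read_one_file and qlat_files are hypothetical stand-ins.
from concurrent.futures import ThreadPoolExecutor

import pandas as pd

def read_one_file(path):
    # Placeholder for whatever per-file parsing the real loop body does.
    return pd.read_csv(path)

def read_all(qlat_files, max_workers=8):
    # Threads help here because the loop is I/O-bound, not CPU-bound.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        frames = list(pool.map(read_one_file, qlat_files))
    return pd.concat(frames, ignore_index=True)
```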

PR forthcoming.

jameshalgren commented 6 months ago

@JoshCu

JoshCu commented 6 months ago

I started looking into this and have a fix that is possibly too invasive. I'm looking into some other, less complex modifications that maintain most of the speedup without the additional complexity.

https://github.com/JoshCu/t-route/tree/optimise_parquet

The bottleneck

Currently there are number_of_catchments * number_of_timesteps file I/O operations, so ~51k file writes for the ngiab demo data of ~220 catchments over 240 timesteps.
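As a rough check on that arithmetic (illustrative round numbers, not measured counts):

```python
# Back-of-the-envelope scale of the per-catchment, per-timestep I/O pattern.
catchments = 220
timesteps = 240
file_ops = catchments * timesteps
print(file_ops)  # 52800 -- on the order of the ~51k writes quoted above
```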

The required functionality

When building the qlat array before the routing calculations are run, the nexus output files from ngen are one file per catchment, each containing timestamped data. They need to be pivoted to one file per timestep, containing the data for every catchment at that timestep. A minimal sketch of that pivot follows.
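This is a hedged illustration only, assuming hypothetical per-catchment file names (nex-*_output.csv) and columns; the real code paths in troute-network may differ:

```python
# Hedged sketch of the pivot, assuming hypothetical per-catchment ngen output
# files named nex-*_output.csv with (timestep, timestamp, flow) columns.
from pathlib import Path

import pandas as pd

nexus_dir = Path("ngen_output")  # assumed location of the ngen nexus outputs

# One file per catchment: read each and tag its rows with the catchment id.
frames = []
for f in sorted(nexus_dir.glob("nex-*_output.csv")):
    df = pd.read_csv(f, names=["timestep", "timestamp", "flow"])
    df["catchment"] = f.stem.split("_")[0]  # e.g. "nex-123"
    frames.append(df)

combined = pd.concat(frames, ignore_index=True)

# Pivot: rows become timestamps, columns become catchments.
qlat = combined.pivot(index="timestamp", columns="catchment", values="flow")

# One file per timestep, each holding every catchment's value at that time.
for i, (_, row) in enumerate(qlat.iterrows()):
    row.to_frame(name="flow").to_parquet(f"qlat_{i:04d}.parquet")
```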

Considerations for the fix

Room for optimization

The fix I have addresses both of those issues, but the former uses Dask and the latter adds a lot of extra code to handle grouped-up parquet files. I'm going to try out a lighter version of the change to get some performance improvements quickly without getting into the parquet-file weeds. The Dask portion is sketched below.
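For reference, the Dask half amounts to reading the per-catchment files in parallel. A minimal sketch under assumed file names and columns (not the actual code on the optimise_parquet branch):

```python
# Minimal sketch of parallelizing the per-catchment reads with Dask; the glob
# pattern and column names are assumptions, not the branch's actual code.
import dask.dataframe as dd

# One lazy, partitioned dataframe over all per-catchment files; Dask schedules
# the reads across workers instead of looping over files serially.
ddf = dd.read_csv(
    "ngen_output/nex-*_output.csv",
    names=["timestep", "timestamp", "flow"],
    include_path_column="catchment",  # keep each row's source file as a column
)

# Materialize in parallel, then pivot to one column per catchment.
combined = ddf.compute()
qlat = combined.pivot_table(index="timestamp", columns="catchment", values="flow")
```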

Parquet files

From some cursory research, it looks like optimal parquet file sizes are anywhere from ~200 MB to 1 GB.
Currently, with one file per timestep, the largest a file would ever be (worst case, covering every single catchment in the United States) is 2.6 MB. It could be a good idea to make each parquet file cover a range of timesteps instead, which would allow a massive reduction in file I/O; a sketch of that batching follows. The current code looks like it has fragments designed to handle reading multiple parquet files to get the timesteps needed, but nothing end to end yet.
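As a hedged illustration of that idea (hypothetical file and column names, not existing t-route conventions), batching timesteps into one file and filtering on read could look like:

```python
# Sketch of batching a range of timesteps into one parquet file and reading a
# subset back; file and column names are hypothetical.
import pandas as pd

# Long-form qlat data: one row per (timestep, catchment) pair.
qlat_long = pd.DataFrame({
    "timestep": [0, 0, 1, 1],
    "catchment": ["cat-1", "cat-2", "cat-1", "cat-2"],
    "flow": [0.1, 0.2, 0.15, 0.25],
})

# Write: one file covering a whole range of timesteps instead of one per step.
qlat_long.to_parquet("qlat_t0000-t0001.parquet", index=False)

# Read: the pyarrow engine pushes this filter down to the row groups, so a
# reader needing only a slice of timesteps doesn't load the whole range.
subset = pd.read_parquet(
    "qlat_t0000-t0001.parquet",
    filters=[("timestep", "==", 1)],
)
```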