nannau / nc2pt

Serializing NetCDF files for efficient use in deep learning pipelines.
GNU General Public License v3.0

Add Validation Steps to Time Select #11

Closed nannau closed 11 months ago

nannau commented 11 months ago

Recently I ran into a bug where the start times were offset between precip and the other variables.

To catch this earlier in the pipeline, I added stricter checks to the datetime selection.
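As a rough illustration of the kind of check described above, here is a minimal sketch that compares the timestamps of each variable against a reference variable and fails loudly on an offset. The function name `check_time_alignment`, the variable names `tas`, `uas`, and `pr`, and the use of plain `datetime` lists are assumptions made for this example, not the checks nc2pt actually implements.

```python
from datetime import datetime, timedelta


def check_time_alignment(times_by_var):
    """Raise ValueError unless every variable has identical timestamps.

    `times_by_var` maps a (hypothetical) variable name to its list of datetimes.
    """
    names = list(times_by_var)
    if not names:
        return
    ref_name = names[0]
    ref = times_by_var[ref_name]
    for name in names[1:]:
        times = times_by_var[name]
        if len(times) != len(ref):
            raise ValueError(
                f"{name} has {len(times)} time steps, {ref_name} has {len(ref)}"
            )
        if times and times[0] != ref[0]:
            raise ValueError(
                f"{name} starts at {times[0]}, but {ref_name} starts at {ref[0]}"
            )
        if times != ref:
            raise ValueError(f"{name} and {ref_name} differ after the first step")


# Illustrative data: four hourly steps, plus a copy shifted by one hour.
start = datetime(2000, 1, 1)
hourly = [start + timedelta(hours=h) for h in range(4)]
shifted = [t + timedelta(hours=1) for t in hourly]

check_time_alignment({"tas": hourly, "uas": hourly})  # aligned: no error

try:
    check_time_alignment({"tas": hourly, "pr": shifted})
except ValueError as err:
    print(err)  # reports the offset start time
```

Raising an error instead of silently trimming to the overlapping range makes an offset like the one described above visible at preprocessing time.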

This also resolves #5 as part of the preprocessing pipeline.

Another issue I ran into was that torch.randperm was not producing reproducible permutations, i.e. the seed function wasn't behaving as I expected. I switched to an RNG based on a numpy RandomState object, which seems to have solved the problem.

Basically, batches of dates were not identical across variables. This change fixes that.
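A minimal sketch of the idea, assuming each variable is shuffled independently with the same seed: building a fresh `numpy.random.RandomState` from that seed for every call yields the same permutation each time, so the date batches line up across variables. The helper name `shuffled_indices` and the variable names are hypothetical and are not taken from the nc2pt code.

```python
import numpy as np


def shuffled_indices(n, seed):
    """Return a permutation of range(n) that depends only on `seed`."""
    # A fresh RandomState per call keeps the result independent of any
    # global RNG state that other code may have advanced.
    rng = np.random.RandomState(seed)
    return rng.permutation(n)


# Hypothetical variable names; each gets its own call with the same seed.
variables = ["pr", "tas", "uas"]
orders = {name: shuffled_indices(10, seed=0) for name in variables}

# Every variable sees the same ordering of time indices.
assert all(np.array_equal(orders["pr"], order) for order in orders.values())
```

Keeping the generator local to the function, instead of seeding a global one, means the ordering does not depend on how many random draws happened earlier in the run.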

However, properly testing the tools function of the nc2pt pipeline remains a milestone for this project.

This PR also fixes some issues with caching and test data, which explains some of the CI commits.