Issue
Stefan's replay files are larger than the original files generated by Tse Chun, for two reasons. First, they are on a 3-degree grid instead of a 6-degree grid. Second, they are nyears*12 months long, compared with 15 months for Tse Chun's files. Even reading a single 24-month file here is much slower.
Suggestion
The numpy files contain many extra variables left over from a previous iteration of the ML model. The suggestion is to:
[x] only save in the file what we need (should make the file smaller)
[x] normalize the numpy file by subtracting the mean and dividing by std (currently this is done on the fly)
[x] make one input file that combines 3D and surface variables.
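
The three steps above could be sketched roughly as follows. This is only an illustration under assumptions, not the actual preprocessing code: the variable names (`temp`, `salt`, `ssh`, `unused`), the array layouts, and the helper `build_input` are all hypothetical, and the data is synthetic.

```python
import numpy as np

def build_input(fields_3d, fields_sfc, keep_3d, keep_sfc):
    """Combine selected 3D and surface variables into one normalized array.

    Assumed (hypothetical) layouts:
      fields_3d:  dict name -> array shaped (time, lev, lat, lon)
      fields_sfc: dict name -> array shaped (time, lat, lon)
    Returns (data, mean, std) with data shaped (time, channel, lat, lon).
    """
    channels = []
    for name in keep_3d:
        arr = fields_3d[name]
        # Each vertical level of a 3D variable becomes its own channel.
        channels.extend(arr[:, k] for k in range(arr.shape[1]))
    for name in keep_sfc:
        channels.append(fields_sfc[name])

    data = np.stack(channels, axis=1).astype(np.float32)

    # Per-channel statistics over time and space, computed once up front
    # instead of on the fly at training time.
    mean = data.mean(axis=(0, 2, 3), keepdims=True)
    std = data.std(axis=(0, 2, 3), keepdims=True)
    std = np.where(std == 0, 1.0, std)  # guard against constant channels

    normalized = ((data - mean) / std).astype(np.float32)
    return normalized, mean, std

# Synthetic stand-in for one 24-month file on a 3-degree grid (60 x 120).
rng = np.random.default_rng(0)
fields_3d = {
    "temp": rng.normal(10.0, 5.0, size=(24, 5, 60, 120)),
    "salt": rng.normal(35.0, 1.0, size=(24, 5, 60, 120)),
    "unused": rng.normal(size=(24, 5, 60, 120)),  # dropped below
}
fields_sfc = {"ssh": rng.normal(0.0, 0.2, size=(24, 60, 120))}

data, mean, std = build_input(fields_3d, fields_sfc, ["temp", "salt"], ["ssh"])
print(data.shape, data.dtype)
```

The result, together with `mean` and `std` for later de-normalization, could then be written to a single file with something like `np.savez("input.npz", data=data, mean=mean, std=std)` (the file name is a placeholder).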