This PR moves array reshaping and encoding/decoding of inputs and outputs into the base reservoir models. Previously these were handled by the hybrid adapter. Having the base models handle these operations makes offline prediction easier and simplifies the public API for the base models.

Refactored public API:
- Input to the hybrid model `predict` method is assumed to be in the native (x, y, z) format rather than a flattened array. `predict` now returns a numpy array for each variable in native (x, y, z) coordinates (sketched below).
- Made `autoencoder` a required argument to the reservoir models instead of an optional one. All models trained using the training code have an autoencoder by default.
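
For context, a minimal sketch of what the new contract looks like from the caller's side; the class, variable names, and shapes below are illustrative stand-ins, not the actual fv3fit interfaces:

```python
import numpy as np


class ToyReservoirModel:
    """Stand-in for a base reservoir model: accepts per-variable (x, y, z)
    arrays and returns per-variable (x, y, z) arrays, handling all
    flattening/reshaping internally."""

    def predict(self, inputs):
        shapes = [v.shape for v in inputs]
        # Flattening and encoding now live inside the base model ...
        flat = np.concatenate([v.reshape(-1) for v in inputs])
        # ... a real model would run the reservoir + readout on `flat` here;
        # this toy just echoes the flattened input back ...
        out = flat
        # ... and decoding back to native (x, y, z) coordinates happens here.
        split_points = np.cumsum([np.prod(s) for s in shapes])[:-1]
        return [a.reshape(s) for a, s in zip(np.split(out, split_points), shapes)]


model = ToyReservoirModel()
temperature = np.random.rand(8, 8, 19)  # native (x, y, z), no manual flattening
humidity = np.random.rand(8, 8, 19)
prediction = model.predict([temperature, humidity])
assert prediction[0].shape == (8, 8, 19)  # outputs are also per-variable (x, y, z)
```
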
Significant internal changes:
- Refactored the above functionality out of the hybrid adapter; the adapter now handles converting xarray datasets to arrays but otherwise relies on its base model for reshaping and encoding/decoding.
- Moved `DoNothingAutoencoder` to `reservoir.transformers.transformer`, since it is a generally useful class for tests and even for training (if you do not want to standardize inputs). Added `dump`/`load` methods to it (a minimal sketch of its behavior follows this list).
- Added helper functions `encode_columns` and `decode_columns` to `fv3fit.reservoir.transformers` that apply the transformations to each column of data in (x, y, z) coordinates (also sketched below). The decode function is essentially copied from the original adapter method. The encoding function was previously duplicated in the adapter and the training module; I kept the training module version because the adapter version had tensor formatting issues with training time series data. Apart from that difference, the two functions produce the same encoded result (I wrote a test to verify this before deleting the adapter method).
- Save the non-null `feature_dim_sizes` for transformers (this is set for `SkTransformer` and `DoNothingAutoencoder` the first time `encode` is called) so that subsequent calls to `decode` after loading already have this information available.
- Updated tests to expect the new behavior.
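
For reference, here is a rough, illustrative re-implementation of the `DoNothingAutoencoder` behavior and the `feature_dim_sizes` bookkeeping described above (a sketch only, not the fv3fit code; the real class lives in `reservoir.transformers.transformer` and persists this state via its new `dump`/`load` methods):

```python
import numpy as np


class IdentityAutoencoderSketch:
    """Toy do-nothing autoencoder: encode concatenates variables, decode splits
    them back using the recorded per-variable feature sizes."""

    def __init__(self, feature_dim_sizes=None):
        # Set on the first encode call; persisting it (as the PR now does when
        # dumping the transformer) lets a freshly loaded transformer decode
        # without having to call encode first.
        self.feature_dim_sizes = feature_dim_sizes

    def encode(self, variables):
        self.feature_dim_sizes = [v.shape[-1] for v in variables]
        return np.concatenate(variables, axis=-1)

    def decode(self, encoded):
        if self.feature_dim_sizes is None:
            raise ValueError("decode called before feature_dim_sizes was set")
        split_points = np.cumsum(self.feature_dim_sizes)[:-1]
        return np.split(encoded, split_points, axis=-1)


encoder = IdentityAutoencoderSketch()
encoded = encoder.encode([np.random.rand(10, 19), np.random.rand(10, 19)])
# Stand-in for a dump/load round trip: carry the saved feature sizes over.
reloaded = IdentityAutoencoderSketch(feature_dim_sizes=encoder.feature_dim_sizes)
decoded = reloaded.decode(encoded)
assert decoded[0].shape == (10, 19)
```
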
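
Similarly, a sketch of what applying a transformer column by column over (x, y, z) data means; the function names and signatures below are illustrative assumptions, not the actual `encode_columns`/`decode_columns` implementations in `fv3fit.reservoir.transformers`:

```python
import numpy as np


class _IdentityTransformerSketch:
    """Toy transformer: concatenates variables and remembers how to split them."""

    def encode(self, column):
        self._split_points = np.cumsum([v.shape[-1] for v in column])[:-1]
        return np.concatenate(column, axis=-1)

    def decode(self, latent):
        return np.split(latent, self._split_points, axis=-1)


def encode_columns_sketch(variables, transformer):
    """variables: list of (x, y, z) arrays; returns a single (x, y, latent) array."""
    nx, ny = variables[0].shape[:2]
    return np.array(
        [
            [transformer.encode([v[i, j] for v in variables]) for j in range(ny)]
            for i in range(nx)
        ]
    )


def decode_columns_sketch(latent, transformer):
    """latent: (x, y, latent) array; returns one (x, y, z) array per variable."""
    nx, ny = latent.shape[:2]
    columns = [[transformer.decode(latent[i, j]) for j in range(ny)] for i in range(nx)]
    n_vars = len(columns[0][0])
    return [
        np.array([[columns[i][j][k] for j in range(ny)] for i in range(nx)])
        for k in range(n_vars)
    ]


transformer = _IdentityTransformerSketch()
t, q = np.random.rand(8, 8, 19), np.random.rand(8, 8, 19)
latent = encode_columns_sketch([t, q], transformer)  # shape (8, 8, 38)
t_back, q_back = decode_columns_sketch(latent, transformer)
assert np.allclose(t, t_back) and q_back.shape == (8, 8, 19)
```
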
Resolves https://github.com/ai2cm/fv3net/issues/2254