ai2cm / fv3net

explore the FV3 data for parameterization

Merging `fv3fit.keras` and `fv3fit.emulation` #1397

Closed · nbren12 closed this 2 years ago

nbren12 commented 3 years ago

Merging these directories has come up a few times now. @frodre and I started the fv3fit.emulation subtree to provide a general-purpose library of utilities with a stable API at a time when fv3fit was rapidly changing. The purpose is more to provide composable and well-tested layers than a training CLI. There is also a focus on baking all possible logic into the TensorFlow graph itself to simplify serialization. For example, @frodre's prognostic run code does not import fv3fit at all.
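A minimal sketch of the "bake logic into the graph" idea, using stock Keras rather than fv3fit's actual API (the constants and layer choice here are illustrative): because the normalization constants live inside the graph, a model saved with `model.save(...)` can later be loaded and run without importing the training library at all.

```python
import tensorflow as tf

# The normalization constants are stored in the graph itself, so the
# saved model needs no external preprocessing code at inference time.
inputs = tf.keras.Input(shape=(1,))
normalize = tf.keras.layers.Normalization(mean=[2.0], variance=[0.25])
outputs = normalize(inputs)  # computes (x - 2.0) / 0.5 inside the graph
model = tf.keras.Model(inputs, outputs)

result = model(tf.constant([[2.5]]))  # → [[1.0]]
```

This is the property that lets a prognostic run load and execute a serialized model without depending on fv3fit.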

These are the main features it has

These features nicely complement the existing code in fv3fit.keras. The main difference is the namespacing convention. For ML libraries it's nice to have some structure in the top-level public namespace to separate different functionalities (e.g. loss functions vs. layers). Does anyone oppose adopting this convention in fv3fit.keras? It would be easy to just move all the folders in fv3fit.emulation into fv3fit.keras.
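For concreteness, the kind of layout this namespacing convention implies might look something like the following (the subpackage names are illustrative, drawn from functionality mentioned in this thread, not a proposal for specific modules):

```
fv3fit/keras/
    layers/      # composable, well-tested Keras layers
    losses/      # loss functions
    models/      # training loops and CLI plumbing
    _shared/     # config and serialization helpers
```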

If we merge them, do you think we could try to keep these APIs stable and free of excessive plumbing code (e.g. config, training loops, serialization), which could remain in fv3fit.keras.models or fv3fit.keras._shared?

Thoughts?

cc @frodre @AnnaKwa @brianhenn @mcgibbon

mcgibbon commented 3 years ago

> do you think we could try to keep these APIs stable and clean of excessive plumbing code (e.g. config, training loops, serialization, etc)

I've found that attaching config directly to the configurable functionality used when building/training the model works really well for re-using these features; otherwise the config has to be wired up to the feature code independently for each training function that uses the feature.
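A dependency-free sketch of that pattern, with hypothetical names (fv3fit's actual config classes may differ): the config object carries its own `build` method, so any training entry point can construct the feature from the config without re-implementing the wiring.

```python
import dataclasses
from typing import List, Tuple


@dataclasses.dataclass
class DenseNetworkConfig:
    """Hypothetical config that knows how to build what it configures."""

    width: int = 8
    depth: int = 2

    def build(self) -> List[Tuple[str, int]]:
        # Stand-in for constructing real Keras layers: each entry is a
        # (layer name, layer width) pair.
        return [(f"dense_{i}", self.width) for i in range(self.depth)]


# Any training function can reuse the feature via the config alone:
config = DenseNetworkConfig(width=16, depth=3)
layers = config.build()  # → [("dense_0", 16), ("dense_1", 16), ("dense_2", 16)]
```

The alternative, where each training function interprets raw config fields itself, duplicates that wiring once per entry point.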

I'm in favor of some amount of merging, though it should be done carefully. Several features are implemented in both libraries, such as normalization, and in general the two folders propose two different patterns for configuring/building models. I think we can maintain these two opinionated ways of doing things while ensuring that shareable functionality lives in one place.