Catch subdomain configuration errors between training data generation and model training

The data processing step generates forcings from the CM2.6 dataset for the given spatial domain and time resolution. The training step then works on subdomains of this forcing dataset. These subdomains are configured in the training_subdomains.yaml file (or, as of #97, an arbitrary YAML file with similar syntax). xarray doesn't care if a subdomain isn't fully located in the given forcing domain; it simply continues with whatever overlap is present. If this overlap is too small, we may get a runtime error stating that the input size is too small for the neural net kernel (5x5). See #42, #75.

Tracing that error message back to its cause is not obvious. We should catch this sort of misconfiguration and warn the user when they are likely to hit such an issue. A couple of options:

Assert that all subdomains are fully contained in the forcing data domain. If not, warn about possible misconfiguration.
Assert that subdomain size is appropriate for kernel size. PyTorch does this somewhat for us, but doesn't/can't provide much detail. We'd need to fiddle a bit with the PyTorch model for this, e.g. export X and Y kernel sizes -- or maybe it's possible to query a neural net's kernel size in PyTorch?
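
A rough sketch of both checks, in case it helps the discussion. Everything here is hypothetical: `BoundingBox`, `min_input_size` and `check_subdomains` are illustration names, not existing code in this repo, and the kernel sizes are passed in by hand rather than read from the model. It assumes stride-1 convolutions without padding, for which the minimum input length per axis is 1 plus the sum of (kernel size - 1) over the layers. The coordinate lists stand in for the latitude/longitude values of the forcing dataset; counting the grid points that fall inside each subdomain mimics the silent truncation xarray performs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Latitude/longitude bounds in degrees (hypothetical helper type)."""
    lat_min: float
    lat_max: float
    lon_min: float
    lon_max: float

    def contains(self, other: "BoundingBox") -> bool:
        # True only if `other` lies entirely within this box.
        return (
            self.lat_min <= other.lat_min
            and other.lat_max <= self.lat_max
            and self.lon_min <= other.lon_min
            and other.lon_max <= self.lon_max
        )


def min_input_size(kernel_sizes):
    """Smallest per-axis input length for a stack of stride-1, unpadded
    convolutions with the given kernel sizes (assumed architecture)."""
    return 1 + sum(k - 1 for k in kernel_sizes)


def check_subdomains(forcing, subdomains, lat_values, lon_values,
                     kernel_sizes=(5,)):
    """Return a list of human-readable problems; empty if none were found.

    `lat_values` / `lon_values` are the grid coordinates actually present
    in the forcing data.
    """
    problems = []
    needed = min_input_size(kernel_sizes)
    for name, box in subdomains.items():
        # Option 1: subdomain must be fully inside the forcing domain.
        if not forcing.contains(box):
            problems.append(f"{name}: not fully inside the forcing domain")
        # Option 2: the overlap must hold enough grid points for the kernel.
        n_lat = sum(box.lat_min <= v <= box.lat_max for v in lat_values)
        n_lon = sum(box.lon_min <= v <= box.lon_max for v in lon_values)
        if min(n_lat, n_lon) < needed:
            problems.append(
                f"{name}: overlap is {n_lat}x{n_lon} grid points, "
                f"but at least {needed}x{needed} are needed"
            )
    return problems


if __name__ == "__main__":
    forcing = BoundingBox(0, 10, 0, 10)
    grid = list(range(0, 11))  # stand-in for the forcing coordinates
    subdomains = {
        "ok": BoundingBox(2, 8, 2, 8),
        "edge": BoundingBox(8, 20, 2, 8),  # sticks out of the forcing domain
    }
    for problem in check_subdomains(forcing, subdomains, grid, grid):
        print("warning:", problem)
```

On the last question: PyTorch convolution layers do expose a `kernel_size` attribute, so the sizes could probably be collected by walking `model.modules()` instead of hard-coding them, though I haven't tried that against our model.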


Catch subdomain configuration errors between training data generation and model training #77

#85 includes some related code for validating bounding boxes, which I intend to use in the training step too (it would help with some misconfigurations).