Open irm-codebase opened 1 week ago
What is the use case that makes this a bug? In other words, why can't you use model.to_netcdf()
?
In this case, I wanted to just save the results to have a smaller file size. I did save with model.to_netcdf()
.
If you have a stable model build and run many scenarios (e.g., SPORES), saving only the model results, and some metadata on the side, would reduce data use.
Generally, though, just saving the results or inputs makes parsing the data a bit easier. Depending on the mode, some things might be inputs or results (e.g., flow_cap
in operate mode). Users might not have the math fully memorized, so just saving the results is a bit more 'human friendly'.
Maybe this is why we should simply pickle our data - then it would be clear that you need to load it back in using calliope 😅.
But in all seriousness, our preferred method is that data is always loaded in using calliope (read_netcdf
). This way, you automatically get to see what is an input/result, as they are attached as respective attributes of the model object. If we made it easy to save results to file then we'd only come across the opposite problem in future: users trying to load a netcdf file using calliope and then finding it doesn't contain any of the necessary metadata to actually instatiate the model object.
Generally, input data has a much smaller memory footprint than results. Still, in the use-case you have given (individual SPORES runs), it's clearly preferable not to save the input data every single time a new set of results is generated. How about calliope.to_netcdf(include_inputs=False)
?
@brynpickering I would not pickle the data. Pickle files are not always translatable between python or library versions. It would decrease stability in the long run 😅😅...
I'd actually prefer to just keep calliope.to_netcdf()
without anything. Adding extra things would just decrease maintainability instead of fixing the root of the issue, which is that we are not following netCDF4 specifications.
I suppose it would introduce the chance of someone trying to re-load results
... but that is user misconception, not calliope code breaking compatibility with data formats. I do not see it as a big loss, but I get the point.
Should I close this as 'not planned'?
I'd actually prefer to just keep calliope.to_netcdf() without anything
Except you also want to add calliope.results.to_netcdf
, right?
I meant that I'm ok with just using calliope.to_netcdf()
if it means we do not sideline the spec issue with calliope.to_netcdf(include_inputs=False)
🙈.
I can get around it in other ways on my side, and if it has not popped up yet it means most people do not do it.
Just to clarify: we already have calliope.results.to_netcdf()
, since results
is an xarray.
So I was not really requesting us to add any new functions.
This issue was just about making sure using xarray functionality with it works when saving to netCDF.
What happened?
Problem
This is related to other problems (#619), but it's closer to an actual user case, so I'll document it here.
Due to our approach of saving everything in the
.attrs
of thexarray
, we cannot directly save model results. This is a reasonable user-case, as saving results is semantically intuitive.Solution
Not saving dictionaries or arrays in the
.attrs
Which operating systems have you used?
Version
v0.7.0.dev4
Relevant log output