Open jbusecke opened 3 years ago
I've been going through the pipeline with CanESM5 (r1i1p2f1). Most of it has gone smoothly, but I'm having an issue in the last step: the full volume and OMZ volume aren't conserved across the vertical transformation to density space. There's about a 60% reduction in total volume between the depth-space and sigma-space oceans. Although `areacello` and `dz_t` are conserved individually, their product is not.
The `transformation_wrapper` code is all the same as for previous projects, and I'm not sure how the changes we have made to the preprocessing steps here would affect this. Did you have this issue before, @jbusecke?
That is troubling! Did you make sure to mask the depth values with a mask of where the data is actually present (based on `sigma_0` or `o2`, or even both)?
What could be happening here is that the 'full volume' counts all cells (because it is constructed from a 1D depth and has no information about the actual depth of the ocean), while the transformation will automatically only register `dz` values where you have density and o2 data. What is the total ocean volume? I did build in some masking steps in my original notebook.
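A minimal sketch of the masking idea described above, using plain NumPy arrays in place of the actual xarray datasets. All array shapes, values, and variable names here are made up for illustration; the point is only that a `dz` broadcast from a 1D depth axis overcounts volume unless it is masked to where the tracers have data.

```python
import numpy as np

# Hypothetical 1-D layer thickness (m) and 2-D horizontal cell area (m^2).
dz_1d = np.array([10.0, 20.0, 50.0, 100.0])
area = np.full((2, 3), 1.0e10)

# Hypothetical tracer fields on (lev, y, x); NaN marks land or sub-seafloor cells.
o2 = np.ones((4, 2, 3))
sigma_0 = np.full((4, 2, 3), 26.0)
o2[2:, 0, 0] = np.nan       # shallow column: only the top two levels are wet
sigma_0[2:, 0, 0] = np.nan
o2[:, 1, 2] = np.nan        # land column: no wet levels at all
sigma_0[:, 1, 2] = np.nan

# Naive thickness: the 1-D dz broadcast everywhere, ignoring bathymetry.
dz_full = np.broadcast_to(dz_1d[:, None, None], o2.shape)

# Masked thickness: keep dz only where both tracers have data.
has_data = ~np.isnan(o2) & ~np.isnan(sigma_0)
dz_masked = np.where(has_data, dz_full, 0.0)

volume_naive = (dz_full * area).sum()
volume_masked = (dz_masked * area).sum()
print(volume_naive, volume_masked)
```

In this toy case the naive volume is larger than the masked one, purely because the unmasked `dz` counts cells below the sea floor and over land.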
Another issue I encountered was a similar problem with partial bottom cells (but that won't cause a change of 60%). Do you remove the bottom-most cells?
We could schedule a little debug session tomorrow if you like (calendly).
Yes, I'm following the `align_missing` (checks NaN mask consistency between `o2` and `thetao`) and the `remove_bottom_values` steps from your original notebook. Should removing bottom values deal with the discrepancies between `dz` and the bottom bathymetry?
The z-space full volume is ~1.28e18 m^3, and the volume in sigma-space is ~4.5e17 m^3
~1.28e18 m^3 seems realistic to me. So something is not right in the transformed data.
> Should removing bottom values deal with the discrepancies between dz and the bottom bathymetry?
There could be issues with partial bottom cells, but I am unsure they would cause this magnitude of difference.
Can you take a single timestep, integrate the total depth (sum of `dz_t`) for both depth and density coordinates, and plot the difference as a map? I am curious if the differences are close to topography/coasts.
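The check suggested above could be sketched roughly as follows, again with synthetic NumPy arrays. The per-column binning is only a stand-in for the real vertical transformation (it bins cell centres and is not the algorithm used in the pipeline), and every name and value is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
nz, ny, nx = 5, 3, 4

# Hypothetical depth-space fields for one timestep: thickness (m) and sigma_0.
dz_t = np.broadcast_to(
    np.array([10.0, 20.0, 50.0, 100.0, 200.0])[:, None, None], (nz, ny, nx)
).copy()
sigma_0 = np.sort(rng.uniform(24.0, 28.0, size=(nz, ny, nx)), axis=0)

# Mask cells below a made-up sea floor so thickness only counts where data exists.
n_wet = rng.integers(1, nz + 1, size=(ny, nx))   # wet levels per column
wet = np.arange(nz)[:, None, None] < n_wet
dz_t = np.where(wet, dz_t, np.nan)
sigma_0 = np.where(wet, sigma_0, np.nan)

# Stand-in for the transformed output: thickness summed into density bins.
edges = np.array([20.0, 25.0, 26.0, 27.0, 30.0])
dz_sigma = np.zeros((len(edges) - 1, ny, nx))
for j in range(ny):
    for i in range(nx):
        ok = ~np.isnan(sigma_0[:, j, i])
        binned, _ = np.histogram(sigma_0[ok, j, i], bins=edges, weights=dz_t[ok, j, i])
        dz_sigma[:, j, i] = binned

# Column-integrated thickness in each coordinate, and the difference map.
depth_z = np.nansum(dz_t, axis=0)
depth_sigma = dz_sigma.sum(axis=0)
diff_map = depth_z - depth_sigma
print(np.abs(diff_map).max())
```

`diff_map` is a 2D (y, x) field that can be passed to any map-plotting routine. With the real datasets, the same reduction would be a sum over the vertical dimension in each coordinate system.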
Edit to my earlier comment: `areacello` isn't conserved globally (and shouldn't be), but I think the issue might be there. It looks like the area is 0 for all of the densest sigma class (I'm using the old Pacific-calibrated sigma values for now). It could just be that CanESM5 has a bias against dense water, but I'm looking into this...
Will look at the figure you suggested as well.
I don't quite understand. Are you transforming the area? I don't think that is necessary. The horizontal area is always fixed, and the only thing you need to transform is `dz_t`.
The area is being transformed because it is cast as a 4-d array before the vertical transformation step (I need to track down exactly where that's happening). And somehow there ends up being no volume/area in the densest bin, even though there are water masses in that density class.
I can try just keeping area as a 2-d array through the pipeline
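A rough sketch of what keeping the area 2D could look like: only the thickness lives in density space, and the fixed horizontal area is broadcast against it when computing volume. The arrays below are synthetic, and the variable names only mirror the ones used in this thread.

```python
import numpy as np

rng = np.random.default_rng(1)
n_sigma, ny, nx = 4, 2, 3

# Hypothetical 2-D cell area (m^2); it depends only on horizontal position.
areacello = np.full((ny, nx), 1.0e10)

# Hypothetical transformed thickness (m) on (sigma, y, x).
dz_sigma = rng.uniform(0.0, 500.0, size=(n_sigma, ny, nx))
dz_sigma[-1, 0, :] = 0.0   # densest class absent in some columns

# Volume per density class and cell: broadcast the 2-D area over the sigma axis.
volume_sigma = areacello[np.newaxis, :, :] * dz_sigma
volume_per_class = volume_sigma.sum(axis=(1, 2))
total_volume = volume_per_class.sum()
print(volume_per_class, total_volume)
```

Because the area is never transformed, a density class with zero thickness simply contributes zero volume, and the total equals the area-weighted sum of the column-integrated thickness.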
Here's the difference in the depth-integrated `dz_t` for the two datasets (z-space minus sigma-space). So it looks like the transformed data tends to be deeper in the interior and shallower on the coasts, but only by 1 mm.
I just had another idea: Can you add a very low and very high value as the first and last density bin? If you are not counting some of the bins it will not conserve the total depth.
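To illustrate the bin-padding idea with a toy example: if the target density bins do not span the full range of the data, any thickness outside them is silently dropped. The column below is invented, and `np.histogram` on cell centres is used as a simplified stand-in for the actual transformation.

```python
import numpy as np

# Hypothetical single water column: layer thickness (m) and sigma_0 (kg/m^3).
dz = np.array([10.0, 10.0, 20.0, 50.0, 100.0, 200.0])
sigma = np.array([23.5, 24.8, 26.1, 27.0, 27.6, 27.9])

# Assumed target bin edges that do not cover the lightest and densest water.
edges = np.array([24.0, 25.0, 26.0, 27.0, 27.5])

def binned_thickness(dz, sigma, edges):
    """Sum layer thickness into density bins (cell-centre binning only)."""
    thickness, _ = np.histogram(sigma, bins=edges, weights=dz)
    return thickness

# Pad with one very low and one very high edge so every cell lands in a bin.
padded_edges = np.concatenate([[-1.0e3], edges, [1.0e3]])

total_z = dz.sum()
total_unpadded = binned_thickness(dz, sigma, edges).sum()
total_padded = binned_thickness(dz, sigma, padded_edges).sum()
print(total_z, total_unpadded, total_padded)
```

Here the unpadded bins lose the thickness of the lightest and the two densest layers, while the padded bins recover the full column depth.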
What are the units in that plot? Meter?
I think cleaning up and progressing to all members of CanESM5 is a good next step. Once we have most of the checkmarks done, we can go into production!
Just added `match_and_remove_trend` and `interpolate_grid_labels` to cmip6_pp. I will release v0.5.0 today, which will include these.
It might still take until tomorrow to get everything through at conda-forge, so maybe just install from source for now.
- Replace `resample(...)` with `coarsen(time=12).mean()` (@sditkovsky)
- `combine_grid_labels` as convenience function in cmip6_pp (making `nested_dataset_wrapper` in processing notebook obsolete) (@jbusecke) https://github.com/jbusecke/cmip6_preprocessing/pull/161

For now you can just use a model that has all output on 'gn' (I think CanESM5 should work) to test the full pipeline.
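For reference, a small NumPy sketch of what a 12-step block mean over monthly data computes, which is the operation the `coarsen(time=12).mean()` item refers to. The series is invented, and this assumes the record starts in January and has a length that is a multiple of 12; months are weighted equally regardless of their length.

```python
import numpy as np

# Hypothetical monthly series: 3 years (36 time steps), starting in January.
monthly = np.arange(36, dtype=float)

window = 12
if monthly.size % window != 0:
    raise ValueError("length must be a multiple of 12 for exact block means")

# Non-overlapping 12-month block means, one value per year.
annual = monthly.reshape(-1, window).mean(axis=1)
print(annual)
```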