AndyHoggANU closed this issue 5 years ago.
This happens for all 5 ice variables, BTW.
Odd. It didn't for me, but I only tested one variable. Was I not using the whole range?
Yes - the code on GitHub specified output0[0-2]?, which was left over from my last failed attempt. I think I have a way of figuring out where this one is, so leave it with me for a couple of hours.
It is a fragile process, unfortunately, so I should have been testing with the full production dataset; as you're finding out, it really is that fragile. Happy to revisit, as I've now gained a lot of unwanted expertise ...
OK, so some good news at last. The first thing is that processing in 4-5 year chunks will work and is not too tricky. It turns out that:

output0[0-2]? selects just 1985-89
output0[3-5]? selects just 1990-94
output0[6-8]? selects just 1995-99
output1[2-4]? selects just 2005-09
output1[5-7]? selects just 2010-14
output1[8-9]? selects just 2015-17

The issue is with the remaining 2000-04 chunk, output09? and output1[0-1]?, which I can pass through using two filename arguments, but this is where the TLON error is. Drilling down into that one now.
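For the record, a quick way to sanity-check the chunking (just a Python sketch; it assumes the output directories sit under /g/data3/hh5/tmp/cosima/access-om2-01/01deg_jra55v13_iaf, the experiment path used elsewhere in this issue) is to expand the patterns and confirm every directory is matched exactly once:

# Sketch: expand the chunk patterns and check for gaps or overlaps.
import os
from glob import glob

expt = "/g/data3/hh5/tmp/cosima/access-om2-01/01deg_jra55v13_iaf"
patterns = ["output0[0-2]?", "output0[3-5]?", "output0[6-8]?",
            "output09?", "output1[0-1]?",        # the troublesome 2000-04 chunk
            "output1[2-4]?", "output1[5-7]?", "output1[8-9]?"]

matched = [os.path.basename(d) for pat in patterns
           for d in glob(os.path.join(expt, pat))]
assert len(matched) == len(set(matched)), "patterns overlap"
print(len(matched), "output directories covered by", len(patterns), "patterns")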
OK, here is another curiosity. My 1985-89 ice files take a lot longer to process, and produce much larger output files, than any other 5-year segment.
amh157@raijin2:ice %% du -sh ./hi-m/*
631M ./hi-m/hi-m_access-om2-01_198501_198512.nc
631M ./hi-m/hi-m_access-om2-01_198601_198612.nc
631M ./hi-m/hi-m_access-om2-01_198701_198712.nc
631M ./hi-m/hi-m_access-om2-01_198801_198812.nc
631M ./hi-m/hi-m_access-om2-01_198901_198912.nc
88M ./hi-m/hi-m_access-om2-01_199001_199012.nc
88M ./hi-m/hi-m_access-om2-01_199101_199112.nc
88M ./hi-m/hi-m_access-om2-01_199201_199212.nc
88M ./hi-m/hi-m_access-om2-01_199301_199312.nc
...
Yet I can't see any real difference in the final files. Is this due to the way the originals are compressed? Any hints here? I'm not so worried about the size, just the consistency ...
So, this TLON error is somewhere in 2004 ...
The larger files aren't compressed.
hi-m_access-om2-01_198501_198512:
float hi_m(time, nj, ni) ;
hi_m:_FillValue = 1.e+30f ;
hi_m:units = "m" ;
hi_m:long_name = "grid cell mean ice thickness" ;
hi_m:cell_measures = "area: tarea" ;
hi_m:cell_methods = "time: mean" ;
hi_m:time_rep = "averaged" ;
hi_m:coordinates = "TLON ULAT TLAT ULON" ;
hi_m:_Storage = "chunked" ;
hi_m:_ChunkSizes = 1, 675, 900 ;
hi_m:_Endianness = "little" ;
hi-m_access-om2-01_199001_199012:
float hi_m(time, nj, ni) ;
hi_m:_FillValue = 1.e+30f ;
hi_m:units = "m" ;
hi_m:long_name = "grid cell mean ice thickness" ;
hi_m:cell_measures = "area: tarea" ;
hi_m:cell_methods = "time: mean" ;
hi_m:time_rep = "averaged" ;
hi_m:coordinates = "TLON TLAT ULON ULAT" ;
hi_m:_Storage = "chunked" ;
hi_m:_ChunkSizes = 1, 675, 900 ;
hi_m:_DeflateLevel = 5 ;
hi_m:_Shuffle = "true" ;
hi_m:_Endianness = "little" ;
I'm not sure why that would be the case. The input files are compressed.
Hmm, curious. I re-ran these and they still come out uncompressed ... I guess I should just compress them and be done with it?
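If it comes to that, a minimal recompression sketch with xarray (file name taken from the listing above; deflate level 5 plus shuffle, to match what the compressed files report) would be something like:

# Sketch: rewrite an uncompressed file with deflate level 5 and shuffle,
# matching the _DeflateLevel / _Shuffle settings of the compressed files.
import xarray as xr

infile = "hi-m/hi-m_access-om2-01_198501_198512.nc"
ds = xr.open_dataset(infile)
encoding = {v: {"zlib": True, "complevel": 5, "shuffle": True}
            for v in ds.data_vars}
ds.to_netcdf(infile.replace(".nc", "_z5.nc"), encoding=encoding)

nccopy -d 5 -s should do much the same from the shell in one hit.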
OK, for the TLON error, I have narrowed it down to a difference between output114 and output115 ... but I can't see any difference. Can you take a look and see if you spot anything?
What does splitvar use to read it in? I might try to emulate that to see if I can recreate the bug in Python.
The land masking changes between iceh.2004-02.nc and iceh.2004-03.nc. I'll look into a work-around.
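For reference, a minimal way to see this (a sketch, assuming iceh.2004-02.nc sits in output114 and iceh.2004-03.nc in output115, as the glob below suggests) is to compare the TLON fill-value masks in the two files directly:

# Sketch: compare the TLON land (fill-value) masks either side of the change.
import numpy as np
import xarray as xr

base = "/g/data3/hh5/tmp/cosima/access-om2-01/01deg_jra55v13_iaf"
feb = xr.open_dataset(base + "/output114/ice/OUTPUT/iceh.2004-02.nc", decode_cf=False)
mar = xr.open_dataset(base + "/output115/ice/OUTPUT/iceh.2004-03.nc", decode_cf=False)

mask_feb = feb.TLON.values == feb.TLON.attrs["_FillValue"]
mask_mar = mar.TLON.values == mar.TLON.attrs["_FillValue"]
print("masked T cells:", mask_feb.sum(), "vs", mask_mar.sum())
print("cells that differ:", np.logical_xor(mask_feb, mask_mar).sum())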
I've pushed a fix https://github.com/coecms/splitvar/commit/0ff571cb823a6a16a6a8b151505d4d4f523fbb24 and will install it into conda.
Oh, that's weird. OK, can you let me know when the conda environment is updated on raijin and I will give it a go. Thanks!!
Oh, now I see why that works; it concatenates TLON and adds a time dimension:
>>> ds = xarray.open_mfdataset('/g/data3/hh5/tmp/cosima/access-om2-01/01deg_jra55v13_iaf/output11[4-5]/ice/OUTPUT/iceh.2004-0[2-3].nc',decode_cf=False, engine='netcdf4')
>>> ds.TLON
<xarray.DataArray 'TLON' (time: 2, nj: 2700, ni: 3600)>
dask.array<shape=(2, 2700, 3600), dtype=float32, chunksize=(1, 2700, 3600)>
Coordinates:
* time (time) float64 6.999e+03 7.03e+03
Dimensions without coordinates: nj, ni
Attributes:
long_name: T grid center longitude
units: degrees_east
missing_value: 1e+30
_FillValue: 1e+30
OK, maybe that isn't a good idea. If you try to use open_mfdataset on these split files it will have issues.
OK, but it used to work on the original files, so why would it fail now?
Compression is done, BTW.
For what it's worth, I have collated the remaining ice files (2000-2004) but there seem to be some compression issues creeping back in.
Latest news: Mostly good. I have reprocessed all the ice files for access-om2-01. I still had to do it in 7 batches, which took some time, but that could be easily programmed into the bash script in the future. The files were all the same size ... but all uncompressed. That's OK, as compressing is just a small extra step which can be done in one hit, so it's only a small impost.
The only catch that I can see is that the TLON masking is still different at the start and end of the run:
Did we expect this might be fixed? I did ... but I'm not worried about it provided that xarray can read in the whole dataset in one hit. I will try to test this. If it can't, then we may need to come back to this code and try again!!
Update
I see what has happened here. xarray has interpreted the changing TLON as needing its own time dimension, as suggested by Aidan above:
<xarray.Dataset>
Dimensions: (d2: 2, ni: 3600, nj: 2700, time: 216)
Coordinates:
* time (time) datetime64[ns] 2000-01-16 2000-02-15 ... 2017-12-16
TLON (time, nj, ni) float32 dask.array<shape=(216, 2700, 3600), chunksize=(12, 2700, 3600)>
TLAT (time, nj, ni) float32 dask.array<shape=(216, 2700, 3600), chunksize=(12, 2700, 3600)>
ULAT (time, nj, ni) float32 dask.array<shape=(216, 2700, 3600), chunksize=(12, 2700, 3600)>
ULON (time, nj, ni) float32 dask.array<shape=(216, 2700, 3600), chunksize=(12, 2700, 3600)>
Dimensions without coordinates: d2, ni, nj
Data variables:
aice_m (time, nj, ni) float32 dask.array<shape=(216, 2700, 3600), chunksize=(12, 2700, 3600)>
tarea (time, nj, ni) float32 dask.array<shape=(216, 2700, 3600), chunksize=(12, 2700, 3600)>
As it stands, this uses up extra space, but it is usable, in that the dataset can be loaded by xarray, etc. I think this means that either the -a grid.nc argument hasn't worked, or else I haven't applied it properly??
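In the meantime, a possible reader-side workaround (just a sketch; it assumes grid.nc holds static 2-D TLON/TLAT/ULON/ULAT) is to drop the time-dependent copies when opening and merge the static grid back in:

# Sketch: open the split files without the time-dependent grid coordinates,
# then merge the static grid variables back in from grid.nc.
import xarray as xr

gridvars = ["TLON", "TLAT", "ULON", "ULAT"]
ds = xr.open_mfdataset("hi-m/hi-m_access-om2-01_*.nc",
                       combine="by_coords", drop_variables=gridvars)
ds = ds.merge(xr.open_dataset("grid.nc")[gridvars]).set_coords(gridvars)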
I'm finally coming back to this data processing. Note that, while the ice files are all processed, we still have the problem that all files retain the time-dependent grid information. @aidanheerdegen - is there a workaround for this?
The code has this comment:
# Make a grid file because we're going to delete all the grid information and add it back
# as it isn't consistent across the data
#ncks -O -v TLON,TLAT,ULON,ULAT,NCAT,tmask,uarea,tarea,blkmask,dxt,dyt,dxu,dyu,HTN,HTE,ANGLE,ANGLET ${COSIMADIR}/${MODEL}/${EXPT}/output197/${SUBMODEL}/OUTPUT/iceh.????-12.nc grid.nc
which sounds like what I was suggesting. I take it this doesn't work?
OK, I am testing this with the 1° ice cases -- it produces a grid.nc file for me, with no time-dependence in TLON, TLAT etc. -- but from what I can see the data files retain the time-dependence. So I am guessing that the -a grid.nc argument isn't working as intended?
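For what it's worth, the check I have in mind is roughly this (a sketch; the glob assumes the 0.1° file naming from the listing above, but the same test applies to the 1° output):

# Sketch: if the grid substitution worked, TLON is identical in every split
# file and open_mfdataset no longer promotes it to a time-dependent variable.
import xarray as xr

ds = xr.open_mfdataset("hi-m/hi-m_access-om2-01_*.nc", combine="by_coords")
print(ds.TLON.dims)  # ('nj', 'ni') if the fix worked; ('time', 'nj', 'ni') if not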
OK - my bad! My script was picking up the stable version of conda/analysis3 -- I needed unstable of course! Testing this now - will close if all works.
OK - it works at 1°. I still need to test at higher resolution but let's be optimistic and close this for now!
I started reprocessing the 0.1° ice data, as last night's successful run only did up to output029. When I applied it to the whole dataset, this error resurfaced: