NCAR / cmip6_cesm

popeos.eos blows memory with large datasets #4

Open matt-long opened 5 years ago

matt-long commented 5 years ago

popeos.eos implements the equation of state for the POP ocean model. Applying it to a large dataset (number of time levels = 816) produces a memory error. I suspect that this is avoidable if the function is modified to keep things as dask arrays.

Perhaps this is something @jukent can tackle?

kmpaul commented 5 years ago

@jukent Feel free to look into this only if you aren't swamped with your other work.

jukent commented 5 years ago

I might be able to look at this next week.

jukent commented 5 years ago

I am beginning to work on this now.

jukent commented 5 years ago

@matt-long Can you link me to a dataset on Glade for use with popeos.py?

matt-long commented 5 years ago

Here you go

dirin = '/glade/p/cesm/community/CESM-DPLE/CESM-DPLE_POPCICEhindcast'
file_salt = f'{dirin}/g.e11_LENS.GECOIAF.T62_g16.009.pop.h.SALT.024901-031612.nc' 
file_temp = f'{dirin}/g.e11_LENS.GECOIAF.T62_g16.009.pop.h.TEMP.024901-031612.nc'

z_t is the depth, present in both those files.

jukent commented 5 years ago

Thanks Matt,

I am trying to enforce the min/max values on the salinity and temperature xarray datasets, but that is taking too long to run. Do you know if this is the operation that slowed down the function when using numpy arrays?
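For illustration, here is a minimal sketch of enforcing bounds lazily on a dask-backed xarray array. The synthetic data, the chunk size, and the `SALT_MIN`/`SALT_MAX` values are assumptions made for the example, not the limits used in popeos.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for a (time, z_t, nlat, nlon) salinity field; the real
# files are far larger. Chunking along time makes the array dask-backed
# (this requires dask to be installed).
salt = xr.DataArray(
    np.linspace(-5.0, 50.0, 4 * 3 * 2 * 2).reshape(4, 3, 2, 2),
    dims=("time", "z_t", "nlat", "nlon"),
    name="SALT",
).chunk({"time": 1})

# Hypothetical bounds, for illustration only (not the values used in popeos).
SALT_MIN, SALT_MAX = 0.0, 40.0

# clip() on a dask-backed DataArray only extends the task graph; nothing is
# evaluated until .compute(), .load(), or a write to disk is requested.
salt_clipped = salt.clip(min=SALT_MIN, max=SALT_MAX)

# Trigger computation on small reductions rather than on the full array.
print(float(salt_clipped.min().compute()), float(salt_clipped.max().compute()))
```

If the clipping step itself appears slow, one thing worth checking is whether something upstream (for example `.values` or `np.asarray`) is forcing the whole array to load before the clip is applied.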

matt-long commented 5 years ago

@jukent, the problem I was having was that the function was blowing memory. I think this resulted from making flat numpy arrays instead of keeping things as dask arrays within xarray.
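As a rough sketch of the idea, the example below applies a toy equation of state using only elementwise arithmetic on chunked xarray objects, so the result stays a lazy dask array instead of being flattened into numpy. The function `toy_density`, its constants, and the synthetic inputs are hypothetical and are not the POP formulation in popeos.eos.

```python
import numpy as np
import xarray as xr


def toy_density(salt, temp, depth):
    """Toy linear equation of state (NOT the POP formulation).

    Only elementwise arithmetic is used, so xarray handles broadcasting and
    dask-backed inputs give a dask-backed (lazy) result.
    """
    rho0, alpha, beta, gamma = 1027.0, 0.2, 0.8, 0.0045  # illustrative constants
    return rho0 - alpha * (temp - 10.0) + beta * (salt - 35.0) + gamma * depth


# Small synthetic inputs standing in for TEMP, SALT and z_t.
dims = ("time", "z_t", "nlat", "nlon")
shape = (6, 4, 3, 3)
coords = {"z_t": [5.0, 15.0, 25.0, 35.0]}
temp = xr.DataArray(np.full(shape, 10.0), dims=dims, coords=coords).chunk({"time": 2})
salt = xr.DataArray(np.full(shape, 35.0), dims=dims, coords=coords).chunk({"time": 2})

# Passing the 1-D depth coordinate lets xarray broadcast it against the 4-D
# fields; no flattening or manual reshaping is needed.
rho = toy_density(salt, temp, temp.z_t)

# rho is still lazy here. Only the selected slice is evaluated below, so peak
# memory scales with the chunks involved rather than with the full dataset.
first_step = rho.isel(time=0).compute()
print(type(rho.data).__name__, first_step.shape)
```

With real files, the same pattern would start from something like `xr.open_dataset(file_temp, chunks={"time": 12})`, where the chunk size is a guess that would need tuning to the available memory.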

jukent commented 5 years ago

@matt-long I just added the file popeos_dask. Can you see if the problem persists now that the operations have been daskified?

matt-long commented 5 years ago

Thanks. I won't have time to get to this for some time.