eguil / Density_bining

Density bining code

Integrate in Paul's model driver #7

Closed · eguil closed this 9 years ago

durack1 commented 10 years ago

So plan forward:

  1. Wrap a single version of each realisation (Paul to provide trimModelList code to reduce the directory list of files to the latest version of each rXiXpX only; see the sketch below)
  2. Bin density for monthly means and save to file (native grid)
  3. Generate annual mean (so, thetao, depth, thickness) from monthly means and regrid to WOA (1x1) and generate basin zonal mean values (marginal seas excluded)

WOA09 grid files: crunchy:/work/durack1/csiro/climatologies/WOA09_nc/annual
WOA09 mask info: http://iridl.ldeo.columbia.edu/SOURCES/.NOAA/.NODC/.WOA09/.Masks/ (likely needs rewriting to netCDF - Paul)
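
A minimal sketch of the kind of filtering trimModelList needs to do, assuming filenames follow the CMIP5-style convention seen later in this thread (an rXiXpX realisation tag plus a ver-vYYYYMMDD publication date); the real code will have to match the actual directory layout:

    import re

    def trimToLatestVersions(files):
        """Keep only the most recent ver-vYYYYMMDD file per rXiXpX realisation."""
        pat = re.compile(r'\.(r\d+i\d+p\d+)\..*\.ver-v(\d{8})\.nc$')
        latest = {}
        for path in files:
            match = pat.search(path)
            if match is None:
                continue  # skip files that do not follow the convention
            rip, ver = match.groups()
            # 8-digit dates compare correctly as strings
            if rip not in latest or ver > latest[rip][0]:
                latest[rip] = (ver, path)
        return sorted(path for _, path in latest.values())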

durack1 commented 10 years ago

Ok so I seem to be up-to-date on the merges and have your latest version of code.. How big are these output files for CCSM4 and the other test files?

Just wondering how much space I need to start processing on the whole archive..?

eguil commented 10 years ago

So far I've run the 150 years with the -n option, to avoid huge monthly binned data on the source grid. IPSL monthly binned output is about 240 MB/year, GFDL CM2p1 about 3 times that. Maybe we should think of splitting these monthly files? I have not looked at them that much (mostly at the zonal mean and persistence).

The zonal-mean and annual files on the target grid are pretty light.


durack1 commented 10 years ago

Did you take a look at how much we'd save if we considered just the annual mean input data, rather than calculating the monthly mean and then generating the annual mean from it? It would also speed things up by an order of magnitude..

eguil commented 10 years ago

Well, we would save only 30-40% of CPU if we worked on the annual mean (plus we could not compute the persistence variables). No, I have not tried, but my previous experience shows the non-linearities are large, especially where the seasonal mixed layer is large. We could save that amount of CPU by making the persistence computation more efficient instead.
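
A toy numerical illustration of that non-linearity point (the quadratic "equation of state" below is invented purely for illustration, not the real one used in the binning): because density is non-linear in temperature, the density of the annual-mean state differs from the annual mean of the monthly densities, and the gap grows where the seasonal cycle is strong.

    import numpy as np

    def toy_rho(T, S=35.0):
        # crude, made-up quadratic dependence on T around 10 degC
        return 1027.0 - 0.15*(T - 10.0) - 0.005*(T - 10.0)**2 + 0.78*(S - 35.0)

    months = np.arange(12)
    T = 10.0 + 8.0*np.sin(2*np.pi*months/12.)  # strong seasonal cycle, e.g. mixed layer

    print(toy_rho(T.mean()))   # density of the annual-mean temperature: 1027.00
    print(toy_rho(T).mean())   # annual mean of the monthly densities:   1026.84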


durack1 commented 10 years ago

Ok well probably time to figure out where to put this.. And kick things off..

When I have some time to play with this, it would certainly be interesting to compare the results from the monthly mean computations with those based on annual mean computations; I'm particularly thinking of changes over time here.. The persistence etc. is another topic/focus altogether..

durack1 commented 10 years ago

Ok, starting to make some progress on this.. I've started converting densit_bin.py (renamed binDensity.py) into a densityBin function and have the driver sorted out.. Just need to get this running on a couple of models, verify that the output is correct (probably with your 4 test cases), and then we're off

https://github.com/durack1/Density_bining

durack1 commented 10 years ago

I'm just rewriting the creation of output files, and am wondering what the sizes of the output files are approximately.. And wondering if there's a neater way of writing these out:

    # Monthly mean of T,S, thickness and depth on neutral density bins on source grid
    file_out = outdir+'/'+modeln+'_out_1m_density.nc'
    # Annual zonal mean of T,S, thick, depth and volume per basin on WOA grid
    filez_out = outdir+'/'+modeln+'_outz_1y_density.nc'
    # Annual mean persistence variables on WOA grid 
    fileq_out = outdir+'/'+modeln+'_out_1y_persist.nc'
    # Annual mean zonal mean of persistence on WOA grid 
    filep_out = outdir+'/'+modeln+'_outz_1y_persist.nc'
eguil commented 10 years ago

What do you mean by 'neater'? Fewer files? We could blend the two outz (zonal mean) files as they are on the same grid, but the other two are on different grids. One option is to interpolate the first one onto the WOA grid as well and blend it with the third one. Here are the sizes:

    # Monthly mean of T,S, thickness and depth on neutral density bins on source grid
    file_out = outdir+'/'+modeln+'_out_1m_density.nc'
    # -> varies in size, as it is on the source grid (~6 GB for 20 years of IPSL-CM5A-LR, 182x149x61)
    # Annual zonal mean of T,S, thickness, depth and volume per basin on WOA grid
    filez_out = outdir+'/'+modeln+'_outz_1y_density.nc'
    # -> 275 MB for 150 years
    # Annual mean persistence variables on WOA grid
    fileq_out = outdir+'/'+modeln+'_out_1y_persist.nc'
    # -> 200 MB for 150 years
    # Annual mean zonal mean of persistence on WOA grid
    filep_out = outdir+'/'+modeln+'_outz_1y_persist.nc'
    # -> 60 MB for 150 years

eguil commented 10 years ago

Ok, great. Note that the CCSM4 test does not work.

durack1 commented 10 years ago

Do you have the MB -> GB correct above? I'd be surprised if we get away with files <GB

eguil commented 10 years ago

Yep. The annual-mean zonal-mean files are small (some fields - properties on the bowl - are even 1D)!

durack1 commented 10 years ago

Ok great, well then I suggest we roll all the interpolated fields into a single file - from the sizes above that would mean a ~0.5 GB file for IPSL, which should drop to ~0.25 GB if I turn on compression (see the sketch below)
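
A minimal sketch of switching on netCDF4 deflation in cdms2 before opening the output file; these are the module-level flag setters as I recall them in UV-CDAT, so worth double-checking against the installed version:

    import cdms2

    cdms2.setNetcdfShuffleFlag(1)       # byte shuffling helps the deflater
    cdms2.setNetcdfDeflateFlag(1)       # turn deflation on
    cdms2.setNetcdfDeflateLevelFlag(4)  # 1-9; higher is smaller but slower

    outFile_f = cdms2.open(file_out, 'w')  # subsequent write() calls are compressed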

eguil commented 10 years ago

My IDL programs get confused if there are too many axes and dimensions in the fields stored in one file. If it simplifies things for you to have one file, then I can always extract the original files with nco. The figures I gave you for the last 3 files do not depend on the model, as everything is interpolated onto the WOA grid.

ok, off to bed !

durack1 commented 10 years ago

Regarding IDL too.. It does seem that IDL 8.3 (the latest) has access to the netcdf4 libraries and consequently should be able to deal with netcdf compression easily.. If you're not up-to-date with this, it could be worth upgrading.. CMIP6 data will likely be compressed as provided by the modelling centres

eguil commented 10 years ago

Ok, I will look into this. How are the driver tests going? Can you remind me how I can get your modified binDensity routine into my github? Thanks.

durack1 commented 10 years ago

I've started hitting some issues with grids etc.. So I am falling back to good ol' IPSL-CM5A-LR and trying to replicate your results.. It would certainly be useful to get you entrained in this; you've probably seen these error messages before!

To get my changes back into your repo you'd have to add my repo as a remote.. Some details attempting to outline this step can be found on https://github.com/PCMDI/pcmdi_metrics/wiki/developers - we likely need to update this information, and add more command line snippets etc..

eguil commented 10 years ago

For IPSL, the zonal-mean 150-year annual file I checked is here: /work/guilyardi/Density_bining/IPSLCM5AoDB_1y_1851_2005_grdenz.nc - yes, let me know what kind of errors you get. I do get some warnings when I run, but nothing that makes the code crash.

durack1 commented 10 years ago

This is the current error - if you manage to get my code across to your repo, you should be able to repeat this issue by running drive_IPSL.py:

Traceback (most recent call last):
  File ".//drive_IPSL.py", line 28, in <module>
  File "/export/durack1/git/Density_bining/binDensity.py", line 812, in densityBin
    persisti [t,ks,:,:]         = regridObj(persbin[t,ks,:,:])
  File "/usr/local/uvcdat/2014-09-16/lib/python2.7/site-packages/numpy/ma/core.py", line 3079, in __setitem__
    ndarray.__setitem__(_data, indx, dval)
ValueError: could not broadcast input array from shape (180,360) into shape (149,182)
[durack1@oceanonly Density_bining]$
durack1 commented 10 years ago

I would note that the issue above is because I've made an error somewhere, so the interpolation hasn't happened.. Yesterday I was getting ValueError: object too deep for desired array errors

Home time, will have to get back to this tomorrow.. Let me know if you have any luck getting my copy (of your code) running..

eguil commented 10 years ago

Ok, I found the bug. You had a wrong init of the array dims: line 451 should become

    persisti = npy.ma.ones([nyrtc, N_s+1, Nji, Nii], dtype='float32')*valmask
    persistia, persistip, persistii, persistv = [npy.ma.ones(npy.shape(persisti)) for _ in range(4)]
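
To spell out why that fixes the broadcast error (my reading of the tracebacks, with the names taken from the code above): regridObj returns slices on the 1x1 WOA target grid, so the receiving array has to be allocated with the target dims (Nji, Nii) = (180, 360) rather than the source-grid (149, 182):

    # the assignment from the first traceback: each regridded slice is
    # (180, 360), so persisti must be preallocated on the WOA target grid
    for t in range(nyrtc):
        for ks in range(N_s+1):
            persisti[t, ks, :, :] = regridObj(persbin[t, ks, :, :])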

I am still confused about how to work together on github... Maybe we could do a quick skype at some point next week?

eguil commented 10 years ago

It seems to me the "too deep" bug is because the ptopd variable is 3D whereas the previous one is 2D. As I had them in different files, this issue did not come up for me. Maybe we should keep 2 separate files (3D and 2D)?

durack1 commented 10 years ago

As we're generating 191 density files from the historical simulation, I'd certainly advocate that fewer is better..

Have you managed to pull my changes (new files) into your repo? As I left your file intact, they should both be co-existing happily..

There are also going to be some issues with grids; I tried ACCESS1.0 as the initial file, so some reconfiguring of the functions will be required once I'm able to regenerate your IPSL results..

durack1 commented 10 years ago

Ok and the "too deep" could be because the wrong axes are being allocated.. Maybe, I'll look at this again today..

RE: Skype, let me know when.. Sooner is probably better..

eguil commented 10 years ago

Re Skype: Monday 6pm for me 9am for you ?

On 18/9/14 15:24, Paul J. Durack wrote:

Ok and the "too deep" could be because the wrong axes are being allocated.. Maybe, I'll look at this again today..

RE: Skype, let me know when.. Sooner is probably better..

— Reply to this email directly or view it on GitHub https://github.com/eguil/Density_bining/issues/7#issuecomment-56037094.

Eric Guilyardi IPSL/LOCEAN - Dir. Rech. CNRS Tour 45, 4eme, piece 406 UPMC, case 100 4 place Jussieu, F-75252 Paris Tel: +33 (0)1 44 27 70 76 Prof. Eric Guilyardi NCAS Climate Meteorology Department University of Reading Reading RG6 6BB - UK Tel: +44 (0)118 378 8315

             http://ncas-climate.nerc.ac.uk/~ericg
durack1 commented 10 years ago

Ok so your fix got me further, but I'm now back at the ValueError: object too deep for desired array

persim:  (10, 180, 360)
ptopd:   (10, 180, 360)
Traceback (most recent call last):
  File ".//drive_IPSL.py", line 28, in <module>
    # Call densityBin
  File "/export/durack1/git/Density_bining/binDensity.py", line 1039, in densityBin
    outFile_f.write(ptopd  , extend = 1, index = (trmin-tmin)/12)
  File "/usr/local/uvcdat/2014-09-16/lib/python2.7/site-packages/cdms2/dataset.py", line 1792, in write
    v[index:index+len(vec1)] = var.astype(v.dtype)
  File "/usr/local/uvcdat/2014-09-16/lib/python2.7/site-packages/cdms2/fvariable.py", line 119, in __setslice__
    apply(self._obj_.setslice,(low,high,numpy.ma.filled(value)))
ValueError: object too deep for desired array

I've probably got another wrongly preallocated variable.. Seems to me like the persim and ptopd are both 3D variables, whereas I get the impression that ptopd shouldn't be (I really need to update these names, or at the very least understand what each of these are)..

eguil commented 10 years ago

I believe you have to create two files and write the 2D fields in one and the 3D fields in the other. That should solve the "too deep" problem.

durack1 commented 10 years ago

Nope, the problem was that ptopdepth was declared more than once as a variable id, so when writing to the file with extend=1 the array had the wrong size compared to the already existing ptopdepth variable..
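
For the record, a hedged sketch of the write pattern involved (the write() call is copied from the traceback above; fileq_out, trmin and tmin are names from earlier in the thread): each array written with extend=1 needs its own unique variable id in the output file.

    import cdms2

    outFile_f = cdms2.open(fileq_out, 'w')
    # reusing 'ptopdepth' as the id of a second, differently shaped array is
    # what broke the extend write; every array needs a distinct id
    ptopd.id = 'ptopdepth'
    outFile_f.write(ptopd, extend=1, index=(trmin-tmin)/12)
    outFile_f.close()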

I'm not following what each of these variables are, so will have to give them more easy to recognize identifiers..

If all goes well I should have a completed IPSL-CM5A-LR file in a couple of hours.. Yay..

RE: Skype, yeah Monday sounds fine 22nd @9am

eguil commented 10 years ago

All right - let me know how it goes and if there are files I can check. I agree the names are not the best, but I like to keep things as short as possible.

durack1 commented 10 years ago

Ok great, so this is the output.. I'll dump the file on the web somewhere (and email you a link) so you can check its contents.. Do the timings/memory usage look about right?

 --> time chunk (bounds) =  13 / 15  ( 1560 1679 ) IPSL-CM5A-LR
     [Change units to celsius]
   CPU of density bining      = 103.06
   CPU of annual mean compute = 47.26
   CPU of interpolation       = 53.93
   CPU of zonal mean          = 48.84
   CPU of persistence compute = 403.58
   CPU of chunk               = 656.74
 --> time chunk (bounds) =  14 / 15  ( 1680 1799 ) IPSL-CM5A-LR
     [Change units to celsius]
   CPU of density bining      = 236.3
   CPU of annual mean compute = 73.74
   CPU of interpolation       = 49.76
   CPU of zonal mean          = 33.24
   CPU of persistence compute = 448.17
   CPU of chunk               = 841.29
 [ Time stamp 18/09/2014 17:42:47 ]
 Max memory use 26.04918 GB
 Ratio to grid*nyears 0.198632408277 kB/unit(size*nyears)
 CPU use, elapsed 9467.16 9511.05632401
 Ratio to grid*nyears 6.01581569407 1.e-6 sec/unit(size*nyears)
 Wrote file:  test/cmip5.IPSL-CM5A-LR.historical.r1i1p1.an.ocn.Omon.density.ver-v20111119.nc
eguil commented 10 years ago

Yes, they look fine (actually better than the ones I had!)

durack1 commented 10 years ago

26 GB is what you had, or is it about right? I think we can pull this down a fair bit by running cleanup operations throughout the code: del(variable); gc.collect() combos, which purge and release memory back to the system (sketched below)
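
The combo just mentioned, sketched in context (persisti and persbin are array names from earlier in the thread; which arrays to drop, and where, is a judgment call):

    import gc

    # ...once a chunk's results have been written out:
    del(persisti, persbin)  # drop the big per-chunk masked arrays
    gc.collect()            # nudge Python to hand memory back to the system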

eguil commented 10 years ago

you are not running on crunchy ?

durack1 commented 10 years ago

Nope, its twin, oceanonly..

eguil commented 10 years ago

That seems a lot. I was closer to 15 GB.
eguil commented 10 years ago

Faster proc?
durack1 commented 10 years ago

Nope, it's a crunchy twin, with numpy 1.9.0 and UV-CDAT 2.0beta.. Plus some tweaks within the code (converting to a function etc.)..

Yeah, with the IPSL-CM5A-LR grid being one of the coarser ones, I'll have to get that number further down I think.. Track down unnecessary variables etc..

I just need to understand your code a little (a lot) better..

eguil commented 10 years ago

Well, you can play with the size of the time chunks (tcdel = TimeChunkDELta; see its current definition as a function of grid size x ntimes)! This will have a direct impact on the total memory used (it was designed for this). A sketch of the idea follows below.
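
A hedged sketch of that idea - choosing tcdel from a memory budget so that bigger grids get shorter time chunks; the budget and the per-point cost below are illustrative guesses, not the actual definition in binDensity.py:

    def chunkMonths(Ni, Nj, Nk, budgetGb=15.0, bytesPerPoint=4*40):
        # bytesPerPoint ~ 4-byte floats x O(40) working arrays per grid point
        months = int(budgetGb*1024**3 / (Ni*Nj*Nk*bytesPerPoint))
        return max(12, (months//12)*12)  # whole years, at least one

    tcdel = chunkMonths(182, 149, 31)  # an IPSL-CM5A-LR-sized source grid (31 levels assumed)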

durack1 commented 10 years ago

Ok great, and compression is buying us a ~2/3 reduction in file size: 601.5 MB without compression, 214.4 MB with..