CDAT / cdms


Slow netcdf reads with `getslab()` using mpi4py #202

Open durack1 opened 6 years ago

durack1 commented 6 years ago

Slow reads using mpi4py have been reported on the NERSC edison machine; full details are below.

From: "Nadeau, Denis"
Date: Wednesday, December 20, 2017 at 2:53 PM
To: "Durack, Paul J.", Michael Wehner
Cc: "Doutriaux, Charles", Harinarayan Krishnan
Subject: Re: Slow "big data" reads in cdms2?

I confirm that there is a slow read issue in parallel.  I am working on it right now.

Denis

On 12/20/17 2:52 PM, Durack, Paul J. wrote:
From the comment at the town hall, my impression was that slow file reads were the issue; from
your comment below it’s not clear to me that your comment in the town hall and the issue
described below are identical. Does this persist regardless of the machine that you’re operating
on?

Denis is currently taking a peek at your script, so I’m sure he’ll reply with some insights.

P

From: Michael Wehner
Date: Wednesday, December 20, 2017 at 2:38 PM
To: "Durack, Paul J."
Cc: "Nadeau, Denis", "Doutriaux, Charles", Harinarayan Krishnan
Subject: Re: Slow "big data" reads in cdms2?

Not sure how this will perform. 
55 files is not all that many files.
But if you do the whole lot of them, it should be a stress.
I think i/o contention has been a problem in the past and was latency dominated.

I can make an example with larger files too, if desired.
m
On Dec 20, 2017, at 2:07 PM, Durack, Paul J. wrote:

Thanks Mike, what “slowness” are you experiencing here, as these files are tiny at 78 MB each?

I’ve just pulled across the files and the script and will get this to Denis who will take a look.

Cheers,

P

From: Michael Wehner
Date: Wednesday, December 20, 2017 at 11:51 AM
To: "Durack, Paul J."
Cc: "Nadeau, Denis", "Doutriaux, Charles", Harinarayan Krishnan
Subject: Re: Slow "big data" reads in cdms2?

Here is something that should work on edison 

cp /project/projectdirs/m1517/C20C/LBNL/CAM5-1-1degree/All-Hist/est1/v2-0/day/atmos/tas/run040/*
remove these 4 files only because 2014 does not have a full year.
tas_Aday_CAM5-1-1degree_All-Hist_est1_v2-0_run040_20140101-20140930.nc
tas_Aday_CAM5-1-1degree_All-Hist_est1_v2-0_run040_20141001-20150630.nc
tas_Aday_CAM5-1-1degree_All-Hist_est1_v2-0_run040_20150701-20151231.nc
tas_Aday_CAM5-1-1degree_All-Hist_est1_v2-0_run040_20160101-20161231.nc

The execute line is:
python /global/homes/m/mwehner/pyfiles/make_extrema_longrun_parallel.py pr --parallel tas_Aday_CAM5-1-1degree_All-Hist_est1_v2-0_run040*

Use 55 processors, which is the number of files.

If you want a much bigger problem
cp /project/projectdirs/m1517/C20C/LBNL/CAM5-1-1degree/All-Hist/est1/v2-0/day/atmos/tas/*/*
rm *2014*
ls *nc|wc -l

the last command will tell you how many files you have, which is the max number of processors.

On Dec 20, 2017, at 10:13 AM, Durack, Paul J. wrote:

Hi Mike,

Just following up your question about “slow data reads” that you raised at the CMEC townhall at
the AGU meeting. Denis (and Charles) cc’d have been working on CDMS2 updates, and as part of
that Denis has been working on mpi implementations of CDMS2; however, they haven’t really had a
great test case to get this moving.

If we can determine your particular problem, and then get demo/test data etc. so we can
reproduce this slow read issue, then hopefully we can solve the problems once and for all.

What would we need to do to get things moving? Have you got some data that we can pull across?

Cheers,

P

Pinging @dnadeau4 @HarinarayanKrishnan @doutriaux1. And the offending script: make_extrema_longrun_parallel.py.txt

dnadeau4 commented 6 years ago

If you use a FileVariable, you should get a faster response, since CDMS will make the call directly in C:

  from (line 91 or so):
        s = f.getslab(var, tim[b], tim[e])
  to:
        myvar = f[var]                # FileVariable handle
        s = myvar._obj_[:]            # call C directly (returns a numpy array)
        ...
        sorted = MV.numpy.sort(s, 0)  # call numpy directly (no grid)

Note that I got rid of your tim[b], tim[e], which are in "world" coordinates. Here I select the whole array, since you have 365 days in each file.

I was impressed by your use of "setdimattribute()" which does not seem to be documented.

mfwehner commented 6 years ago

"I was impressed by your use of "setdimattribute()" which does not seem to be documented."

This is some ancient cdat code. Legacy stuff.

mfwehner commented 6 years ago

"Note that I got rid of your tim[b], tim[e] which are in "world" coordinates. here I select the whole array since you have 365 days in each file."

This may not be the best of examples, as it also does the seasonal max. Reading the whole array works for the annual case, but I don't think it works farther into the script for the seasons. The reason it is not such a good example is that it fails for DJF in parallel, since it does not have the December.

The way that I run this script serially is to create an xml file with cdscan. Then we just pop into the array and read whatever is needed. In parallel, this would be a disaster as every processor reads every netcdf file in the xml.
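A minimal sketch of one way around that last point (hypothetical, not from the script): partition the file list across the MPI ranks up front, so each rank only ever opens its own subset. The function and file names here are made up for illustration.

```python
# Hypothetical sketch: round-robin partition of a file list across MPI ranks,
# so each rank opens only its own subset instead of every processor reading
# every netcdf file listed in the cdscan xml. In a real mpi4py script, rank
# and size would come from MPI.COMM_WORLD.Get_rank() / Get_size().

def partition_files(files, rank, size):
    """Rank k gets files k, k+size, k+2*size, ..."""
    return files[rank::size]

# With 55 files and 55 ranks, each rank gets exactly one file:
files = ["tas_%02d.nc" % i for i in range(55)]  # placeholder names
assert all(len(partition_files(files, r, 55)) == 1 for r in range(55))
```

The round-robin slice keeps the assignment deterministic, so every rank can compute its own file list without any communication.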

durack1 commented 6 years ago
From: Michael Wehner
Date: Wednesday, December 20, 2017 at 4:35 PM
To: "Durack, Paul J."
Cc: "Doutriaux, Charles", Harinarayan Krishnan, "Nadeau, Denis"
Subject: Re: Slow "big data" reads in cdms2?

Paul 
I also have these old examples. We would have to dig out the input files.

https://github.com/UV-CDAT/uvcdat/wiki/Embarrassingly-Parallel-Examples-Run-In-Serial

And this is dated, but reflects what I do still. sbatch instead of aprun, but otherwise the same.

https://github.com/UV-CDAT/uvcdat/wiki/How-to-run-UV-CDAT-in-parallel-at-NERSC

dnadeau4 commented 6 years ago

@mfwehner On 16 cores, your program took 16 seconds to run and my change took 6 seconds. You can find it in the attachment: make_extrema_longrun_parallel_denis.py.txt

dnadeau4 commented 6 years ago

I also used var[b:e+1,:], since you had already computed the indices for tim[b] and tim[e].

doutriaux1 commented 6 years ago

I think the point here is that the whole getslab needs to be rethought. It seems to be slow (probably opening/closing the file too many times). Also, it can't handle the modern indexing used by numpy (c = a[b>0]).
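For illustration, the numpy-style boolean-mask indexing that getslab cannot express looks like this (the arrays are made up):

```python
import numpy as np

# Made-up arrays, just to show boolean-mask indexing: b > 0 builds a boolean
# mask, and a[b > 0] selects the elements of a where the mask is True.
a = np.array([10, 20, 30, 40])
b = np.array([-1, 2, -3, 4])
c = a[b > 0]  # -> array([20, 40])
```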

durack1 commented 6 years ago

@doutriaux1 @dnadeau4 on the "rethought" comment, these are also relevant UV-CDAT/uvcdat#1288, UV-CDAT/cdms#55, UV-CDAT/cdms#163

mfwehner commented 6 years ago

@doutriaux1 Your thought on this is particularly relevant if we are opening an xml file made by cdscan, as all files are being opened by all processors.