CDAT / uvcmetrics

metrics aka diagnostics for comparing models with observations or each other
BSD 3-Clause "New" or "Revised" License

Showstopper: Problems with CF-compliance of ERBE and CERES datasets #36

Open susburrows opened 9 years ago

painter1 commented 9 years ago

Thanks for testing this. I fixed it locally (on my Mac) by restoring a genGenericBounds call which had been in previous versions. I haven't gone through and checked whether anything else needs the same fix. I haven't committed it yet. I'm trying to fix a bug in set 13 - a bit hairy because I have to enhance cdscan.

susburrows commented 9 years ago

thanks for the response! When the fix is available I can check if it works for me.

painter1 commented 9 years ago

I pushed the fix to the repository. I'll push a fix for set 6 shortly. Plot set 13 is going to take a while.

susburrows commented 9 years ago

I am still having this issue after the update. I am trying to come up with a good test case for this. While I was doing this, however, I noticed that some of the collections of observational climo files on rhea appear to be incomplete, which appears to be causing some of the issues I am experiencing.

For example, for LEGATES data, the climo file for January is missing, but is needed to compute plot set 8.
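A gap like this can be spotted before running a plot set by checking each obs set for all twelve monthly files. The following is a minimal sketch, assuming the `<PREFIX>_<MM>_climo.nc` naming seen elsewhere in this thread; the function name `missing_months` and the example listing are illustrative and not part of uvcmetrics.

```python
import re


def missing_months(filenames, prefix):
    """Return the month numbers (1-12) that have no <prefix>_<MM>_climo.nc file."""
    pattern = re.compile(r"^%s_(\d{2})_climo\.nc$" % re.escape(prefix))
    found = set()
    for name in filenames:
        match = pattern.match(name)
        if match:
            found.add(int(match.group(1)))
    return [month for month in range(1, 13) if month not in found]


# Hypothetical directory listing in which the January file is absent.
listing = ["LEGATES_%02d_climo.nc" % month for month in range(2, 13)]
print(missing_months(listing, "LEGATES"))
```

In practice the listing would come from something like `os.listdir()` on the obs directory; seasonal files do not match the two-digit pattern and are ignored by this check.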

susburrows commented 9 years ago

I am now able to produce set8 plots for TREFHT using NCEP data, for which climo files are available for each month. Do you have the monthly climo files that are missing on rhea for some obs sets?

susburrows commented 9 years ago

Now that I have a working test case, I am adding region functionality for set8, and will push this shortly.

painter1 commented 9 years ago

On Rhea, I deleted all the LEGATES files in .../csc121/obs_data/, because there is a complete set of LEGATES files in .../csc121/acme-dev-1-obs-data/.

susburrows commented 9 years ago

OK, thanks — I take it that is a more complete directory of observational datasets, and is the one that I should use for testing?

—Susannah

On Monday, December 22, 2014 at 11:27 AM, Jeffrey Painter wrote (Re: [uvcmetrics] AMWG plot set 8 does not produce plots (#36)):

  On Rhea, I deleted all the LEGATES files in .../csc121/obs_data/, because there is a complete set of LEGATES files in .../csc121/acme-dev-1-obs-data/.

painter1 commented 9 years ago

Probably that's a better obs set. At least it's bigger. obs_data is the directory I put there, and I wasn't trying to be complete; it was meant as a quick upload for initial testing. Brian Smith is the one who put the new set there.
- Jeff

On 12/22/14 12:04 PM, susburrows wrote:

  OK, thanks — I take it that is a more complete directory of
  observational datasets, and is the one that I should use for
  testing?

  —Susannah


susburrows commented 9 years ago

OK. I am still having trouble with some plots in set 8. It's possible it's caused by a problem with the observational datasets. Here is a command that fails:

/ccs/home/sburrows/metrics/frontend/diags.py --path /lustre/atlas/scratch/sburrows/cli112/climos/mam3_aerocom_bbff --path2 /lustre/atlas/world-shared/csc121/acme-dev-1-obs-data --package AMWG --set 8 --seasons ANN --filter2 "f_startswith('ERBE')" --vars FLUT --outputdir /lustre/atlas/scratch/sburrows/cli112/diags/mam3_aerocom_bbff/amwg --outputpost _ERBE --xml no

And here is the traceback and error messages:

Traceback (most recent call last):
  File "/lustre/atlas/world-shared/csc121/uvcdat/2.0.0/bin/cdscan", line 1660, in <module>
    main(sys.argv)
  File "/lustre/atlas/world-shared/csc121/uvcdat/2.0.0/bin/cdscan", line 1519, in main
    raise RuntimeError, "Variable '%s' is duplicated, and is a function of lat or lon: files %s, %s"%illegalvars[0]
RuntimeError: Variable 'FLUT' is duplicated, and is a function of lat or lon: files ERBE_02_climo.nc, ERBE_08_climo.nc
ERROR: cdscan terminated with 1
This is usually fatal. Frequent causes are an extra XML file in the dataset directory or non-CF compliant input files
Traceback (most recent call last):
  File "/ccs/home/sburrows/metrics/frontend/diags.py", line 543, in <module>
  File "/ccs/home/sburrows/metrics/frontend/diags.py", line 126, in run_diagnostics_from_options
    run_diagnostics_from_filetables( opts1, filetable1, filetable2 )
  File "/ccs/home/sburrows/metrics/frontend/diags.py", line 253, in run_diagnostics_from_filetables
    res = plot.compute(newgrid=-1) # newgrid=0 for original grid, -1 for coarse
  File "/ccs/home/sburrows/metrics/frontend/uvcdat.py", line 826, in compute
    return self.results(newgrid)
  File "/ccs/home/sburrows/metrics/frontend/uvcdat.py", line 828, in results
    return self._results(newgrid)
  File "/ccs/home/sburrows/metrics/packages/amwg/amwg.py", line 1626, in _results
    results = plot_spec._results(self, newgrid)
  File "/ccs/home/sburrows/metrics/frontend/uvcdat.py", line 843, in _results
    value = self.reduced_variables[v].reduce(None)
  File "/ccs/home/sburrows/metrics/computation/reductions.py", line 2137, in reduce
    filename = self.get_variable_file( self.variableid )
  File "/ccs/home/sburrows/metrics/computation/reductions.py", line 2106, in get_variable_file
    xml_name = run_cdscan( fam, famfiles, cache_path )
  File "/ccs/home/sburrows/metrics/computation/reductions.py", line 1939, in run_cdscan
    raise Exception("cdscan failed - %s" %cdscan_line)
Exception: cdscan failed - cdscan -q -x /tmp/ERBE_cse0b22eb6696d99cfbee6d88d961242d7.xml -e time.units="months since 1985-02-01 00:00:00" /lustre/atlas/world-shared/csc121/acme-dev-1-obs-data/ERBE_01_climo.nc /lustre/atlas/world-shared/csc121/acme-dev-1-obs-data/ERBE_02_climo.nc /lustre/atlas/world-shared/csc121/acme-dev-1-obs-data/ERBE_03_climo.nc /lustre/atlas/world-shared/csc121/acme-dev-1-obs-data/ERBE_04_climo.nc /lustre/atlas/world-shared/csc121/acme-dev-1-obs-data/ERBE_05_climo.nc /lustre/atlas/world-shared/csc121/acme-dev-1-obs-data/ERBE_06_climo.nc /lustre/atlas/world-shared/csc121/acme-dev-1-obs-data/ERBE_07_climo.nc /lustre/atlas/world-shared/csc121/acme-dev-1-obs-data/ERBE_08_climo.nc /lustre/atlas/world-shared/csc121/acme-dev-1-obs-data/ERBE_09_climo.nc /lustre/atlas/world-shared/csc121/acme-dev-1-obs-data/ERBE_10_climo.nc /lustre/atlas/world-shared/csc121/acme-dev-1-obs-data/ERBE_11_climo.nc /lustre/atlas/world-shared/csc121/acme-dev-1-obs-data/ERBE_12_climo.nc

painter1 commented 9 years ago

Yes, in both ERBE_02_climo.nc and ERBE_08_climo.nc, the time axis has the single value of time=25. That's not the only such duplication; it's just the first one encountered. It violates CF compliance, as well as plain self-consistency (the filenames say that the files are for different times, but the time variable says that they are for the same times.)
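The duplication described here can be checked for across a whole file set once the time-axis values have been read from each file. The sketch below is illustrative only: it takes a plain dictionary of filename to time values as input (reading the values from the netCDF files is left out), and the name `find_duplicate_times` is hypothetical, not a cdscan or uvcmetrics function.

```python
from collections import defaultdict


def find_duplicate_times(times_by_file):
    """Return {time value: [files]} for every time value found in 2+ files."""
    files_by_time = defaultdict(list)
    for filename in sorted(times_by_file):
        # set() so a value repeated inside one file is counted once per file
        for value in set(times_by_file[filename]):
            files_by_time[value].append(filename)
    return {value: names for value, names in files_by_time.items()
            if len(names) > 1}


# time=25 for ERBE_02 and ERBE_08 is the case reported in this thread;
# the value shown for ERBE_01 is made up for the example.
times = {
    "ERBE_01_climo.nc": [1.0],
    "ERBE_02_climo.nc": [25.0],
    "ERBE_08_climo.nc": [25.0],
}
print(find_duplicate_times(times))
```

Reporting every shared value at once, rather than stopping at the first, would show how widespread the problem is in a given obs set.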

We want to support that particular file set, so I'll think about how best to work around the problem. But it's not a bug in the diagnostics.

susburrows commented 9 years ago

Can we just fix this in the dataset, rather than adding another special case to the code?

painter1 commented 9 years ago

I wish we could. I hate to work around other people's bugs by messing up our new code with special-case workarounds. I will not do it when the bug is in code or a workflow which is currently being developed.

But I consider the obs files to be too established. Lots of people have copies and will use their own copies. I don't know whether new versions are still being produced; possibly not.

I've already written a lot of such garbage workaround code - all for "standard" obs data.

susburrows commented 9 years ago

I am also getting an error that appears to be related to non-CF-compliant time units in the CERES dataset in this command:

/ccs/home/sburrows/metrics/frontend/diags.py --path /lustre/atlas/scratch/sburrows/cli112/climos/mam3_aerocom_bbff --path2 /lustre/atlas/world-shared/csc121/acme-dev-1-obs-data --package AMWG --set 8 --seasons ANN --filter2 "f_startswith('CERES2')" --vars FLUT --outputdir /lustre/atlas/scratch/sburrows/cli112/diags/mam3_aerocom_bbff/amwg --outputpost _CERES2 --xml no

Error messages/traceback:

Traceback (most recent call last):
  File "/lustre/atlas/world-shared/csc121/uvcdat/2.0.0/bin/cdscan", line 1660, in <module>
    main(sys.argv)
  File "/lustre/atlas/world-shared/csc121/uvcdat/2.0.0/bin/cdscan", line 1188, in main
    startindex = timeindex(vartime[0], vartime.units, referenceTime, referenceDelta, calendar)
  File "/lustre/atlas/world-shared/csc121/uvcdat/2.0.0/bin/cdscan", line 262, in timeindex
    tval = cdtime.reltime(value, units)
Cdtime error: Invalid relative time units
ERROR: cdscan terminated with 1
This is usually fatal. Frequent causes are an extra XML file in the dataset directory or non-CF compliant input files

Traceback (most recent call last):
  File "/ccs/home/sburrows/metrics/frontend/diags.py", line 538, in <module>
    run_diagnostics_from_options(o)
  File "/ccs/home/sburrows/metrics/frontend/diags.py", line 126, in run_diagnostics_from_options
    run_diagnostics_from_filetables( opts1, filetable1, filetable2 )
  File "/ccs/home/sburrows/metrics/frontend/diags.py", line 253, in run_diagnostics_from_filetables
    res = plot.compute(newgrid=-1) # newgrid=0 for original grid, -1 for coarse
  File "/ccs/home/sburrows/metrics/frontend/uvcdat.py", line 826, in compute
    return self.results(newgrid)
  File "/ccs/home/sburrows/metrics/frontend/uvcdat.py", line 828, in results
    return self._results(newgrid)
  File "/ccs/home/sburrows/metrics/packages/amwg/amwg.py", line 1626, in _results
    results = plot_spec._results(self, newgrid)
  File "/ccs/home/sburrows/metrics/frontend/uvcdat.py", line 843, in _results
    value = self.reduced_variables[v].reduce(None)
  File "/ccs/home/sburrows/metrics/computation/reductions.py", line 2137, in reduce
    filename = self.get_variable_file( self.variableid )
  File "/ccs/home/sburrows/metrics/computation/reductions.py", line 2106, in get_variable_file
    xml_name = run_cdscan( fam, famfiles, cache_path )
  File "/ccs/home/sburrows/metrics/computation/reductions.py", line 1939, in run_cdscan
    raise Exception("cdscan failed - %s" %cdscan_line)
Exception: cdscan failed - cdscan -q -x /tmp/CERES2_cs3112501643c8b16d8dd23a1fe0086628.xml /lustre/atlas/world-shared/csc121/acme-dev-1-obs-data/CERES2_01_climo.nc /lustre/atlas/world-shared/csc121/acme-dev-1-obs-data/CERES2_02_climo.nc /lustre/atlas/world-shared/csc121/acme-dev-1-obs-data/CERES2_03_climo.nc /lustre/atlas/world-shared/csc121/acme-dev-1-obs-data/CERES2_04_climo.nc /lustre/atlas/world-shared/csc121/acme-dev-1-obs-data/CERES2_05_climo.nc /lustre/atlas/world-shared/csc121/acme-dev-1-obs-data/CERES2_06_climo.nc /lustre/atlas/world-shared/csc121/acme-dev-1-obs-data/CERES2_07_climo.nc /lustre/atlas/world-shared/csc121/acme-dev-1-obs-data/CERES2_08_climo.nc /lustre/atlas/world-shared/csc121/acme-dev-1-obs-data/CERES2_09_climo.nc /lustre/atlas/world-shared/csc121/acme-dev-1-obs-data/CERES2_10_climo.nc /lustre/atlas/world-shared/csc121/acme-dev-1-obs-data/CERES2_11_climo.nc /lustre/atlas/world-shared/csc121/acme-dev-1-obs-data/CERES2_12_climo.nc

painter1 commented 9 years ago

Yes, this is a crazy data problem again. This time it's about units which make perfect sense to us humans, but not to udunits or any other automated system.

Some time ago I had put in a special-case check for CERES time units, which are "month: 1=Jan, ..., 12=Dec". It looks like I'll need to add one for CERES2 time units, which are "1=Jan, 2=Feb,..., 12=Dec, 13=Ann, 14=DJF,..., 17=SON".
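To make the CERES2 encoding concrete, here is a small sketch of how a time value under those units could be decoded into a month or season label. It is not the special-case check that went into the diagnostics. The function name `decode_ceres2_time` is hypothetical, and the entries 15=MAM and 16=JJA are an assumption filling the elided part of the units string.

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# 13=Ann, 14=DJF and 17=SON are quoted in the units string;
# 15=MAM and 16=JJA are assumed from the usual season order.
EXTRAS = {13: "Ann", 14: "DJF", 15: "MAM", 16: "JJA", 17: "SON"}


def decode_ceres2_time(value):
    """Translate a CERES2-style time value (1..17) into a month/season label."""
    index = int(round(value))
    if 1 <= index <= 12:
        return MONTHS[index - 1]
    if index in EXTRAS:
        return EXTRAS[index]
    raise ValueError("unexpected CERES2 time value: %r" % (value,))


for value in (1.0, 12.0, 13.0, 17.0):
    print(value, decode_ceres2_time(value))
```

A label like this is what a human reads from the units; an automated tool still needs a CF-style "units since reference date" string, which is why a workaround in the code is required.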

susburrows commented 9 years ago

I am closing this issue, since it has been resolved. The problem was caused by testing with a data directory that had some missing and/or corrupted obs files.

susburrows commented 9 years ago

I'm reopening this since the problems with the CERES and ERBE datasets still exist.

susburrows commented 8 years ago

I think this is fixed now.