ioos / compliance-checker

Python tool to check your datasets against compliance standards
http://ioos.github.io/compliance-checker/
Apache License 2.0

CF: Greedy exception catching masks check_missing_data failure #20

oychang closed 10 years ago

oychang commented 10 years ago

The test for valid_coordinate_attribute will always fail because a numpy error raises a ValueError, which the try-except block in cf.py catches and interprets as a test failure. Without the try-except block, we get the following traceback:

Running Compliance Checker on the dataset from: ../datasets/netcdf/sss_rc201401.v3.0cap.nc
Traceback (most recent call last):
  File "cchecker.py", line 24, in <module>
    main()
  File "cchecker.py", line 21, in main
    args.criteria)
  File "/Users/ochang/podaac/compliance-checker/compliance_checker/runner.py", line 39, in run_checker
    score_groups = cs.run(ds, *checker_names)
  File "/Users/ochang/podaac/compliance-checker/compliance_checker/suite.py", line 87, in run
    vals = list(itertools.chain.from_iterable(map(lambda c: self._run_check(c, dsp), checks)))
  File "/Users/ochang/podaac/compliance-checker/compliance_checker/suite.py", line 87, in <lambda>
    vals = list(itertools.chain.from_iterable(map(lambda c: self._run_check(c, dsp), checks)))
  File "/Users/ochang/podaac/compliance-checker/compliance_checker/suite.py", line 31, in _run_check
    val = check_method(ds)
  File "/Users/ochang/podaac/compliance-checker/compliance_checker/cf/cf.py", line 2955, in check_missing_data
    indices = [i for i, x in enumerate(var[:]) if (var._FillValue == x or '--' == x or 'nan' == x or 'NaN' == x)]
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
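
To illustrate the masking effect described above, here is a minimal sketch, not the code in cf.py. The function names `find_fill_indices`, `greedy_check` and `narrow_check` are hypothetical, and plain numpy arrays stand in for the netCDF variable. It shows how a blanket `except Exception` turns this ValueError into an ordinary "check failed" result, while letting the exception propagate exposes the bug.

```python
import numpy as np


def find_fill_indices(values, fill_value):
    # Same pattern as the comprehension in the traceback: iterate over the
    # first axis and use the comparison result directly as a condition.
    return [i for i, x in enumerate(values) if fill_value == x]


def greedy_check(values, fill_value):
    # Hypothetical check: any exception at all is reported as a failed test.
    try:
        find_fill_indices(values, fill_value)
        return True
    except Exception:
        return False


def narrow_check(values, fill_value):
    # Hypothetical check: programming errors are allowed to propagate.
    find_fill_indices(values, fill_value)
    return True


# 1-D input: each x is a scalar, so the comparison is a single boolean.
one_d = np.array([1.0, -999.0, 3.0])
# 2-D input: each x is a whole row, so the comparison is a boolean array.
two_d = np.array([[1.0, -999.0], [2.0, 3.0]])

print(greedy_check(one_d, -999.0))
print(greedy_check(two_d, -999.0))
```

With the 2-D input, `greedy_check` reports a failure that looks like a property of the dataset, whereas `narrow_check(two_d, -999.0)` raises the same ValueError shown in the traceback.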

Versions

Python 2.7.5, numpy 1.8.1, Mac OSX 10.9.3

daf commented 10 years ago

Maybe this depends on your file. I can't reproduce it: valid_coordinate_attribute always passes for me. I tried it on numpy 1.8.0 and 1.8.1. I'm on Python 2.7.6, though, so let me see if I can downgrade and reproduce there.

daf commented 10 years ago

Can't reproduce on 2.7.5 either. Can you send/strip out a dataset that exhibits this behavior? Do you see it on everything (try with one of the files in test-data)?

oychang commented 10 years ago

I sent sss_rc201401.v3.0cap.nc to DFoster at asascience.com. I should mention that this is not the original dataset, but one that has had the coordinates attribute manually added with netCDF4-python. I'm still getting the hang of the CF standards with regard to this, so this may well be the source of the problem.

The only dataset in test-data that triggered the same for loop for me was ru07.... But it did not trigger the same exception. Breaking apart the list comprehension on line 2955, it looks like type(x) == numpy.ma.core.MaskedConstant for ru07, but type(x) == numpy.ma.core.MaskedArray for sss_rc201401.v3.0cap.nc.
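
A small sketch of that type difference, using made-up masked arrays rather than the actual datasets (the names `flat`, `grid` and `truth_value_is_ambiguous` are only for illustration). Iterating over a 1-D masked array yields single elements, and a masked element comes back as the MaskedConstant. Iterating over a 2-D masked array yields whole rows as MaskedArray objects, and comparing a row to a fill value produces an array that cannot be used as an `if` condition.

```python
import numpy as np

# 1-D masked array: iteration yields individual elements.
flat = np.ma.masked_array([1.0, 2.0, 3.0], mask=[True, False, False])
# 2-D masked array: iteration yields one row at a time.
grid = np.ma.masked_array([[1.0, 2.0], [3.0, 4.0]],
                          mask=[[True, False], [False, False]])

first_flat = next(iter(flat))
first_grid = next(iter(grid))

print(type(first_flat))
print(type(first_grid))


def truth_value_is_ambiguous(element, fill_value=-999.0):
    # Report whether using `fill_value == element` as a condition would
    # raise the ValueError from the traceback.
    try:
        bool(fill_value == element)
        return False
    except ValueError:
        return True


print(truth_value_is_ambiguous(first_flat))
print(truth_value_is_ambiguous(first_grid))
```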

daf commented 10 years ago

Running on your dataset (Python 2.7.6, numpy 1.8.1, OSX 10.9.3), I get this, which seems a legitimate failure (excuse the Python literal output, we need to fix that):

var                                    :3:    11/14 :
    sss_cap                            :3:     6/ 9 :
        lat_lon_correct                :3:     0/ 1 :
        valid_coordinate_attibute      :2:     0/ 2 : [u'The coordinate
                                                      attribute is improperly
                                                      defined with the
                                                      coordinate idlat , which
                                                      does not exist', u'The
                                                      coordinate attribute is
                                                      improperly defined with
                                                      the coordinate idlon ,
                                                      which does not exist']

I'll try removing that attribute you added and see if that indeed causes the CC to trip in an unexpected way.

oychang commented 10 years ago

That's the message I was getting before removing the try-except block to inspect the Exception. I was trying to understand why idlat and idlon are not valid coordinates since they are variables that satisfy the requirements to be latitude and longitude, respectively.

daf commented 10 years ago

Ok, poking at what the code is trying to do, I don't think it's handling dimensionality properly. @DanielJMaher, can you poke?
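
One way to make a missing-data search independent of dimensionality is to build a boolean mask over the whole array instead of looping over the first axis. This is only a sketch under assumptions, not the project's actual fix: the helper name `missing_data_indices` is hypothetical, and it treats masked entries, entries equal to the fill value, and NaNs as missing.

```python
import numpy as np


def missing_data_indices(values, fill_value=None):
    # Hypothetical helper: return the index tuple of every missing entry,
    # for input of any number of dimensions.
    data = np.ma.asanyarray(values)
    raw = np.ma.getdata(data)
    # Start from the existing mask (copied so the input is not modified).
    missing = np.ma.getmaskarray(data).copy()
    if fill_value is not None:
        # Elementwise comparison over the whole array, no Python-level `if`.
        missing |= (raw == fill_value)
    if np.issubdtype(raw.dtype, np.floating):
        missing |= np.isnan(raw)
    return [tuple(int(i) for i in idx) for idx in np.argwhere(missing)]


grid = np.ma.masked_array([[1.0, -999.0], [np.nan, 4.0]],
                          mask=[[True, False], [False, False]])
print(missing_data_indices(grid, fill_value=-999.0))
```

Because the comparison stays elementwise and is never evaluated as a single truth value, the same function handles 1-D and multi-dimensional variables without raising the ValueError.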

DanielJMaher commented 10 years ago

I'll poke.