ioos / ioos_qc

IOOS QARTOD and other Quality Control tests implemented in Python
https://ioos.github.io/ioos_qc/
Apache License 2.0

Climatology Test #45

Open sagarkevinrajiv opened 3 years ago

sagarkevinrajiv commented 3 years ago

We are trying to run the climatology test for Sea Temperature data in our dataset using the following config and lines of code:

import numpy as np
from ioos_qc.config import QcConfig

# One climatology window per quarter; tspan is given in months (period = "month")
qc_config_M = {
    "qartod": {
        "climatology_test": {
            "config": [
                {
                    "vspan": [10, 13],
                    "tspan": [1, 3],
                    "period": "month",
                    "zspan": [0, 100]
                },
                {
                    "vspan": [11, 14],
                    "tspan": [4, 6],
                    "period": "month",
                    "zspan": [0, 100]
                },
                {
                    "vspan": [13, 17],
                    "tspan": [7, 9],
                    "period": "month",
                    "zspan": [0, 100]
                },
                {
                    "vspan": [12, 16],
                    "tspan": [10, 12],
                    "period": "month",
                    "zspan": [0, 100]
                }
            ]
        }
    }
}

# M holds our dataset (indexed by time, one column per variable)
qc_M = QcConfig(qc_config_M)
variable_name = 'SeaTemperature'
qc_resultsM = qc_M.run(
    inp=M[variable_name],
    tinp=M.index.values,
    zinp=np.zeros(len(M.index))  # all observations treated as surface (depth 0)
)

On running the code, we get the following index error:

IndexError                                Traceback (most recent call last)
<ipython-input-57-45a511067134> in <module>
     40 
     41 variable_name = 'SeaTemperature'
---> 42 qc_resultsM =  qc_M.run(
     43                 inp=M[variable_name],
     44                 tinp=M.index.values,

~\Anaconda3\lib\site-packages\ioos_qc\config.py in run(self, **passedkwargs)
     84 
     85                     testkwargs = { k: v for k, v in testkwargs.items() if k in valid_keywords }
---> 86                     results[modu][testname] = runfunc(**testkwargs)  # noqa
     87 
     88             if modu == 'qartod' and 'aggregate' in tests:

~\Anaconda3\lib\site-packages\ioos_qc\qartod.py in climatology_test(config, inp, tinp, zinp)
    454     zinp = zinp.flatten()
    455 
--> 456     flag_arr = config.check(tinp, inp, zinp)
    457     return flag_arr.reshape(original_shape)
    458 

~\Anaconda3\lib\site-packages\ioos_qc\qartod.py in check(self, tinp, inp, zinp)
    393             with np.errstate(invalid='ignore'):
    394                 flag_arr[(values_idx & fail_idx)] = QartodFlags.FAIL
--> 395                 flag_arr[(values_idx & ~fail_idx & suspect_idx)] = QartodFlags.SUSPECT
    396                 flag_arr[(values_idx & ~fail_idx & ~suspect_idx)] = QartodFlags.GOOD
    397 

~\Anaconda3\lib\site-packages\numpy\ma\core.py in __setitem__(self, indx, value)
   3378         if _mask is nomask:
   3379             # Set the data, then the mask
-> 3380             _data[indx] = dval
   3381             if mval is not nomask:
   3382                 _mask = self._mask = make_mask_none(self.shape, _dtype)

IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

We don't face the same issue if we pass hard-coded dates in the tspan parameter. But since the test needs to be run on timeseries data spanning multiple years, passing dates does not seem to be a viable option. Also, if we don't pass the period parameter, the error goes away, but the results show 'test not run' for all values.

Not sure how this issue can be resolved. Any help or suggestions would be much appreciated.
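
For readers following along, the behaviour the month-based config above is meant to express can be sketched in plain Python. This is only an illustration of the intent, not ioos_qc's implementation: the function name `monthly_climatology_flags` is hypothetical, the flag numbers are the standard QARTOD values written out by hand, and it assumes a value outside vspan is flagged suspect and a timestamp matching no window is left as "not evaluated". The zspan check is omitted for brevity.

```python
from datetime import datetime

# QARTOD flag values, hard-coded here for the sketch
GOOD, UNKNOWN, SUSPECT = 1, 2, 3


def monthly_climatology_flags(times, values, windows):
    """Flag each value against the vspan of the window whose month range contains it.

    Hypothetical helper for illustration; not part of ioos_qc.
    """
    flags = []
    for t, v in zip(times, values):
        flag = UNKNOWN  # no matching window: test not evaluated for this point
        for w in windows:
            first_month, last_month = w["tspan"]
            if first_month <= t.month <= last_month:
                low, high = w["vspan"]
                flag = GOOD if low <= v <= high else SUSPECT
                break
        flags.append(flag)
    return flags


windows = [
    {"tspan": [1, 3], "vspan": [10, 13]},
    {"tspan": [7, 9], "vspan": [13, 17]},
]
# Two different years share the same January-March window
times = [
    datetime(2018, 2, 1),
    datetime(2019, 2, 1),
    datetime(2019, 8, 15),
    datetime(2019, 5, 1),
]
values = [11.0, 20.0, 15.0, 12.0]

flags = monthly_climatology_flags(times, values, windows)
print(flags)  # [1, 3, 1, 2]
```

Because only the month of each timestamp is compared, one window covers the same season in every year, which is why the period parameter matters for a multi-year series.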

kwilcox commented 3 years ago

@sagarkevinrajiv thanks for the bug report, this does indeed look like an issue with ioos_qc. Could you reproduce this error with a subset of data and send me the input values for inp and tinp?

SagarKevin commented 3 years ago

I did a bit of digging myself. From the looks of it, the tests don't run if null values are present in the dataset, and we get the error mentioned above. Could you please advise on how to deal with this issue? Is there a way to run the tests on datasets that do contain null values? There must be something I am missing, as one of the flags is for 'missing data'.
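
One possible workaround, sketched here under assumptions rather than confirmed as the intended ioos_qc usage, is to run the test only on the non-null rows and fill the remaining positions with the QARTOD "missing" flag (9). The helper `run_on_valid` and the stand-in `dummy_test` are hypothetical names introduced for illustration; `dummy_test` simply returns "good" for every point so the example runs without ioos_qc installed.

```python
import numpy as np

MISSING = 9  # QARTOD flag value for missing data


def run_on_valid(values, times, depths, run_test):
    """Run `run_test` on non-NaN rows only and flag NaN rows as MISSING.

    Hypothetical helper; `run_test` is any callable taking (inp, tinp, zinp)
    and returning one flag per input value.
    """
    values = np.asarray(values, dtype=float)
    times = np.asarray(times)
    depths = np.asarray(depths, dtype=float)

    valid = ~np.isnan(values)
    flags = np.full(values.shape, MISSING, dtype="uint8")
    if valid.any():
        flags[valid] = run_test(values[valid], times[valid], depths[valid])
    return flags


def dummy_test(inp, tinp, zinp):
    # Stand-in for the real QC call: marks every evaluated point as GOOD (1)
    return np.ones(inp.shape, dtype="uint8")


values = [11.2, np.nan, 12.5, np.nan]
times = np.array(
    ["2019-01-01", "2019-01-02", "2019-01-03", "2019-01-04"],
    dtype="datetime64[D]",
)
depths = np.zeros(4)

flags = run_on_valid(values, times, depths, dummy_test)
print(flags.tolist())  # [1, 9, 1, 9]
```

With the setup from the original post, `run_test` could presumably wrap the `qc_M.run(inp=..., tinp=..., zinp=...)` call and return its `['qartod']['climatology_test']` entry, but that wiring is untested here and does not address the underlying bug.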