cloudsci / cloudmetrics

Toolkit for computing 15+ metrics characterising 2D cloud patterns

Supporting nans in input fields #83

Open martinjanssens opened 11 months ago

martinjanssens commented 11 months ago

We should support input fields with missing data. I'd argue that by default, when a NaN is encountered, cloudmetrics should exclude that pixel from the metric calculation as far as possible (treating it as neither cloud nor clear sky), while still returning a metric whenever that is a reasonable thing to do. Of course, the question is what counts as reasonable.

We need to do this because most (but not all) metrics currently break upon encountering a NaN in an input field. If all of them broke, that would at least be consistent, but right now it's really easy to zero-fill NaNs and just compute the metrics anyway. This will work for some metrics, but some will return crap, and some should not work at all. Choosing which metrics should handle NaNs, and how, is going to be a bit subjective of course, but we should at a minimum be explicit about what we support.
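To make the zero-filling concern concrete, here is a minimal sketch using plain NumPy rather than any existing cloudmetrics function. The helper name `cloud_fraction_ignoring_nans` is hypothetical, and cloud fraction is used only as a simple stand-in metric; it compares excluding NaN pixels against zero-filling them.

```python
import numpy as np


def cloud_fraction_ignoring_nans(field):
    """Fraction of valid (non-NaN) pixels that are cloudy (non-zero).

    Hypothetical helper for illustration; not part of cloudmetrics.
    """
    field = np.asarray(field, dtype=float)
    valid = ~np.isnan(field)
    if not valid.any():
        # Nothing to compute from: propagate the missing value.
        return float("nan")
    return float(np.count_nonzero(field[valid]) / valid.sum())


# 2x2 field with one missing pixel: two cloudy, one clear, one NaN.
field = np.array([[1.0, 0.0], [np.nan, 1.0]])

# Zero-filling silently turns the missing pixel into clear sky.
zero_filled = np.nan_to_num(field, nan=0.0)

masked_fraction = cloud_fraction_ignoring_nans(field)  # 2 of 3 valid pixels
zero_filled_fraction = float(zero_filled.mean())       # 2 of 4 pixels

print(masked_fraction, zero_filled_fraction)
```

For a simple metric like this, the two approaches merely give different numbers. For metrics that depend on spatial structure (object labelling, spectra), zero-filling may change the result in less obvious ways.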

I'd be in favour of the following, but am happy to debate:

I'm happy to have a go at this, if we largely agree :)

gmandorl commented 11 months ago

Hi @martinjanssens,

Thank you for bringing up this question. Providing a universal treatment for NaN values can indeed be challenging. I like your suggestions, and I agree with all of them.

Additionally, we might consider adding two input parameters to the metric calls: one parameter to indicate whether NaN values are allowed in the images, and another parameter to set the maximum fraction of NaNs
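A rough sketch of what such a pair of parameters could look like is below. The function name `check_nans`, the parameter names `allow_nans` and `max_nan_fraction`, and the default of 0.1 are all assumptions made for illustration; none of them exist in cloudmetrics.

```python
import numpy as np


def check_nans(field, allow_nans=False, max_nan_fraction=0.1):
    """Validate the NaN content of a 2D field and return its NaN fraction.

    Hypothetical helper: names and defaults are illustrative only.
    """
    field = np.asarray(field, dtype=float)
    if field.size == 0:
        raise ValueError("Input field is empty")

    nan_fraction = float(np.isnan(field).mean())
    if nan_fraction == 0.0:
        return nan_fraction

    if not allow_nans:
        raise ValueError(
            "Input field contains NaNs; pass allow_nans=True to exclude them"
        )
    if nan_fraction > max_nan_fraction:
        raise ValueError(
            f"NaN fraction {nan_fraction:.2f} exceeds "
            f"max_nan_fraction={max_nan_fraction:.2f}"
        )
    return nan_fraction


# One missing pixel out of four gives a NaN fraction of 0.25.
field = np.array([[1.0, 0.0], [np.nan, 1.0]])
accepted_fraction = check_nans(field, allow_nans=True, max_nan_fraction=0.5)
print(accepted_fraction)
```

A metric could call a check like this first and only then mask the NaN pixels, so that the behaviour stays explicit: strict by default, permissive only when the caller opts in.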