Improve FZ filter code - Githubissues

OpenSenseAction / pypwsqc

Python package for quality control (QC) of data from personal weather stations (PWS)

https://pypwsqc.readthedocs.io

BSD 3-Clause "New" or "Revised" License

Improve FZ filter code #11

Closed cchwala closed 2 months ago

cchwala commented 6 months ago

Some ideas for how to improve the FZ filter code:

- remove the for-loop
- discuss/understand this code: https://github.com/OpenSenseAction/pypwsqc/blob/12bdd8919ca9a23a7f5913a6784a7c0e2bf8aa67/src/pypwsqc/flagging.py#L57-L60
- add more test cases, or make sure the code is understood well enough that we know no edge cases remain uncovered

cchwala commented 5 months ago

Note that some part of the original algorithm is not correctly implemented, see #22. Hence this issue could be integrated into the work to resolve #22.

lepetersson commented 2 months ago

@cchwala

To be discussed for the FZ filter:

Initialization: ref_array and sensor_array are now initialized as arrays with 0/1 for dry/wet timesteps. This means that periods of NaN data become zero, and that those NaN periods are flagged as faulty zeros. That kind of makes sense, but an alternative would be to let those time stamps remain NaN?
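To make the two options concrete, here is a minimal sketch with made-up data. It assumes the 0/1 arrays come from a plain `> 0` comparison; the names `rain`, `wet_as_int` and `wet_keep_nan` are only for illustration and are not taken from pypwsqc.

```python
import numpy as np

# Made-up rainfall series with a gap (NaN) in the middle.
rain = np.array([0.0, 0.2, np.nan, np.nan, 0.0, 1.1])

# Option 1 (behaviour described above): NaN > 0 evaluates to False,
# so missing time steps end up as 0, i.e. they look "dry".
wet_as_int = (rain > 0).astype(int)

# Option 2 (alternative): keep NaN where the input is NaN, so missing
# time steps stay distinguishable from dry ones. Needs a float array.
wet_keep_nan = np.where(np.isnan(rain), np.nan, (rain > 0).astype(float))

print(wet_as_int)    # [0 1 0 0 0 1]
print(wet_keep_nan)  # [ 0.  1. nan nan  0.  1.]
```

With option 2 the flag array would also have to be a float array (or use a separate sentinel) so that NaN can be carried through.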

fz_array is initialized as an array of -1, and its entries are overwritten with 0 and 1 where applicable, without considering the number of stations reporting rainfall (which should be above the specified threshold n_stat, as is done here). This is mentioned in issue #22, which can be closed.
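One way the n_stat check could be applied is as a mask on top of the 0/1 flags. This is only a sketch with made-up data: it assumes that "reporting" means having a non-NaN value, and the names `neighbours`, `n_valid` and `fz_flag` are hypothetical, not the ones used in pypwsqc.

```python
import numpy as np

n_stat = 3  # hypothetical threshold for this example

# Made-up neighbour data: rows are time steps, columns are neighbouring stations.
neighbours = np.array([
    [0.0, 0.2, np.nan, 0.4],
    [np.nan, np.nan, 0.1, 0.3],
    [0.0, 0.0, 0.0, np.nan],
])

# Number of neighbours with a valid (non-NaN) value per time step.
n_valid = np.sum(~np.isnan(neighbours), axis=1)

# Stand-in for flags that the filter has already computed (0 or 1).
fz_flag = np.array([1, 1, 0])

# Overwrite with -1 wherever too few neighbours are available.
fz_flag = np.where(n_valid < n_stat, -1, fz_flag)

print(n_valid)  # [3 2 3]
print(fz_flag)  # [ 1 -1  0]
```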

lepetersson commented 2 months ago

@cchwala about the logic of the code:

I am not confident that the current implementation covers all possible cases. We should somehow check whether there are cases in which none of the statements in the loop are true.

Functionality of the FZ filter, from Lotte's paper: "All stations within a range (d) around a given station are selected to compute the median rainfall over the surrounding area. If fewer than n_stat neighboring stations with rainfall measurements are available, the median cannot be calculated and the FZ flag is set to −1. The FZ flag is set to 1 if this median rainfall is larger than zero for at least n_int time intervals while the station itself reports zero rainfall. The FZ flag remains 1 until the station reports nonzero rainfall."
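As a reference for discussion, the quoted description could be written out roughly as below. This is a sketch of my reading of the text, not the pypwsqc implementation and not the original code from the paper. The function name `fz_filter_sketch` and its arguments are hypothetical, and three points the quote leaves open are filled with assumptions: the n_int intervals are taken to be consecutive, a time step with too few neighbours gets -1 and resets the state, and a NaN at the station itself is treated like a nonzero report (flag 0, state reset).

```python
import numpy as np


def fz_filter_sketch(station, neighbours, n_int, n_stat):
    """Faulty-zero flags for one station (illustrative sketch only).

    station:    1-D array (time,) of rainfall at the station.
    neighbours: 2-D array (time, n_neighbours) of rainfall within range d.
    Returns an int array with -1 (too few neighbours), 0, or 1 (faulty zero).
    """
    station = np.asarray(station, dtype=float)
    neighbours = np.asarray(neighbours, dtype=float)
    n_time = station.shape[0]
    flags = np.zeros(n_time, dtype=int)

    # Time steps with at least n_stat valid (non-NaN) neighbours.
    enough = np.sum(~np.isnan(neighbours), axis=1) >= n_stat
    median = np.full(n_time, np.nan)
    if enough.any():
        median[enough] = np.nanmedian(neighbours[enough], axis=1)

    run = 0          # consecutive steps with median > 0 and station == 0
    flagged = False  # True while the faulty-zero flag is active
    for t in range(n_time):
        if not enough[t]:
            # Assumption: -1 takes precedence and resets the state.
            flags[t] = -1
            run, flagged = 0, False
            continue
        if station[t] > 0 or np.isnan(station[t]):
            # Nonzero report (or, by assumption, NaN) ends the flag.
            run, flagged = 0, False
            continue
        # From here on the station reports zero rainfall.
        if median[t] > 0:
            run += 1
        elif not flagged:
            run = 0
        if run >= n_int:
            flagged = True
        flags[t] = int(flagged)
    return flags


# Made-up example: 7 time steps, 3 neighbours.
station = np.array([0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.0])
neighbours = np.array([
    [0.0, 0.0, 0.0],
    [0.2, 0.4, 0.0],
    [0.2, 0.4, 0.1],
    [0.3, 0.1, 0.2],
    [0.0, 0.0, 0.0],
    [0.5, 0.4, 0.6],
    [np.nan, np.nan, 0.1],
])
example_flags = fz_filter_sketch(station, neighbours, n_int=3, n_stat=2)
print(example_flags)  # [ 0  0  0  1  1  0 -1]
```

In the example the flag switches on at the third consecutive wet-median step, stays on while the median drops back to zero, switches off when the station reports 0.5, and the last step gets -1 because only one neighbour has data. Writing the cases out like this might also help as a starting point for the additional test cases.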

I tried to disentangle the logic of the current code and arrived at the following (not sure if this is helpful or just confusing, and I don't know why it becomes an image lol)