NOAA-OCM / QNSPECT

QGIS Plugin for NOAA Nonpoint Source Pollution and Erosion Comparison Tool (NSPECT)
GNU General Public License v2.0

Division by zero error in comparison tools #100

Closed ar-siddiqui closed 2 years ago

ar-siddiqui commented 2 years ago

Latest QNSPECT version

Similar issues do not exist

What is the bug or the crash?

Division by zero should be gracefully handled in comparison as well.

A strategy is needed to address division by zero for both direct and percent comparison. @DaveEslinger how should we handle them?

Steps to reproduce the issue

Run comparison tool.

Screenshots and Attachments

No response

Versions

3.22.4

Additional context

No response

DaveEslinger commented 2 years ago

Is there a reason not to implement the same numpy approach used for concentrations? That seems a good approach.

ar-siddiqui commented 2 years ago

> Is there a reason not to implement the same numpy approach used for concentrations? That seems a good approach.

If we use the same approach of assigning 0 to division-by-zero cells, we will run into the following issue:

Let's say we have rasters A = [[0,2]] and B = [[1,2]].

When we calculate the percent difference with expression="100 * ((A - B) / A)",

we would end up with the resulting raster [[0, 0]], which is not correct: the first cell differs between A and B, yet it is reported as 0.
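The example above can be reproduced with a minimal NumPy sketch. It assumes the "assign 0" behaviour is implemented with `np.divide` and its `out`/`where` arguments; the thread does not show how QNSPECT actually does it for concentrations, so treat this as an illustration only.

```python
import numpy as np

A = np.array([[0.0, 2.0]])
B = np.array([[1.0, 2.0]])

# Assumption: division-by-zero cells are filled with 0.
# np.divide only computes where A != 0; other cells keep the value from `out`.
ratio = np.divide(A - B, A, out=np.zeros_like(A), where=A != 0)
percent_diff = 100 * ratio

print(percent_diff)  # both cells are 0, although A and B differ in the first cell
```

The first cell is indistinguishable from the second, even though only the second one is a true "no difference" cell.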

DaveEslinger commented 2 years ago

Correct, that should give [[NODATA, 0]], I think. That might also be what we want for concentration. "0" implies there is runoff, but no pollutant, when in reality, there is no runoff at all. Therefore, when accumulated runoff = 0, we probably want to return NODATA.
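A minimal sketch of the [[NODATA, 0]] result, again in NumPy. The sentinel `-9999.0` is a hypothetical NODATA value chosen for illustration; the thread does not say which value QNSPECT uses.

```python
import numpy as np

NODATA = -9999.0  # hypothetical sentinel, not taken from QNSPECT

A = np.array([[0.0, 2.0]])
B = np.array([[1.0, 2.0]])

# Start with a raster full of NODATA, then overwrite only the cells
# where the denominator is non-zero.
percent_diff = np.full(A.shape, NODATA)
np.divide(100 * (A - B), A, out=percent_diff, where=A != 0)

print(percent_diff)  # first cell stays NODATA, second cell is 0
```

With this approach a cell with no valid denominator is marked as missing instead of being reported as "no difference".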

ar-siddiqui commented 2 years ago

> Correct, that should give [[NODATA, 0]], I think.

This will leave holes in the resulting raster. I think a better approach would be [[inf, 0]]; I can see if there are ways to specify that. I have seen this before in QGIS.

> That might also be what we want for concentration. "0" implies there is runoff, but no pollutant, when in reality, there is no runoff at all. Therefore, when accumulated runoff = 0, we probably want to return NODATA.

Actually, this is what was happening before, but it led to big holes in the middle of rasters, which did not look good. One can argue that when there is no runoff, there is no pollutant either, hence 0 is the correct value for those cells. In fact, if there is no runoff there will always be 0 pollutant as well, by design of the code.

DaveEslinger commented 2 years ago

If we go with the [[inf, 0]] option, do you know what that does to the color ramps? Do they try to include that high value? That was the advantage of the nodata approach we used in OpenNSPECT. Yes, there were gaps in data sets, but those gaps were meaningful and folks could interpret them.
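To make the color-ramp concern concrete, here is a small NumPy sketch of what an infinite cell does to the raster's value range. It only shows the array statistics; how QGIS builds its legend from them is not shown here. Note that for the example rasters used in this thread the unhandled result is actually -inf, since (0 - 1) / 0 is negative.

```python
import numpy as np

A = np.array([[0.0, 2.0]])
B = np.array([[1.0, 2.0]])

# Let NumPy produce the infinite value, silencing its warning for this demo.
with np.errstate(divide="ignore", invalid="ignore"):
    percent_diff = 100 * ((A - B) / A)

raw_min = percent_diff.min()  # dominated by the infinite cell
finite_min = percent_diff[np.isfinite(percent_diff)].min()  # range of real data

print(percent_diff, raw_min, finite_min)
```

Any min/max stretch computed from the raw values would be pulled to infinity, whereas NODATA cells are normally excluded from such statistics.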

ar-siddiqui commented 2 years ago

> If we go with the [[inf, 0]] option, do you know what that does to the color ramps? Do they try to include that high value? That was the advantage of the nodata approach we used in OpenNSPECT. Yes, there were gaps in data sets, but those gaps were meaningful and folks could interpret them.

I will look into it and get back to you.

ar-siddiqui commented 2 years ago

The QGIS legend would include inf, so that option is definitely out. Currently, the tool raises runtime warnings on division by zero. This is actually good if we want to return no data for division by zero: at least this way, users learn that they have 0 values in Raster B.

Without the warning, it is possible to assume that there is no difference between A and B, when in fact there is a difference and B has 0 values for cells with a difference. With the warning, users get some sort of notice. Closing as there is nothing else that can be opened. Reopen if necessary.
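The warning behaviour described above can be sketched in plain NumPy and the standard `warnings` module. The user-facing message and the variable names are hypothetical; this is not the QNSPECT implementation, only an illustration of how a division-by-zero `RuntimeWarning` can be detected and turned into a notice.

```python
import warnings

import numpy as np

A = np.array([[0.0, 2.0]])
B = np.array([[1.0, 2.0]])

# Record warnings instead of letting them print, so they can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    percent_diff = 100 * ((A - B) / A)

# Keep only NumPy's division-by-zero runtime warnings.
zero_division_warnings = [
    w for w in caught
    if issubclass(w.category, RuntimeWarning) and "divide by zero" in str(w.message)
]

n_zero = int(np.count_nonzero(A == 0))
if zero_division_warnings:
    # Hypothetical notice text; a plugin would route this to its own log or feedback.
    print(f"Warning: {n_zero} cell(s) in the denominator raster are 0.")
```

Reporting the count of zero cells alongside the warning gives users a hint about how much of the output is affected.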