qgis / QGIS

QGIS is a free, open source, cross platform (lin/win/mac) geographical information system (GIS)
https://qgis.org
GNU General Public License v2.0
10.53k stars 2.99k forks source link

add grid values to points gives wrong results #35687

Closed ecodiv closed 2 years ago

ecodiv commented 4 years ago

Describe the bug

I used from the processing toolbox the 'add grid values to points' function with as resampling option 'nearest neighbor', and with as input a point layer and a raster layer with integer values. The resulting column with raster values shows decimal numbers, not the exact integer raster value. When I run the function in SAGA directly, the column contains the exact raster values.

QGIS and OS versions

3.12.1 on Windows (installed with Osgeo4W

Additional context

Log of function


C:\OSGEO4~1\bin>call saga_cmd shapes_grid "Add Grid Values to Points" -SHAPES "C:/Users/brp/AppData/Local/Temp/processing_wjqTfe/6601327d6d9a43199caeead18a394363/SHAPES.shp" -GRIDS "C:/Users/brp/AppData/Local/Temp/processing_wjqTfe/6207aad29fad45aa94554d79463c8593/bio12.sgrd" -RESAMPLING 0 -RESULT "C:/Users/brp/AppData/Local/Temp/processing_wjqTfe/306d704010524140aed1ef0e445fc93a/RESULT.shp"


SAGA Version: 2.3.2 (64 bit)


library path: C:\OSGEO4~1\apps\saga-ltr\modules\ library name: shapes_grid library : Grid Tools tool : Add Grid Values to Points author : O.Conrad (c) 2003 processors : 8 [8]


Load shapes: C:/Users/brp/AppData/Local/Temp/processing_wjqTfe/6601327d6d9a43199caeead18a394363/SHAPES.shp...

Load grid: C:/Users/brp/AppData/Local/Temp/processing_wjqTfe/6207aad29fad45aa94554d79463c8593/bio12.sgrd...

Parameters Points: SHAPES Grids: 1 object (bio12) Result: Result Resampling: Nearest Neighbour

gioman commented 4 years ago

@ecodiv QGIS just acts as a GUI for SAGA, it does not add anything other than running SAGA commands, so if you a difference when running a SAGA tool in QGIS and native SAGA there must be a difference in the commands that have been created.

Can you post the exact commands that you are running in native SAGA and compare with the ones in QGIS logs? Thanks.

ecodiv commented 4 years ago

@gioman Thanks for looking at this. I am probably overlooking something, but from what I can see below, it looks like in both cases the same command is issued?

Saga

[2020-04-10/21:05:33] Executing tool: Add Grid Values to Points Parameters Points: testpoints Grids: 1 object (testraster) Result: Result Resampling: Nearest Neighbour

QGIS

tool : Add Grid Values to Points Parameters Points: SHAPES Grids: 1 object (testraster) Result: Result Resampling: Nearest Neighbour

See attached image for attribute tables of point file generated by SAGA and by QGIS image

gioman commented 4 years ago

I might well overlook something, but from what I can see below, it looks like in both cases the same command is issued

this are not the commands.

In QGIS log you have too look for something like:

io_gdal 0 -TRANSFORM 1 -RESAMPLING 3 -GRIDS "/tmp/processing_18008c2c8b8943dab2b94121ed7d27f1/processing_54353a81f7f348c6868268c5c034a9ed/a0b305afd548494f803370edb5ab5381/madeiramdt100.sgrd" -FILES "/home/giovanni/Downloads/madeira_mdt_100.tif"

shapes_grid "Add Grid Values to Points" -SHAPES "/tmp/processing_18008c2c8b8943dab2b94121ed7d27f1/processing_54353a81f7f348c6868268c5c034a9ed/dc4f4200634e4a5fb6b4a6cc9b78aa4c/SHAPES.shp" -GRIDS "/tmp/processing_18008c2c8b8943dab2b94121ed7d27f1/processing_54353a81f7f348c6868268c5c034a9ed/a0b305afd548494f803370edb5ab5381/madeiramdt100.sgrd" -RESAMPLING 0 -RESULT "/home/giovanni/Desktop/grid_values_to_points.shp"

If you run the same commands from the SAGA command line and you get get the same indentical results as in QGIS then is the prove that QGIS is doing nothing wrong and that what you observe depends on some difference (in "io_gdal" and/or "Add Grid Values to Points") that the SAGA gui introduces.

ecodiv commented 4 years ago

Ah, yes. See below the saga commands. When running these on the SAGA command line, I am getting the identical results as in QGIS. The problem lies with the first step, the export with io_gdal of the tiff file as sgrd raster. The output is a raster with interpolated values, rather than the original values. I guess this is because RESAMPLING option is set to 3 (should be 0 to use the nearest neighbor resampling).

Export to sgrd with io_gdal

io_gdal 0 -TRANSFORM 1 -RESAMPLING 3 -GRIDS "C:/Users/brp/AppData/Local/Temp/processing_nRuNRn/2860270b415b48fc88efd3c3cbbddbbe/testraster.sgrd" -FILES "C:/Users/brp/Desktop/testraster.tif"

Function 'add grid values to points'

saga_cmd.exe shapes_grid "Add Grid Values to Points" -SHAPES "C:/Users/brp/Desktop/testpoints.shp" -GRIDS "C:/Users/brp/AppData/Local/Temp/processing_nRuNRn/2860270b415b48fc88efd3c3cbbddbbe/testraster.sgrd" -RESAMPLING 0 -RESULT "C:/Users/brp/Desktop/RESULT.shp"

gioman commented 4 years ago

The problem lies with the first step, the export

@ecodiv do you mean "import"?

The output is a raster with interpolated values, rather than the original values. I guess this is because RESAMPLING option is set to 3 (should be 0 to use the nearest neighbor resampling).

I feel that we already have a similar discussion about this matter. @PedroVenancio @alexbruy do you remember why we choose to import rasters in the SAGA format using B-Spline rather than Nearest Neighbour?

alexbruy commented 4 years ago

If I'm not wrong, this depends on SAGA version, at some point they have changed default interpolation method and we adjusted accordingly.

gioman commented 4 years ago

If I'm not wrong, this depends on SAGA version, at some point they have changed default interpolation method and we adjusted accordingly.

@alexbruy thanks. You are right, if no resampling is specified it uses "Resampling: B-Spline Interpolation".

@ecodiv does this answer your question?

ecodiv commented 4 years ago

@gioman If I understand correctly, the use of io_gdal is an intermediate step, needed to export the input geotif to a sgrd raster, the format used by SAGA. Next, the function Add Grid Values to Points uses this sgrd raster as input.

And that is where the problem lies I guess. In QGIS one can set the option 'resampling' to nearest neighbour when using the Add Grid Values to Points. But QGIS is actually running two functions in the background, io_gdal and next add grid values to points.

I would expect that when running add grid values to points the same resampling method is used for the intermediate step (io_gdal) as is used for the final step (add grid values to points). Instead, the user-defined resampling method is only used for the second step, but ignored, it seems, when running the intermediate step (io_gdal).

Edit: when I think about it a bit more, I am wondering if there is any reason to use something else than nearest neighbour for this intermediate step. This intermediate step is just because SAGA cannot work directly with geotifs. So in most cases you would just like the sgrd raster to have exactly the same values (and properties) as the geotif, irrespective of what for resampling method one would want to use for the actual add grid values to points function.

gioman commented 4 years ago

@ecodiv not sure if I follow you.

It is simple, for every SAGA tool that has a raster input we must do one io_gdal operation: this is done transparently in QGIS/Processing. And as stated above we use the default SAGA resampling method. This could be eventually be parametrized with a Processing/SAGA option (or maybe at a specific tool level), but that would be a feature request.

ecodiv commented 4 years ago

This is perhaps done transparently from a coding perspective, but from a (native / regular) user perspective less so. My point is, there is no real reason why a user would expect that the values that are written to the point file are in fact different from the values of the input raster he/she defined.

Perhaps logical in hindsight of you look at the logfile. However, from a user perspective, you expect that if you give input values to a function, that the function uses exactly those values, and not first slightly changes those values before using them (just like you would not expect that if you use a table with values as input for a regression analysis in R or Excel, the values in the table are first slightly changes before being used as input in the regression).

I do not mean to be critical or pedantic, and I do appreciate the complexities of coding this. I just hope to clarify why this behaviour surprised me (and some of my colleagues I asked before), and why from a user's point of view, I would consider this to be a bug. But independent whether you call this a bug or feature request, I really hope this can be parametrized / reconsidered. Should I open a feature request for that? And if so, should this be for this specific function? I guess this is relevant for e.g., 'add raster values to feature' as well.

nyalldawson commented 4 years ago

I propose that we flag this saga algorithm with the "known issues" flag as a result, and suggest to users to use the native point sampling algorithm instead. That is faster, better supported on qgis, and has no known issues. Sound ok?

gioman commented 4 years ago

I would prefer document that in QGIS/SAGA we use B-Spline as default resampling method and have a feature request filed to have a new parameter for SAGA to allow chose the resampling method (eventually someone could be interested in having this option implemented).

nyalldawson commented 4 years ago

@gioman let's do both then. Document the issue, flag it as "known issue" ( because right now, it does have serious issues which affect quality of analysis), and then revert this tag if/when someone fixes the underlying issue.

gioman commented 4 years ago

@nyalldawson ok, but this does not affects only this tool, likely. I wonder if we should not just change the resampling we use in io_gdal to Nearest Neighbor. I'm pretty sure we had this discussion in Redmine ticket. Maybe @PedroVenancio has more insights. Let's wait his feedback before taking any action.

roya0045 commented 4 years ago

@nyalldawson does the kde alg has this tag out of curiosity?

nyalldawson commented 4 years ago

No -- we only use it for 3rd party algs were we cannot change the results. For others we should fix the problem itself. I'll look in kde next bug fixing round

roya0045 commented 4 years ago

@nyalldawson ah right, I wasn't aware of that politic. Though I don't think I ever noticed the tag in the processing tab, would it be a good idea to eventually add some icon?

nyalldawson commented 4 years ago

They are hidden by default. And if you enable display of them (via options), they get highlighted in red

PedroVenancio commented 4 years ago

Hi all,

This kind of issues in SAGA is somehow recurrent.

Please see this one I opened some time ago, with exact same issue, but with Raster Calculator: https://github.com/qgis/QGIS/issues/30193

So, as @gioman said, it's not just a specific algorithm problem.

And as @alexbruy said, the resampling method was changed from Nearest Neighbour to B-Spline because it was/is the defaut in SAGA 2.3.2, that we still use in QGIS: https://github.com/qgis/QGIS/pull/4743 www.saga-gis.org/saga_tool_doc/2.3.0/io_gdal_0.html

In recent versions, SAGA changed again the defaut to Nearest Neighbour: http://www.saga-gis.org/saga_tool_doc/7.6.2/io_gdal_0.html

In QGIS 2.18 this does not happened, because we used the resampling method as a variable: https://github.com/qgis/QGIS/blob/release-2_18/python/plugins/processing/algs/saga/SagaAlgorithm.py#L348

Ideally, as @ecodiv said, the Resampling Method of io_gdal should match the Resampling Method selected by the user in the algorithm, as it was in QGIS 2. It is not intuitive to have the possibility to choose the Resampling Method in the algorithm window, and the B-Spline be used in background just for input one single raster. When the algorithm does not allow the user to choose the Resampling Method, then we should use Nearest Neighbour. Everything like in QGIS 2.

Or in alternative, introduce the possibility to choose the default SAGA Resampling Method in SAGA Processing Options.

Until there is a more robust solution, I think the Nearest Neighbour Resampling Method should again be used by default, as in the most recent SAGA versions.

So this issue is related / duplicated to https://github.com/qgis/QGIS/issues/30193

ecodiv commented 4 years ago

As @PedroVenancio also suggests, when you have one single raster as input, there is no reason to resampling the layer. With multiple raster layers it makes sense, and this is mentioned in the QGIS documentation as Pedro already highlighted in #30193. And then it ideally should be possible for the user to select the resampling method (as a common setting in the processing option, or on a function base). I agree with @PedroVenancio and @gioman that as an intermediate solution, the nearest neighbour resampling method should be used by default for on-the-fly / in-the-background resampling.

Till it is possible to set the default to nearest neighbour I think the suggestion by @nyalldawson is a good one. This is in fact what I told my students; be aware there is an issue, and for now better use the QGIS native point sampling algorithm instead (the reason I wanted to use the SAGA algorithm is that it has some clear advantages, such as sampling multiple rasters in once, and how it handles column names, but that is maybe something for a feature request :-)).

Thanks all for taking this up, impressive again how quick and inclusive these kind of things are discussed.

gioman commented 4 years ago

Default resampling changed here: https://github.com/qgis/QGIS/pull/35726

alexbruy commented 4 years ago

See also #34385.

github-actions[bot] commented 3 years ago

The QGIS project highly values your report and would love to see it addressed. However, this issue has been left in feedback mode for the last 14 days and is being automatically marked as "stale". If you would like to continue with this issue, please provide any missing information or answer any open questions. If you could resolve the issue yourself meanwhile, please leave a note for future readers with the same problem and close the issue. In case you should have any uncertainty, please leave a comment and we will be happy to help you proceed with this issue. If there is no further activity on this issue, it will be closed in a week.

Pedro-Murteira commented 2 years ago

@ecodiv Hello, is this issue still valid on recent releases?

github-actions[bot] commented 2 years ago

The QGIS project highly values your report and would love to see it addressed. However, this issue has been left in feedback mode for the last 14 days and is being automatically marked as "stale". If you would like to continue with this issue, please provide any missing information or answer any open questions. If you could resolve the issue yourself meanwhile, please leave a note for future readers with the same problem and close the issue. In case you should have any uncertainty, please leave a comment and we will be happy to help you proceed with this issue. If there is no further activity on this issue, it will be closed in a week.

github-actions[bot] commented 2 years ago

While we hate to see this happen, this issue has been automatically closed because it has not had any activity in the last 42 days despite being marked as feedback. If this issue should be reconsidered, please follow the guidelines in the previous comment and reopen this issue. Or, if you have any further questions, there are also further support channels that can help you.