GispoCoding / eis_toolkit

Python library for mineral prospectivity mapping
https://eis-he.eu/
European Union Public License 1.2

Modify vector processing tools #353

Closed. nmaarnio closed this issue 7 months ago

nmaarnio commented 8 months ago

Currently, some vector processing tools (distance_computation, vector_density) use a base/template raster to derive the transform and other metadata for the output raster, while other tools (idw_interpolation, kriging_interpolation) expect extent and pixel size inputs. rasterize_vector allows using either a base raster or a resolution, but does not accept a manual extent when a resolution is provided.
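For illustration, here is a minimal sketch of how the two input styles describe the same grid, using plain rasterio. The variable names and the example values are mine, not eis_toolkit API: the extent and pixel size expected by the interpolation tools can be derived from a base raster's profile, and vice versa.

```python
# Minimal sketch (not eis_toolkit code): the two input styles carry the same information.
from rasterio import transform
from rasterio.crs import CRS

# Style used by idw_interpolation / kriging_interpolation: extent + pixel size.
west, south, east, north = 0.0, 0.0, 1000.0, 500.0
pixel_size = 10.0

# Style used by distance_computation / vector_density: a base/template raster,
# represented here by its profile dictionary.
profile = {
    "crs": CRS.from_epsg(3067),
    "width": int((east - west) / pixel_size),
    "height": int((north - south) / pixel_size),
    "transform": transform.from_origin(west, north, pixel_size, pixel_size),
}

# Going back the other way: extent and pixel size can be recovered from the profile.
recovered_bounds = transform.array_bounds(
    profile["height"], profile["width"], profile["transform"]
)
recovered_pixel_size = profile["transform"].a  # cell width for a north-up raster
assert recovered_bounds == (west, south, east, north)
assert recovered_pixel_size == pixel_size
```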

I noticed this discrepancy recently while working on the proxy processing UIs in the QGIS plugin. All of the proxy processing workflows result in a raster, and most of them belong to the group of vector processing tools mentioned above. I think it would be good if all of these tools exposed the same set of output raster settings where it makes sense. Do any of you @nialov @msmiyels @em-t @lehtonenp have opinions or comments on this idea of harmonizing these tools with respect to setting the properties of the output raster?

I'm attaching screenshots of the two types of output raster setting definitions I have been designing in the plugin to give more context. Feel free to also suggest modifications to the UI if any come to mind.

Option 1: Base raster

[Screenshot: base raster output settings UI]

Option 2: Manual definition

[Screenshot: manual output settings UI]

nialov commented 8 months ago

For modelling purposes I would assume that all rasters should be (mostly?) equivalent in terms of extent, coordinate system and resolution. If the user needs to set up the profile manually for each function, there is room for error. I would always default to having the user provide a base raster from which the output profile can be extracted. If you want to implement a raster profile generator, as in your Option 2, that is up to you. It is more than likely that someone will be confused about why they need a raster to generate a raster, so I can understand the need for it.

The discrepancy between e.g. kriging and distance_computation in terms of inputs is quite minimal. It is mostly a matter of whether you should pass the raster_profile dictionary or the values extracted from that dictionary. You can see that in _kriging, when collecting out_meta, the transform has to be calculated explicitly for the output dictionary so that it matches a rasterio profile. If the input were already a raster profile, this calculation would not be needed. However, when using the raster profile as input, the values need to be extracted by key, and that extraction has to be repeated across functions. So explicitly passing just the values from the dictionary would be cleaner in that regard. Anyway, there are pros and cons to both approaches and I have no strong opinion one way or the other.
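A rough sketch of the trade-off described here, with illustrative names rather than the actual _kriging internals: one style has to compute the transform itself, the other repeats the same key extraction in every function.

```python
# Illustrative sketch (not eis_toolkit code) of the two out_meta construction styles.
from rasterio import transform

def out_meta_from_values(extent, pixel_size, crs):
    """Explicit-values style: the transform must be calculated here."""
    west, south, east, north = extent
    return {
        "crs": crs,
        "width": int((east - west) / pixel_size),
        "height": int((north - south) / pixel_size),
        "transform": transform.from_origin(west, north, pixel_size, pixel_size),
    }

def out_meta_from_profile(raster_profile):
    """Profile style: no calculation, but this key extraction repeats across functions."""
    return {key: raster_profile[key] for key in ("crs", "width", "height", "transform")}
```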

nmaarnio commented 8 months ago

Yes, it's true that rasters should have the same properties in modeling. If the user ends up with rasters on different grids, we have the unify_rasters tool they can run before modeling. I also agree that a base raster is a better approach than manually setting up the profile. I would still offer both, since I can see cases where people would like to use, for example, a larger pixel size for faster execution. The base raster option could be the default that shows up first.

Perhaps it would be better if the toolkit function parameters were the needed metadata items instead of the complete meta/profile, although I don't have a strong opinion about this either. Another option would be to include both choices as parameters, and if a profile is provided, it overrides any input of individual metadata items.
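A hypothetical parameter layout for the "accept both, profile wins" idea; the function and parameter names are illustrative and not the current eis_toolkit API.

```python
# Sketch only: resolve the output grid from whichever inputs were given,
# letting a complete profile override individual metadata items.
from typing import Optional, Tuple
from rasterio import transform

def some_vector_tool(
    geodataframe,
    raster_profile: Optional[dict] = None,
    extent: Optional[Tuple[float, float, float, float]] = None,
    pixel_size: Optional[float] = None,
):
    if raster_profile is not None:
        # A complete profile overrides any individually supplied metadata items.
        out_meta = {k: raster_profile[k] for k in ("crs", "width", "height", "transform")}
    elif extent is not None and pixel_size is not None:
        # Otherwise build the output grid from the individual values.
        west, south, east, north = extent
        out_meta = {
            "crs": geodataframe.crs,
            "width": int((east - west) / pixel_size),
            "height": int((north - south) / pixel_size),
            "transform": transform.from_origin(west, north, pixel_size, pixel_size),
        }
    else:
        raise ValueError("Provide either raster_profile or both extent and pixel_size.")
    # ... rasterize / interpolate the vector data onto this grid ...
    return out_meta
```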

msmiyels commented 8 months ago

@nmaarnio I would also go with the base raster option as the default, but only use its spatial attributes (resolution, extent and CRS). Things like the nodata value could be pre-defined per tool, but should be adjustable by the user.
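A minimal sketch of this suggestion, assuming plain rasterio and an illustrative helper name rather than toolkit code: copy only the spatial attributes from the base raster and let each tool supply its own nodata default.

```python
# Sketch (not eis_toolkit code): take only the spatial attributes from a base raster,
# with the nodata value pre-defined per tool but overridable by the user.
import rasterio

def spatial_profile_from_base(base_raster_path: str, nodata: float = -9999.0) -> dict:
    with rasterio.open(base_raster_path) as base:
        return {
            "crs": base.crs,
            "transform": base.transform,  # encodes extent and resolution
            "width": base.width,
            "height": base.height,
            "nodata": nodata,             # per-tool default, adjustable by the user
        }
```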