mapme-initiative / mapme.biodiversity

Efficient analysis of spatial biodiversity datasets for global portfolios
https://mapme-initiative.github.io/mapme.biodiversity/dev
GNU General Public License v3.0

Make people aware that downloaded data might not match portfolio if spatial extent of AOIs was changed. #83

Closed Jo-Schie closed 1 year ago

Jo-Schie commented 2 years ago

Currently the download function only checks whether a folder with user data already exists; it does not check whether the data inside matches the actual portfolio. That can quickly become a problem if a user, e.g., loads two different datasets (AOIs) and then downloads the data twice. The best fix would, of course, be to automatically check inside the folder whether the extents match, but that would probably take quite some time to implement. So as a quick fix I would suggest replacing messages like

"The following requested resources are already available: treecover2000, lossyear, greenhouse."

with a message like:

"The following requested resources are already available: treecover2000, lossyear, greenhouse in folder xyz. If you changed the spatial AOI please make sure to delete those folders first before downloading the data. If the spatial extent of your AOIs did not change, you may proceed processing the data."

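A minimal sketch of how such a message could be assembled, using base R only. The helper name `build_available_msg` and its arguments are hypothetical and are not part of mapme.biodiversity; the wording follows the proposal above only loosely.

```r
# Hypothetical helper (not part of mapme.biodiversity): build the extended
# "already available" message from the resource names and the output folder.
build_available_msg <- function(resources, outdir) {
  paste0(
    "The following requested resources are already available: ",
    paste(resources, collapse = ", "),
    " in folder ", outdir, ". ",
    "If you changed the spatial extent of your AOIs, please delete those ",
    "folders before downloading the data. ",
    "If the spatial extent did not change, you may proceed with processing."
  )
}

# Example with the resource names and folder from the proposal above.
msg <- build_available_msg(c("treecover2000", "lossyear", "greenhouse"), "xyz")
message(msg)
```
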
goergen95 commented 2 years ago

We give users the option to initiate a portfolio without directly associating already downloaded resources. Take a look here: https://github.com/mapme-initiative/mapme.biodiversity/blob/46391ed125c534abe978b9e9654cb359e5b35a0a/R/portfolio.R#L37-L39

The way this works: if we initiate a new portfolio with this option set to TRUE, we simply look for the geopackages in the respective directories and add these as resources without further checking their spatial extent. This is only desirable if I am sure that I used that very same portfolio to download the resources in the first place. If the option is set to FALSE, the resources are not added and we have to call get_<resource>() before calculating any indicators. If we do so, only those files that are currently missing for the spatio-temporal extent of the portfolio will be downloaded. After the download, a new geopackage is written that reflects the extent of the newly created portfolio. This means there is no need to delete all the directories whenever you decide to create a new portfolio; you can use the same "database" across projects. I absolutely agree that this "feature" is not properly documented and that, e.g., the warning message could be improved to inform users about this.
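
The "download only what is missing" idea can be illustrated with a small base R sketch. It is not the package's actual implementation: the helper `missing_files`, the demo directory, and the tile names are all made up for illustration.

```r
# Hypothetical helper (not part of mapme.biodiversity): return the required
# file names that are not yet present in a resource directory.
missing_files <- function(required, resource_dir) {
  present <- list.files(resource_dir)
  setdiff(required, present)
}

# Demo: a temporary directory stands in for an existing resource folder.
demo_dir <- file.path(tempdir(), "treecover2000_demo")
unlink(demo_dir, recursive = TRUE)          # start from a clean state
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, "tile_a.tif"))

# Made-up tile names that a new portfolio would require.
required <- c("tile_a.tif", "tile_b.tif")
to_download <- missing_files(required, demo_dir)
print(to_download)
```

Under these assumptions, only the tile that is not already on disk would be fetched, which is why the same directories can be reused across portfolios.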

goergen95 commented 1 year ago

Please see the discussion in #130. Except for the soilgrid resource, we actually check whether all required files are available, provided that add_resources was set to FALSE. I am thus closing this in favor of #92 and #91, which also touch on this topic.