PIK-LPJmL / LandInG

The Land Input Generator (LandInG) contains a collection of scripts to derive basic input datasets for terrestrial ecosystem models from diverse and partially conflicting data sources.
https://doi.org/10.5194/gmd-16-3375-2023
GNU Affero General Public License v3.0
7 stars 1 forks source link

Error in { : task 699 failed - "Loop 0 is not valid: Edge 0 crosses edge 71" Calls: %dopar% -> <Anonymous> #3

Closed AdrienDams closed 2 months ago

AdrienDams commented 2 months ago

I am trying to simply run GADM with the default grid and the last GADM files (4.10).

While running 5_map_districts_to_grid_intersection.R, I have this error:

...
** Processing ZAF **
** Processing ZMB **
** Processing ZMB **
** Processing ZNC **
** Processing ZWE **
Error in { : 
  task 699 failed - "Loop 0 is not valid: Edge 0 crosses edge 71"
Calls: %dopar% -> <Anonymous>
In addition: There were 50 or more warnings (use warnings() to see the first 50)
Execution halted

Any idea what I should do?

sostberg commented 2 months ago

Which step is this exactly? Step 5 should be 5_map_districts_to_grid_intersection.R, whereas step 3 would be 3_map_countries_to_grid_collection.R. Since the error message is for a %dopar% loop I'm assuming it is either step 2 or step 5. Unfortunately, the %dopar% loops can be a little bit hard to debug. The error message sounds like there is a problem with the polygon geometry.

AdrienDams commented 2 months ago

My mistake, I am running 5_map_districts_to_grid_intersection.R now. Step 3 and 4 are done

sostberg commented 2 months ago

Do you run the script in parallel mode or on sequential mode?

AdrienDams commented 2 months ago

Sequential right now, is it better to try on parallel? It was quite fast like this

sostberg commented 2 months ago

According to the error message the error occurred in task 699. I am not sure if %dopar% aborted right away when that error occurred or whether it finished all other tasks and only reported the error at the end. You could compare the number of files in the by_country_districts directory with the number of files in grid_intersection_*_districts

AdrienDams commented 2 months ago

Only URY2* files are missing

sostberg commented 2 months ago

Okay. So this means that the error occurred in the shape file for URY2.

I would propose to run the script again but stop execution before the intersection_loop in line 241. You can do this by adding a stop("Test") right before line 241.

Then find the index for "URY2" in country_shapes . It may be the only item left in country_shapes. Set cindex to that value (1 if URY2 is the only item) and manually run the code within the %dopar% loop (line 246-412). This should allow you to find out exactly which function causes the error.

The code already has a check for invalid geometry and tries to correct that so I am not sure why the code still failed in your case.

sostberg commented 2 months ago

This Github issue may explain a similar issue.

The fix suggested there is to turn off spherical geometry in the sf package by running sf_use_s2(FALSE). Then run st_make_valid() on the corrupted geometry before trying any other spatial operation.

Note that LandInG was developed using an old version of sf (version 0.8–1) which did not include spherical geometry, yet. This functionality was only added in sf version 1. So newer versions of sf may behave slightly differently than the version used for development.

I am not sure if running the URY2 shape with spherical geometry switched off creates any incompatibility with the other country shapes, which have been processed with spherical geometry switched on.

AdrienDams commented 2 months ago

I have tried your solution to use sf_use_s2(FALSE) just for URY2, and also use it to recreate all grid_intersection_30arcmin_districts files. In both cases, step 5 is working without error messages.

However in step 6, I am getting the error message below. Do you think this is linked?

WARNING: ignoring environment value of R_HOME
Linking to GEOS 3.12.1, GDAL 3.8.5, PROJ 9.4.0; sf_use_s2() is TRUE
Loading required package: sp
Linking to liblwgeom 3.0.0beta1 r16016, GEOS 3.12.1, PROJ 9.4.0

Attaching package: ‘lwgeom’

The following object is masked from ‘package:sf’:

    st_perimeter

Please note that rgdal will be retired during October 2023,
plan transition to sf/stars/terra functions using GDAL and PROJ
at your earliest convenience.
See https://r-spatial.org/r/2023/05/15/evolution4.html and https://github.com/r-spatial/evolution
rgdal: version: 1.6-7, (SVN revision 1203)
Geospatial Data Abstraction Library extensions to R successfully loaded
Loaded GDAL runtime: GDAL 3.8.5, released 2024/04/02
Path to GDAL shared files: 
 GDAL does not use iconv for recoding strings.
GDAL binary built with GEOS: TRUE 
Loaded PROJ runtime: Rel. 9.4.0, March 1st, 2024, [PJ_VERSION: 940]
Path to PROJ shared files: /app/spack/0.19.0/opt/spack/linux-almalinux9-x86_64/gcc-11.3.1/proj-8.2.1-54gqm4h7b6cpailljrmtbp63rxln6nc2/share/proj
PROJ CDN enabled: TRUE
Linking to sp version:2.1-4
To mute warnings of possible GDAL/OSR exportToProj4() degradation,
use options("rgdal_show_exportToProj4_warnings"="none") before loading sp or rgdal.
udunits database from /bg/home/damseaux-a/.conda/envs/LandInG/share/udunits/udunits2.xml
Warning message:
GDAL support is provided by the sf and terra packages among others 
*** Script run in /bg/home/damseaux-a/LandInG/gadm ***
Spatial resolution: 30arcmin 
Loading GADM shapes from /bg/home/damseaux-a/LandInG/gadm 
Reading layer `ADM_0' from data source 
  `/bg/home/damseaux-a/LandInG/gadm/gadm_410-levels.gpkg' using driver `GPKG'
Simple feature collection with 263 features and 2 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -180 ymin: -90 xmax: 180 ymax: 83.65833
Geodetic CRS:  WGS 84
Reading layer `ADM_1' from data source 
  `/bg/home/damseaux-a/LandInG/gadm/gadm_410-levels.gpkg' using driver `GPKG'
Simple feature collection with 3662 features and 11 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -180 ymin: -55.98 xmax: 180 ymax: 83.65833
Geodetic CRS:  WGS 84
Reading layer `ADM_2' from data source 
  `/bg/home/damseaux-a/LandInG/gadm/gadm_410-levels.gpkg' using driver `GPKG'
Simple feature collection with 47217 features and 13 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -180 ymin: -55.98 xmax: 180 ymax: 83.11042
Geodetic CRS:  WGS 84
Reloading preprocessed cell list from /bg/home/damseaux-a/LandInG/gadm/celllist_30arcmin_districts.RData 
Warning messages:
1: In grepl("[[:alpha:]_]", headername) :
  unable to translate '<9d>' to a wide string
2: In grepl("[[:alpha:]_]", headername) : input string 28 is invalid
Warning messages:
1: In grepl("[[:alpha:]_]", headername) :
  unable to translate '<9d>' to a wide string
2: In grepl("[[:alpha:]_]", headername) : input string 28 is invalid
Error: 4088 cells in /bg/home/damseaux-a/LandInG/gadm/grid_gadm_30arcmin.bin have no GADM information in cell_list.
They are not associated to gadm_no_land in /bg/home/damseaux-a/LandInG/gadm/cow_gadm_30arcmin.bin either.
Execution halted
AdrienDams commented 2 months ago

Maybe I should try to use GADM-3.6 instead, however their server are not answer for old data download https://gadm.org/download_world36.html.

sostberg commented 2 months ago

Searching for the error message in the code, there is a comment suggesting a mismatch between the results from step 1 - step 3 and the results from step 4 - step 6. However, I cannot guess what exactly went wrong without rerunning what you did.

AdrienDams commented 2 months ago

I started from scratch and now I only have 6 cells instead of 4088 as you can see in the log step6.txt. Can I just ignore the message or this will cause a problem after? step1.txt step2.txt step3.txt step4.txt step5.txt step6.txt

sostberg commented 2 months ago

Just to confirm: Did you run 5_map_districts_to_grid_intersection.R with sf_use_s2(FALSE) for all polygons? If so, you should probably also run 2_map_countries_to_grid_intersection.R with sf_use_s2(FALSE).

When doing some tests with different versions of the sf package (some with spherical geometry, some without) I noticed that this can affect the number of cells that are detected as containing land. The step-6 script compares its results with the grid and country codes determined in step 1-3.

On a side note: If you plan to use CRU climate data for your model simulations you should be aware that the GADM-derived grid at 0.5° does not fully match the CRU grid. GADM includes some grid cells that are missing in CRU (e.g. the Caspian Sea). On the other hand, CRU includes some grid cells that contain no land according to GADM. So if you plan to have a grid consistent with CRU you should probably set force_grid <- TRUE in gadm_setup.R and use the provided file gridlist_CRU.csv. If your climate data cover land and ocean cells you can use the grid as determined from GADM. However, if your climate data only cover land cells, you may want to use the land mask from the climate data to determine which cells are included in your grid. This is what the force_grid setting is for.