aboskovic21 / STAB

1 stars 0 forks source link

Warning/error from st_read related to Yakima county #8

Open krpaulson opened 1 year ago

krpaulson commented 1 year ago

Is it possible to fix the inconsistency with Yakima county which st_read is complaining about here? We also had some people fail to be able to load these files, and I wonder if that's related to this issue.

library(sf)
#> Linking to GEOS 3.11.0, GDAL 3.5.3, PROJ 9.1.0; sf_use_s2() is TRUE
download.file("http://faculty.washington.edu/jonno/SISMIDmaterial/wacounty.shp", destfile = "wacounty.shp")
download.file("http://faculty.washington.edu/jonno/SISMIDmaterial/wacounty.shx", destfile = "wacounty.shx")
download.file("http://faculty.washington.edu/jonno/SISMIDmaterial/wacounty.dbf", destfile = "wacounty.dbf")
wacounty = st_read(dsn=".", layer="wacounty")
#> Reading layer `wacounty' from data source 
#>   `/private/var/folders/m0/jmq_8j956cz1xcls16mc1l940000gn/T/Rtmpl7hXRv/reprex-31494b2fb059-peppy-flea' 
#>   using driver `ESRI Shapefile'
#> Warning in CPL_read_ogr(dsn, layer, query, as.character(options), quiet, : GDAL
#> Error 1: Sanity check failed when trying to recover from inconsistent .shx/.shp
#> with shape 38
#> Simple feature collection with 39 features and 6 fields (with 1 geometry empty)
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -124.7312 ymin: 45.5434 xmax: -116.915 ymax: 49.0026
#> CRS:           NA
tail(wacounty)
#> Simple feature collection with 6 features and 6 fields (with 1 geometry empty)
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -123.7264 ymin: 46 xmax: -117.0384 ymax: 49.0026
#> CRS:           NA
#>                  AreaName AreaKey INTPTLAT  INTPTLNG TotPop90 CNTY
#> 34    WA, Thurston County   53067 46.92512 -122.8275   161238   67
#> 35   WA, Wahkiakum County   53069 46.29340 -123.4280     3327   69
#> 36 WA, Walla Walla County   53071 46.22599 -118.4784    48439   71
#> 37     WA, Whatcom County   53073 48.83375 -121.9001   127780   73
#> 38     WA, Whitman County   53075 46.88668 -117.5191    38775   75
#> 39      WA, Yakima County   53077 46.45564 -120.7389   188823   77
#>                          geometry
#> 34 MULTIPOLYGON (((-123.2006 4...
#> 35 MULTIPOLYGON (((-123.357 46...
#> 36 MULTIPOLYGON (((-119.04 46....
#> 37 MULTIPOLYGON (((-123.026 48...
#> 38 MULTIPOLYGON (((-117.9588 4...
#> 39             MULTIPOLYGON EMPTY

Created on 2023-01-26 with reprex v2.0.2

The error two other people got was like this, although I haven't been able to reproduce it:

Error in CPL_read_ogr(dsn, layer, query, as.character(options), quiet,  : 
  Opening layer failed.
In addition: Warning messages:
1: In CPL_read_ogr(dsn, layer, query, as.character(options), quiet,  :
  GDAL Error 4: Record count in .shx header is -12, which seems
unreasonable.  Assuming header is corrupt.
2: In CPL_read_ogr(dsn, layer, query, as.character(options), quiet,  :
  GDAL Error 4: Failed to open file C:\file\path\redacted\here\wacounty.shp.  It may be corrupt or read-only file accessed in update mode.
krpaulson commented 1 year ago

@edzer It would be useful to hear your thoughts on this issue if you have time -- thanks again.

krpaulson commented 1 year ago

Perhaps related to repair=T option in this maptools function: https://www.rdocumentation.org/packages/maptools/versions/1.1-4/topics/readShapePoly

edzer commented 1 year ago

Did that option in maptools solve your problem?

edzer commented 1 year ago

maptools solves this because it comes with its own customized version of shapelib, it seems. sf uses GDAL, which doesn't seem to be able to handle broken shapefiles of this kind. Since maptools is going to retire this year, my advice is: read it with maptools and repair=TRUE, convert to sf, save as GPKG, and abandon the source with broken shapefiles (and abandon shapefiles altogether).