Nowosad / spData

Datasets for spatial analysis
https://jakubnowosad.com/spData/

deprecation of shapefiles #62

Open Nowosad opened 1 year ago

Nowosad commented 1 year ago

@rsbivand As we discussed last month, the package contains several shapefiles (https://github.com/Nowosad/spData/tree/master/inst/shapes) that we may consider deprecating. However, I do not know how to do this properly: the deprecation process in R is fairly well documented for functions, but not for external data. Do you know the best way to do it?

rsbivand commented 1 year ago

I think we'll have to find ways of creating GPKG for:

where several have no known CRS. Then update the help pages to use the GPKG, maybe adding a note that the shapefiles may be withdrawn in a forthcoming release. A similar message might be added to the package startup messages, but that might wait until the release before they are dropped. We'll need to watch the size of the package.
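A minimal sketch of the conversion step, assuming the sf package is installed. The helper name shp_to_gpkg and its crs argument are hypothetical and not part of spData. When the source has no known CRS and one is supplied, the CRS is assigned as metadata only; nothing is reprojected. The demonstration reads the nc.shp file shipped with sf rather than any spData file.

```r
library(sf)

# Hypothetical helper: read one shapefile and write it out as a GeoPackage.
# If the shapefile has no known CRS and `crs` is supplied, assign it
# (this labels the coordinates; it does not transform them).
shp_to_gpkg <- function(shp, gpkg = sub("\\.shp$", ".gpkg", shp), crs = NA) {
  x <- st_read(shp, quiet = TRUE)
  if (is.na(st_crs(x)) && !is.na(crs)) {
    st_crs(x) <- crs
  }
  # delete_dsn = TRUE overwrites an existing output file
  st_write(x, gpkg, delete_dsn = TRUE, quiet = TRUE)
  invisible(gpkg)
}

# Demonstration with a shapefile shipped in sf, written to a temporary file
src <- system.file("shape/nc.shp", package = "sf")
out <- shp_to_gpkg(src, gpkg = tempfile(fileext = ".gpkg"))
nc <- st_read(out, quiet = TRUE)
print(st_crs(nc)$input)
```

Writing to a temporary file keeps the installed package directory untouched; for the real conversion the output path would point into inst/shapes of the source tree.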

rsbivand commented 1 year ago

See also: https://github.com/r-spatial/sf/issues/2049

rsbivand commented 4 months ago

@Nowosad may I move on this? Should we retain the shapefiles or replace them? Should I try to instrument revdeps?

Nowosad commented 4 months ago

@rsbivand yes, please! I think we should aim to remove the shapefiles and replace them with, e.g., GPKG files. Reverse dependency checks should be useful for assessing the impact of the change.

Also -- I've noticed that shapefiles are used quite a lot in the docs of sf (see https://github.com/r-spatial/sf/blob/main/R/read.R and https://github.com/r-spatial/sf/tree/main/inst/shape). Do you think it would be worthwhile to deprecate shapefiles there as well?

rsbivand commented 4 months ago

Re. sf, reasonable, I'll raise an issue there.

rsbivand commented 3 months ago

@Nowosad GPKG files have been created for all shapefiles except baltim, which is arguably redundant.

I suggest submitting this soon; it passes R CMD check --as-cran with R 4.4.0 and a recent R-devel. Once it is on CRAN, I'll start the reverse dependency checks, which will involve scanning source packages for patterns like .shp", package="spData" to see which packages are obviously affected, then writing to those maintainers. Suggested timeline: shapefiles to be moved from inst/shapes to inst/shapes/shp 3 months after 2.3.1 is on CRAN, and inst/shapes/shp to be removed 3 months after that. If anyone really needs to read a shapefile, we retain only the needed shapefiles in inst/shapes/shp. Comments, please!
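The scan described above could be sketched in base R roughly as follows. The helper name find_shp_uses, the file-extension filter, and the demo package directory are assumptions made for illustration; the thread itself used grep via system().

```r
# Hypothetical helper: list lines in an unpacked source package that read a
# shapefile from spData, i.e. lines matching  .shp", package="spData"
find_shp_uses <- function(pkg_dir) {
  pattern <- "\\.shp\",\\s*package\\s*=\\s*\"spData\""
  # Only look at R sources, help pages and vignettes
  files <- list.files(pkg_dir, pattern = "\\.(R|Rd|Rmd|Rnw)$",
                      recursive = TRUE, full.names = TRUE)
  hits <- lapply(files, function(f) {
    lines <- readLines(f, warn = FALSE)
    idx <- grep(pattern, lines, perl = TRUE, useBytes = TRUE)
    if (length(idx)) paste0(f, ":", lines[idx]) else character(0)
  })
  as.character(unlist(hits))
}

# Demonstration on a throwaway directory standing in for a source package
demo_pkg <- file.path(tempdir(), "demoPkg")
dir.create(file.path(demo_pkg, "man"), recursive = TRUE, showWarnings = FALSE)
writeLines(c(
  'columbus <- st_read(system.file("shapes/columbus.shp", package="spData")[1])',
  'world <- spData::world'
), file.path(demo_pkg, "man", "example.Rd"))

hits <- find_shp_uses(demo_pkg)
print(hits)
```

Only the first demo line is reported, since the second one uses a lazy-loaded dataset rather than a file path.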

rsbivand commented 3 months ago

The possible "most" dependent packages were:

"apsimx" us_states
"bayesTFR" world
"bispdep" (shp below)
"classInt" jenks71, afcon
"echelon" nc.sids
"epiR" (shp below)
"geonetwork" world
"GWmodel" boston
"GWnnegPCA" shapes/boston_tracts.shp (split line) gw_nsprcomp.Rd
"MainExistingDatasets" world
"oceanic" shapes/world.gpkg
"PopGenHelpR" world, us_states
"R2BayesX" (shp below)
"raybevel" us_states
"rayrender" us_states
"rcartocolor" world
"rflexscan" (shp below)
"RPyGeo" nz
"scgwr" boston
"spatialreg" (shp below)
"spdep" (shp below)
"spgwr" (shp below)
"sphet" (shp below)
"sqlhelper" congruent, incongruent
"TeachingDemos" world, state.vbm
"tilemaps" us_states
"varycoef" house

but of these only the following (plus GWnnegPCA above) had hits on system(paste0("grep 'shp.*spData' ", dwn[i,1], "/*/*"), intern=TRUE):

[[1]]
character(0)
attr(,"status")
[1] 2

[[2]]
character(0)
attr(,"status")
[1] 2

[[3]]
 [1] "bispdep/man/connectivity.map.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)\r"
 [2] "bispdep/man/correlogram.bi.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)\r"  
 [3] "bispdep/man/getis.cluster.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)\r"   
 [4] "bispdep/man/localmoran.bi.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)\r"   
 [5] "bispdep/man/moranbi.cluster.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)\r" 
 [6] "bispdep/man/moranbi.plot.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)\r"    
 [7] "bispdep/man/moranbir.test.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)\r"   
 [8] "bispdep/man/moranbi.test.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)\r"    
 [9] "bispdep/man/moran.cluster.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)\r"   
[10] "bispdep/man/spcorrelogram.bi.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)\r"

[[4]]
character(0)
attr(,"status")
[1] 2

[[5]]
character(0)
attr(,"status")
[1] 1

[[6]]
[1] "epiR/vignettes/epiR_descriptive.Rmd:ncsidsll.sf <- st_read(dsn = system.file(\"shapes/sids.shp\", package = \"spData\")[1])\r"
attr(,"status")
[1] 2

[[7]]
character(0)
attr(,"status")
[1] 2

[[8]]
character(0)
attr(,"status")
[1] 2

[[9]]
character(0)
attr(,"status")
[1] 1

[[10]]
character(0)
attr(,"status")
[1] 2

[[11]]
character(0)
attr(,"status")
[1] 1

[[12]]
character(0)
attr(,"status")
[1] 2

[[13]]
[1] "R2BayesX/man/nbAndGraConversion.Rd:  columbus <- readOGR(system.file(\"shapes/columbus.shp\", package=\"spData\")[1])"
attr(,"status")
[1] 2

[[14]]
character(0)
attr(,"status")
[1] 1

[[15]]
character(0)
attr(,"status")
[1] 2

[[16]]
character(0)
attr(,"status")
[1] 2

[[17]]
[1] "rflexscan/man/choropleth.Rd:sids.shp <- read_sf(system.file(\"shapes/sids.shp\", package=\"spData\")[1])\r"
[2] "rflexscan/R/flexscan.R:#' sids.shp <- read_sf(system.file(\"shapes/sids.shp\", package=\"spData\")[1])\r"  

[[18]]
character(0)
attr(,"status")
[1] 2

[[19]]
character(0)
attr(,"status")
[1] 1

[[20]]
 [1] "spatialreg/man/aple.mc.Rd:wheat <- st_read(system.file(\"shapes/wheat.shp\", package=\"spData\")[1], quiet=TRUE)"               
 [2] "spatialreg/man/aple.plot.Rd:wheat <- st_read(system.file(\"shapes/wheat.shp\", package=\"spData\")[1], quiet=TRUE)"             
 [3] "spatialreg/man/aple.Rd:wheat <- st_read(system.file(\"shapes/wheat.shp\", package=\"spData\")[1], quiet=TRUE)"                  
 [4] "spatialreg/man/impacts.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"         
 [5] "spatialreg/man/ME.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"              
 [6] "spatialreg/man/ME.Rd:nc.sids <- st_read(system.file(\"shapes/sids.shp\", package=\"spData\")[1], quiet=TRUE)"                   
 [7] "spatialreg/man/sarlm_tests.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"     
 [8] "spatialreg/man/sparse_mat.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"      
 [9] "spatialreg/man/SpatialFiltering.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"
[10] "spatialreg/man/spautolm.Rd:nc.sids <- st_read(system.file(\"shapes/sids.shp\", package=\"spData\")[1], quiet=TRUE)"             
[11] "spatialreg/man/trW.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"             
[12] "spatialreg/vignettes/nb_igraph.Rmd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1])"            
[13] "spatialreg/vignettes/sids_models.Rmd:nc <- st_read(system.file(\"shapes/sids.shp\", package=\"spData\")[1], quiet=TRUE)"        
[14] "spatialreg/vignettes/SpatialFiltering.Rmd:NY8 <- st_read(system.file(\"shapes/NY8_utm18.shp\", package=\"spData\"))"            
attr(,"status")
[1] 2

[[21]]
 [1] "spdep/man/autocov_dist.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"   
 [2] "spdep/man/choynowski.Rd:auckland <- st_read(system.file(\"shapes/auckland.shp\", package=\"spData\")[1], quiet=TRUE)"     
 [3] "spdep/man/columbus.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"       
 [4] "spdep/man/compon.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"         
 [5] "spdep/man/diffnb.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"         
 [6] "spdep/man/dnearneigh.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"     
 [7] "spdep/man/EBest.Rd:auckland <- st_read(system.file(\"shapes/auckland.shp\", package=\"spData\")[1], quiet=TRUE)"          
 [8] "spdep/man/EBImoran.mc.Rd:nc.sids <- st_read(system.file(\"shapes/sids.shp\", package=\"spData\")[1], quiet=TRUE)"         
 [9] "spdep/man/EBlocal.Rd:auckland <- st_read(system.file(\"shapes/auckland.shp\", package=\"spData\")[1], quiet=TRUE)"        
[10] "spdep/man/edit.nb.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"        
[11] "spdep/man/globalG.test.Rd:nc.sids <- st_read(system.file(\"shapes/sids.shp\", package=\"spData\")[1], quiet=TRUE)"        
[12] "spdep/man/graphneigh.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"     
[13] "spdep/man/include.self.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"   
[14] "spdep/man/joincount.multi.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"
[15] "spdep/man/joincount.test.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)" 
[16] "spdep/man/knearneigh.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"     
[17] "spdep/man/knn2nb.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"         
[18] "spdep/man/listw2sn.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"       
[19] "spdep/man/lm.morantest.exact.Rd:eire <- st_read(system.file(\"shapes/eire.shp\", package=\"spData\")[1])"                 
[20] "spdep/man/lm.morantest.sad.Rd:eire <- st_read(system.file(\"shapes/eire.shp\", package=\"spData\")[1])"                   
[21] "spdep/man/localGS.Rd:boston.tr <- sf::st_read(system.file(\"shapes/boston_tracts.shp\", package=\"spData\")[1])\r"        
[22] "spdep/man/localmoran_bv.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\"))"                 
[23] "spdep/man/localmoran.exact.Rd:eire <- st_read(system.file(\"shapes/eire.shp\", package=\"spData\")[1])"                   
[24] "spdep/man/localmoran.sad.Rd:eire <- st_read(system.file(\"shapes/eire.shp\", package=\"spData\")[1])"                     
[25] "spdep/man/mat2listw.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"      
[26] "spdep/man/moran.test.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"     
[27] "spdep/man/nb2lines.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"       
[28] "spdep/man/nb2listw.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"       
[29] "spdep/man/nb2mat.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"         
[30] "spdep/man/nbdists.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"        
[31] "spdep/man/nblag.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"          
[32] "spdep/man/nboperations.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"   
[33] "spdep/man/plot.nb.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"        
[34] "spdep/man/poly2nb.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"        
[35] "spdep/man/poly2nb.Rd:nc.sids <- st_read(system.file(\"shapes/sids.shp\", package=\"spData\")[1], quiet=TRUE)"             
[36] "spdep/man/probmap.Rd:auckland <- st_read(system.file(\"shapes/auckland.shp\", package=\"spData\")[1], quiet=TRUE)"        
[37] "spdep/man/SD.RStests.Rd:columbus <- sf::st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1])"             
[38] "spdep/man/sp.correlogram.Rd:nc.sids <- st_read(system.file(\"shapes/sids.shp\", package=\"spData\")[1], quiet=TRUE)"      
[39] "spdep/man/subset.listw.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"   
[40] "spdep/man/subset.nb.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"      
[41] "spdep/man/summary.nb.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"     
[42] "spdep/man/testnb.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"         
[43] "spdep/man/tri2nb.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"         
[44] "spdep/vignettes/CO69.Rmd:eire <- as(sf::st_read(system.file(\"shapes/eire.shp\", package=\"spData\")[1]), \"Spatial\")"   
[45] "spdep/vignettes/nb.Rmd:NY8 <- as(sf::st_read(system.file(\"shapes/NY8_utm18.shp\", package=\"spData\")), \"Spatial\")"    
[46] "spdep/vignettes/nb_sf.Rmd:NY8_sf_old <- st_read(system.file(\"shapes/NY8_utm18.shp\", package=\"spData\"), quiet=TRUE)"   
[47] "spdep/vignettes/sids.Rmd:nc <- st_read(system.file(\"shapes/sids.shp\", package=\"spData\")[1], quiet=TRUE)"              
attr(,"status")
[1] 2

[[22]]
[1] "spgwr/man/ggwr.Rd:xx <- as(st_read(system.file(\"shapes/sids.shp\", package=\"spData\")[1]), \"Spatial\")"         
[2] "spgwr/man/ggwr.sel.Rd:xx <- as(st_read(system.file(\"shapes/sids.shp\", package=\"spData\")[1]), \"Spatial\")"     
[3] "spgwr/vignettes/GWR.Rmd:NY8 <- as(st_read(system.file(\"shapes/NY8_utm18.shp\", package=\"spData\")), \"Spatial\")"
attr(,"status")
[1] 2

[[23]]
[1] "sphet/man/impacts.error_sphet.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"
[2] "sphet/man/impacts.ols_sphet.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"  
[3] "sphet/man/impacts.stsls_sphet.Rd:columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"
[4] "sphet/R/impacts.R:#' columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"            
[5] "sphet/R/impacts.R:#' columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"            
[6] "sphet/R/impacts.R:#' columbus <- st_read(system.file(\"shapes/columbus.shp\", package=\"spData\")[1], quiet=TRUE)"            
attr(,"status")
[1] 2

[[24]]
character(0)
attr(,"status")
[1] 2

[[25]]
character(0)
attr(,"status")
[1] 1

[[26]]
character(0)
attr(,"status")
[1] 2

[[27]]
character(0)
attr(,"status")
[1] 2

so:

bispdep: shapes/columbus.shp
epiR: shapes/sids.shp
GWnnegPCA: shapes/boston_tracts.shp
R2BayesX: shapes/columbus.shp
rflexscan: sids.shp
spatialreg: shapes/wheat.shp, shapes/columbus.shp, shapes/sids.shp, shapes/NY8_utm18.shp
spdep: shapes/columbus.shp, shapes/auckland.shp, shapes/sids.shp, shapes/eire.shp, shapes/boston_tracts.shp, shapes/NY8_utm18.shp
spgwr: shapes/sids.shp, shapes/NY8_utm18.shp
sphet: shapes/columbus.shp

I'll add issues here as I raise them, or dates of emails sent. None appear to need the shapefile representation.
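For an affected package, the change would typically be a one-line swap of the file extension in the system.file() call. A hedged sketch, assuming an spData version that ships shapes/columbus.gpkg (as this thread implies for 2.3.1 onwards) and an installed sf; the guard is only there so the snippet degrades gracefully when either is missing.

```r
# Before (reads the shapefile):
#   columbus <- st_read(system.file("shapes/columbus.shp", package="spData")[1], quiet=TRUE)

# After (reads the GeoPackage instead). system.file() returns "" when the
# file or the package is not installed, so check before reading.
gpkg <- system.file("shapes/columbus.gpkg", package = "spData")
if (nzchar(gpkg) && requireNamespace("sf", quietly = TRUE)) {
  columbus <- sf::st_read(gpkg, quiet = TRUE)
  print(dim(columbus))
} else {
  message("columbus.gpkg not found; is a recent spData installed?")
}
```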

edzer commented 3 months ago

> I've noticed that shapefiles are used quite a lot in docs of sf

The original reason for that was that at the time of first submission, certain CRAN platforms did not have sqlite linked to gdal, and could not read gpkg. It could well be that this no longer is the case.
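Whether a given installation can read and write GPKG can be checked from R. A small sketch, assuming sf is installed; it inspects the table returned by sf::st_drivers() and sets has_gpkg to NA when sf is unavailable.

```r
# Check whether the GDAL build that sf is linked against offers the GPKG driver
has_gpkg <- NA
if (requireNamespace("sf", quietly = TRUE)) {
  drv <- sf::st_drivers(what = "vector")
  has_gpkg <- "GPKG" %in% drv$name
  if (has_gpkg) {
    # The `write` column says whether the driver can also create files
    print(drv[drv$name == "GPKG", c("name", "write")])
  }
}
print(has_gpkg)
```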

rsbivand commented 1 month ago

@Nowosad epiR has updated, R2BayesX is being submitted, the remainder have been reminded that they need to update by the end of August.

Nowosad commented 2 weeks ago

@rsbivand as it is September already -- should I submit the current version of spData to CRAN now?

rsbivand commented 2 weeks ago

I'll check tomorrow. Are we going to drop all the shapefiles now?

Nowosad commented 2 weeks ago

I think the plan is to submit the current version to CRAN as it is, and drop all the shapefiles afterwards.

rsbivand commented 2 weeks ago

Ok, submit as main is now, then I'll make a PR to drop the shapefiles.

rsbivand commented 2 weeks ago

@Nowosad Please wait, I need to check the validity of shapes/NY8_bna_utm18.gpkg and shapes/NY8_utm18.gpkg.

Nowosad commented 2 weeks ago

One additional issue, @rsbivand:

Found the following (possibly) invalid URLs:
    URL: http://www.spatial-econometrics.com/html/jplv7.zip
      From: man/elect80.Rd
            man/house.Rd
      Status: Error
      Message: Failed to connect to www.spatial-econometrics.com port 80 after 21145 ms: Couldn't connect to server

rsbivand commented 2 weeks ago

Will update this shortly - Jim LeSage has retired and the DNS subscription has I think lapsed. I'm dropping the URL markup - if anyone needs the original, it may be found in the wayback machine: https://web.archive.org/web/20160416132456fw_/http://www.spatial-econometrics.com/html/jplv7.zip

Nowosad commented 2 weeks ago

thanks, package spData_2.3.3.tar.gz is on its way to CRAN.