Closed adamkemberling closed 1 year ago
Hi @adamkemberling.
The rnaturalearth package only provides an interface to https://www.naturalearthdata.com/. If a dataset is not available for download there, it will not be available through this package either. Please feel free to reopen an issue if you find any discrepancy between the data provided by the package and the downloadable content from https://www.naturalearthdata.com/.
Hi @PMassicotte, sorry it took me so long to check this.
I downloaded the data directly this morning (3/2/2023) from the link you provided. The specific dataset is ne_10m_admin_1_states_provinces.
The area in question appears to exist properly at the source for the data. This would suggest that either an old version of the rnaturalearth package (or rnaturalearthdata/rnaturalearthhires) was missing the data or that it was corrupted somewhere along the way.
My current version info: rnaturalearth = 0.1.0 rnaturalearthdata = 0.1.0 rnaturalearthhires = 0.2.0
This is the code I routinely run to access the file with rnaturalearth & sf to replicate and see the missing coastline:
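library(rnaturalearth)
library(ggplot2)
library(sf)
# State polygons bundled with rnaturalearthhires
us_poly <- ne_states("united states of america", returnclass = "sf")
# Zoom in on the mid-Atlantic coast where the coastline is missing
ggplot() +
geom_sf(data = us_poly) +
coord_sf(xlim = c(-78, -74), ylim = c(34, 41))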
And here is a visual of what that displays:
Can you update rnaturalearth to the latest version and try again?
Problem still occurs with: rnaturalearth = 0.3.2 rnaturalearthdata = 0.1.0 rnaturalearthhires = 0.2.0
Can confirm this seems to be a problem in the rnaturalearthhires package, which probably needs to be updated. If you use ne_download(), the correct boundaries are shown, which implies that Natural Earth has updated the polygons since rnaturalearthhires was last updated.
library(rnaturalearth)
library(ggplot2)
library(sf)
us_poly <- ne_states("united states of america", returnclass = "sf")
va <- us_poly[us_poly$postal=="VA",]
ggplot(va) +
geom_sf()
Using ne_download() returns proper boundaries:
states_poly_dl <- ne_download(scale = 'large', type = "states",
category = "cultural",
returnclass = "sf")
va <- states_poly_dl[states_poly_dl$name == "Virginia",]
ggplot(va) +
geom_sf()
Do you recommend using the ne_download() function to circumvent the need to keep the data packages up to date? This is more of a general data package question, I guess, and not specific to rnaturalearth. People probably update packages less frequently than fixes/updates are released for them.
EDIT: But that workflow would come with more people pinging the original data source for on-the-fly downloads, and would likely come with some quality of life declines with scripts running more slowly or not running without internet...
Also, want to flag that I appreciate all of y'all's hard work. I am a big advocate of this package for providing consistency and ease-of-use to common map-making needs.
And just another comment for context: I tried updating both rnaturalearthhires and rnaturalearthdata when I updated rnaturalearth today, and there was no indication that I should/could. So without some special commands to install a dev branch, I am in a state that others would likely be in.
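For anyone comparing setups, a minimal sketch of how the installed versions of the three packages could be reported in one call. The helper name installed_versions is hypothetical (it is not part of rnaturalearth), and the commented-out install line assumes the remotes package is available and that the GitHub repository referenced in this thread (ropensci/rnaturalearthhires) is the intended source.

```r
# Hypothetical helper: report the installed version of each package,
# or NA if the package is not installed.
installed_versions <- function(pkgs) {
  vapply(pkgs, function(p) {
    if (nzchar(system.file(package = p))) {
      as.character(utils::packageVersion(p))
    } else {
      NA_character_
    }
  }, character(1))
}

print(installed_versions(c("rnaturalearth", "rnaturalearthdata", "rnaturalearthhires")))

# Assumption: installing the development version from GitHub picks up newer data.
# remotes::install_github("ropensci/rnaturalearthhires")
```

Pasting the printed vector into a report makes it easier to tell whether a discrepancy comes from the interface package or from one of the data packages.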
@adamkemberling I think for most interactive data analysis ne_download() is the way to go. That will ensure you get the most recently published data from Natural Earth's repo. When you use functions that rely on rnaturalearthdata and rnaturalearthhires, you are getting a data snapshot from the last time the data housed in those packages was updated (looks like 5 years ago for hires).
For some automated workflows I could see where installing and using the data packages locally would be beneficial, but you are at the mercy of the last time the packages were updated by the dev/maintainer.
One option could be to cache the results of ne_download(), perhaps with the help of pins.
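A rough sketch of the caching idea, using base R's saveRDS()/readRDS() rather than pins to keep it dependency-free. The names cached_fetch, get_states_cached, cache_file and max_age_days are all hypothetical, and the 30-day refresh window is an arbitrary assumption. The ne_download() arguments are the ones used earlier in this thread.

```r
# Hypothetical helper (not part of rnaturalearth): return a cached copy of
# whatever fetch_fn() produces, re-fetching once the cache file is older
# than max_age_days.
cached_fetch <- function(cache_file, fetch_fn, max_age_days = 30) {
  if (file.exists(cache_file)) {
    age_days <- as.numeric(difftime(Sys.time(), file.mtime(cache_file), units = "days"))
    if (age_days <= max_age_days) {
      return(readRDS(cache_file))
    }
  }
  # Cache is missing or stale: fetch, store, and return the fresh result
  result <- fetch_fn()
  dir.create(dirname(cache_file), recursive = TRUE, showWarnings = FALSE)
  saveRDS(result, cache_file)
  result
}

# Assumed usage: wrap the ne_download() call from this thread.
# The default cache path is an arbitrary choice for illustration.
get_states_cached <- function(cache_file = file.path("cache", "ne_states_10m.rds")) {
  cached_fetch(cache_file, function() {
    rnaturalearth::ne_download(scale = "large", type = "states",
                               category = "cultural", returnclass = "sf")
  })
}
```

With this, get_states_cached() would only hit the Natural Earth servers when the cache is missing or stale, which addresses the concerns about repeated downloads and slower scripts. It still needs internet access whenever a refresh is due.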
This has been fixed in ropensci/rnaturalearthhires#8
library(rnaturalearth)
library(ggplot2)
us_poly <- ne_states("united states of america", returnclass = "sf")
ggplot() +
geom_sf(data = us_poly) +
coord_sf(xlim = c(-78, -74), ylim = c(34, 41))
The section of Virginia north of the Chesapeake Bay bridge is missing from the Virginia polygon (Fisherman Island National Wildlife Refuge to Chincoteague).
As a possible swap-in, or for anyone seeking a quick-fix alternative, the R package {rgeoboundaries} provides a polygon that includes this area: https://github.com/wmgeolab/rgeoboundaries