Closed mradamcox closed 2 months ago
Remote load is now working! It turns out that sf supports remote load through GDAL's virtual file systems (here, the /vsicurl/ prefix). As such, code like
counties2010 <- st_read('/vsicurl/https://raw.githubusercontent.com/GeoDaCenter/opioid-policy-scan/main/data_final/geometryFiles/county/counties2010.shp')
runs successfully.
We currently only have public-facing links on S3 for cartographic boundaries, so I've set up the load_oeps function to pull from the opioid-policy-scan GitHub repository for the time being. Once we have non-cartographic boundaries set up on S3, we can change the source by editing the links that the retrieve_geometry function in load_oeps.R points to.
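To illustrate how the source could stay swappable, here is a minimal sketch, not the actual contents of load_oeps.R. The `geometry_base_url` variable, the `build_geometry_path` helper, and the `scale`/`file` arguments to `retrieve_geometry` are assumptions made for illustration; only the GitHub URL and the function name come from this thread.

```r
# Hypothetical sketch: the real retrieve_geometry in load_oeps.R may differ.
# Keeping the base URL in one place means a later move to S3 is a one-line change.
geometry_base_url <- paste0(
  "https://raw.githubusercontent.com/GeoDaCenter/opioid-policy-scan/",
  "main/data_final/geometryFiles"
)

# Build a GDAL virtual file system path (/vsicurl/) for a remote geometry file.
# `scale` is the subdirectory (e.g. "county"); `file` is the file name.
build_geometry_path <- function(scale, file, base_url = geometry_base_url) {
  paste0("/vsicurl/", base_url, "/", scale, "/", file)
}

# Read the remote file with sf. Nothing is downloaded until this is called.
retrieve_geometry <- function(scale, file, base_url = geometry_base_url) {
  sf::st_read(build_geometry_path(scale, file, base_url), quiet = TRUE)
}
```

With this shape, `retrieve_geometry("county", "counties2010.shp")` would read the same file as the call above, and pointing at S3 would only require passing or changing `base_url`.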
A good way to avoid storing large geometry files directly in this package (tract boundaries, for example, can be >100 MB) would be to pull them from remote sources when users run a load data command. This command would get the geometries, join the OEPS data to them, and return a dataframe that is ready for spatial analysis.
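The join step described above could be sketched roughly as follows. The function name `join_oeps_to_geometry` and the default key column `"GEOID"` are assumptions for illustration; the actual key used by the OEPS tables is not stated in this thread.

```r
# Hypothetical sketch of the join step; names are illustrative only.
# `geometry` is the spatial table (an sf object in practice),
# `oeps_data` is a plain data frame of OEPS variables,
# `key` is the shared identifier column (assumed here to be "GEOID").
join_oeps_to_geometry <- function(geometry, oeps_data, key = "GEOID") {
  # Fail early if the key column is missing from either table.
  stopifnot(key %in% names(geometry), key %in% names(oeps_data))
  # Left join: keep every geometry row, even those with no matching OEPS data.
  merge(geometry, oeps_data, by = key, all.x = TRUE)
}
```

Putting the geometry first and using `all.x = TRUE` keeps all boundaries in the result. When `geometry` is an sf object and the sf package is loaded, `merge` should dispatch to sf's method, so the result remains an sf object.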
We have a data pipeline in place that merges census boundaries into single files in many different formats, and that pipeline could be augmented to deliver whichever spatial format works best for R. If we went this route, what should our datasets look like?