Closed aaronspring closed 4 years ago
trying to fix in https://github.com/informatics-lab/covid19-examples-and-docs/pull/18
@tam203 I could reproduce your error. where did you get the shapefiles from? any idea how to read them into geopandas?
Hi @aaronspring .
So the data I used is on the public data set we've pushed up. Here is the (awful) index page.
here is the data file I used: global_daily_precip_max_20200106.nc
and here the shape file Counties_and_Unitary_Authorities_April_2019_Boundaries_EW_BUC.shp
I found these. I am looking for a nice way to read them into geopandas.
Normally I would just download them manually off the internet. When I did that, I got reasonable column names. What I'm looking for here is a way to download them locally into the notebook environment (at runtime, not via git) and open them.
Or where did you download these shapefiles from? Are they for internal use only, or did you get them from another resource on the internet?
what I tried:
from io import BytesIO
import geopandas
from azure.storage.blob import BlobClient
url = 'https://metdatasa.blob.core.windows.net/covid19-response/shapefiles/England/Counties_and_Unitary_Authorities_April_2019_Boundaries_EW_BUC.shp'
file_data = BytesIO(BlobClient.from_blob_url(url).download_blob().readall())
geopandas.read_file(url)
---------------------------------------------------------------------------
CPLE_OpenFailedError Traceback (most recent call last)
fiona/_shim.pyx in fiona._shim.gdal_open_vector()
fiona/_err.pyx in fiona._err.exc_wrap_pointer()
CPLE_OpenFailedError: '/vsimem/f6ede516e91146f38d179a39f4036dcc' not recognized as a supported file format.
During handling of the above exception, another exception occurred:
DriverError Traceback (most recent call last)
<ipython-input-14-f6bf670bb6cb> in <module>
----> 1 geopandas.read_file(url)
/srv/conda/envs/notebook/lib/python3.7/site-packages/geopandas/io/file.py in read_file(filename, bbox, mask, rows, **kwargs)
87
88 with fiona_env():
---> 89 with reader(path_or_bytes, **kwargs) as features:
90
91 # In a future Fiona release the crs attribute of features will
/srv/conda/envs/notebook/lib/python3.7/site-packages/fiona/collection.py in __init__(self, bytesbuf, **kwds)
537 # Instantiate the parent class.
538 super(BytesCollection, self).__init__(self.virtual_file, vsi=filetype,
--> 539 encoding='utf-8', **kwds)
540
541 def close(self):
/srv/conda/envs/notebook/lib/python3.7/site-packages/fiona/collection.py in __init__(self, path, mode, driver, schema, crs, encoding, layer, vsi, archive, enabled_drivers, crs_wkt, ignore_fields, ignore_geometry, **kwargs)
152 if self.mode == 'r':
153 self.session = Session()
--> 154 self.session.start(self, **kwargs)
155 elif self.mode in ('a', 'w'):
156 self.session = WritingSession()
fiona/ogrext.pyx in fiona.ogrext.Session.start()
fiona/_shim.pyx in fiona._shim.gdal_open_vector()
DriverError: '/vsimem/f6ede516e91146f38d179a39f4036dcc' not recognized as a supported file format.
ok, there seems to be no bug. Before, I had just downloaded a different shapefile, shapefile = '/Users/aaron.spring/Downloads/UK_covid_reporting_regions.shp', and I don't know where from.
while the wget extension is still not very nice, at least it works.
@aaronspring just seen the above. I'll digest but feel free to ignore this.
I can't remember the original source. @kaedonkers might know.
We've uploaded these and other shape files to an Azure blob container. As I say, this is the index page; the README for the data set is here.
I don't think that this is what you are asking @aaronspring, but this notebook has some examples of working with things in the blob store (but you can also just use urllib or whatever).
@aaronspring The UK shapefile is a manually curated one for all the COVID reporting regions in the UK. Here is a link to download it from our provision on Azure. The other country shapefiles are from https://gadm.org/download_country_v3.html
Does that answer the questions you were asking?
@aaronspring I'm not sure I'm providing the clarity to help you help us. Shall we hop on a call to discuss? If you drop an email to covid19@informaticslab.co.uk we can arrange something.
The work you've done looks really exciting; I just want to make sure we can get the most out of it.
I am looking for the source of https://metdatasa.blob.core.windows.net/covid19-response/shapefiles/UK/UK_covid_reporting_regions.shp
or a way to download these files into the binder local environment, because geopandas needs to open .shp, .shx and .dbf at the same time to read in a shapefile. The current way is quite manual, and I hope to find a way to wrap opening a shapefile from Azure blob into a clean function.
found a quick solution with intermediate files:
from azure.storage.blob import BlobClient

base = 'https://metdatasa.blob.core.windows.net/covid19-response/shapefiles/England'
name = 'Counties_and_Unitary_Authorities_April_2019_Boundaries_EW_BUC'
# geopandas needs .shp, .shx and .dbf side by side, so fetch all three
for ending in ['shx', 'dbf', 'shp']:
    filename = f"{name}.{ending}"
    url = f'{base}/{filename}'
    with open(filename, "wb") as f:
        print(f'Download {url} to {filename}')
        data = BlobClient.from_blob_url(url).download_blob()
        data.readinto(f)
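The download loop above could be wrapped into the "clean function" asked for earlier in the thread. What follows is only a sketch under assumptions: the names fetch_shapefile, read_remote_shapefile and _http_download are hypothetical, and it uses plain urllib (which the thread mentions as an option for the public container) rather than BlobClient. It also fetches only the three parts named in the thread, so a .prj file, if one exists, is not downloaded and the CRS may be missing.

```python
import tempfile
import urllib.request
from pathlib import Path

# Parts of one shapefile that geopandas needs side by side (as listed in the thread).
SHAPEFILE_PARTS = ("shx", "dbf", "shp")


def _http_download(url, target):
    """Default downloader (assumption: the URL is publicly readable over HTTP)."""
    with urllib.request.urlopen(url) as response, open(target, "wb") as f:
        f.write(response.read())


def fetch_shapefile(base_url, name, dest_dir=None, download=_http_download):
    """Download <name>.shx/.dbf/.shp from base_url; return the local .shp path."""
    dest = Path(dest_dir) if dest_dir else Path(tempfile.mkdtemp())
    dest.mkdir(parents=True, exist_ok=True)
    for ending in SHAPEFILE_PARTS:
        filename = f"{name}.{ending}"
        download(f"{base_url}/{filename}", dest / filename)
    return dest / f"{name}.shp"


def read_remote_shapefile(base_url, name, **kwargs):
    """Fetch all parts into a temporary directory, then open them with geopandas."""
    import geopandas  # imported lazily so fetch_shapefile works without it

    return geopandas.read_file(fetch_shapefile(base_url, name), **kwargs)
```

With the base and name variables from the snippet above, usage would be read_remote_shapefile(base, name). The download argument is there so a BlobClient-based downloader can be swapped in if the container ever requires credentials.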
now working on a way how to get more files from blob into xarray
should be: reading with geopandas
now: manual download with !wget and then geopandas.read_file(.shp or .shp)
idea to fix: load azure blob directly into geopandas