intake / intake_geopandas

An intake plugin for loading datasets with geopandas
BSD 2-Clause "Simplified" License

intake_regionmask #20

Closed by aaronspring 3 years ago

aaronspring commented 3 years ago

I really like https://github.com/mathause/regionmask. A regionmask region is essentially a geopandas GeoDataFrame read from a shapefile, with designated column names for the region name, abbreviation, and so on; those column names would be specified in the YAML file. It's quite specific, but I think this could be a new driver that mostly inherits from GeoPandasFileSource.

martindurant commented 3 years ago

Sounds good to me! If you end up adding new drivers, please do update the entry for intake_geopandas at https://intake.readthedocs.io/en/latest/plugin-directory.html

aaronspring commented 3 years ago

Adding regionmask to intake_geopandas would add many dependencies: https://github.com/mathause/regionmask/blob/master/requirements.txt I guess it would be better to create intake_regionmask, wouldn't it?

In the YAML I would specify: location, names, abbrevs, numbers.

This is how regionmask works:

gdf = geopandas.read_file(location)
regionmask.region(gdf, names, abbrevs, numbers)

where names, abbrevs, and numbers are the names of gdf columns.
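
A minimal sketch of those two steps, assuming the intended call is regionmask.from_geopandas (written above as regionmask.region) and that names, abbrevs, and numbers are column names in the file. load_regions is a hypothetical helper name, and the imports sit inside the function so that it can be defined without the optional packages installed:

```python
def load_regions(location, names=None, abbrevs=None, numbers=None):
    """Read a vector file and wrap it as regionmask regions.

    `names`, `abbrevs` and `numbers` are column names in the file.
    Hypothetical helper, not part of intake_geopandas or regionmask.
    """
    # Imported lazily: both packages are optional, fairly heavy dependencies.
    import geopandas
    import regionmask

    gdf = geopandas.read_file(location)
    # Assumption: from_geopandas is the call the comment above refers to.
    return regionmask.from_geopandas(
        gdf, names=names, abbrevs=abbrevs, numbers=numbers
    )
```

It would be called as load_regions("regions.shp", names="name", abbrevs="abbrev", numbers="id"), where the file name and column names are placeholders.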

martindurant commented 3 years ago

Intake drivers should be structured to only import their requirements when instantiating the driver or importing the specific module, so it can be OK for it to need many extra packages, so long as this is documented, of course.
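
One way to get that behaviour is sketched below, as an illustration only. The names require, LazyDriverSketch, and RegionmaskSourceSketch are hypothetical and not part of intake or intake_geopandas. The optional packages are imported when a source is instantiated, so importing the module that defines the driver stays cheap:

```python
import importlib


def require(*names):
    """Import the named packages and return them as a {name: module} dict.

    Raises ImportError naming the first package that is not installed.
    """
    modules = {}
    for name in names:
        try:
            modules[name] = importlib.import_module(name)
        except ImportError as err:
            raise ImportError(
                f"this driver needs the optional package {name!r}"
            ) from err
    return modules


class LazyDriverSketch:
    """Hypothetical base class: dependencies load at instantiation only."""

    requirements = ()  # package names, overridden by subclasses

    def __init__(self, urlpath):
        # Nothing heavy is imported until a source object is created.
        self.modules = require(*self.requirements)
        self.urlpath = urlpath


class RegionmaskSourceSketch(LazyDriverSketch):
    """Hypothetical regionmask source; a real one might extend GeoPandasFileSource."""

    requirements = ("geopandas", "regionmask")
```

Defining RegionmaskSourceSketch does not touch geopandas or regionmask; only creating an instance does, and a missing package then surfaces as an ImportError that names it.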

ian-r-rose commented 3 years ago

@aaronspring I'm unfamiliar with regionmask, so I don't have much perspective on whether it is more appropriate to include it as a separate package or in this repo. But I second @martindurant's suggestion that the extra requirements could be imported upon instantiation of a regionmask source. We could also include a regionmask entry under extras_require in the package specification.
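
For illustration, the optional dependency group might look roughly like this in a setuptools-based setup.py. The group name "regionmask" and its unpinned contents are assumptions, setup_kwargs is a hypothetical helper, and the repo's real setup.py carries more arguments than shown here:

```python
def setup_kwargs():
    """Keyword arguments for setuptools.setup (hypothetical excerpt)."""
    return {
        "name": "intake_geopandas",
        # Optional groups, installed with: pip install "intake_geopandas[regionmask]"
        "extras_require": {
            "regionmask": ["regionmask"],  # assumption: no version pin
        },
    }


# In setup.py this would be used as:
#     from setuptools import setup
#     setup(**setup_kwargs())
```

Users who only need the existing geopandas drivers would then install the package as before and never pull in regionmask.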