nsidc / earthaccess

Python Library for NASA Earthdata APIs
https://earthaccess.readthedocs.io/
MIT License

Add method to convert results enabling GeoDataFrame.explore() to be used for creation of interactive maps #305

Open ebolch opened 1 year ago

ebolch commented 1 year ago

I have a rough example in this notebook. The end goal would be to have a simple function that would create a geodataframe and then an interactive plot, like:

results = earthaccess.search_data()
earthaccess.explore(results)

A few additional options could be things like plotting browse images or a way to limit the number of polygons/images displayed. This output is the interactive map from the above notebook after removing a few layers: [image: concurrent_data_snip2]

MattF-NSIDC commented 1 year ago

Thanks for the suggestion, that's a really cool idea that would be great for improving data accessibility!

We'll want to consider how this would affect earthaccess's dependencies.

jhkennedy commented 1 year ago

I think we could easily support something like this either as a plugin, or as optional dependencies following a pattern like:

try:
    import optional_package
except ImportError:
    optional_package = None


def function_that_uses_optional_package():
    if optional_package is None:
        raise ImportError('Optional Package is required to call this function')

    # function code...

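As a slightly more concrete sketch of the same optional-dependency pattern, the import can be deferred until the feature is actually used, with an error message that tells the user what to install. The helper names `require_optional` and `to_geodataframe` and the `explore` extra are hypothetical and are not part of earthaccess; `gpd.GeoDataFrame.from_features` is the existing geopandas constructor.

```python
import importlib
from types import ModuleType


def require_optional(module_name: str, extra: str) -> ModuleType:
    """Import an optional dependency, or raise an ImportError naming the extra to install.

    The 'earthaccess[<extra>]' install hint is an assumption for illustration.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        raise ImportError(
            f"'{module_name}' is required for this feature. "
            f"Install it with: pip install 'earthaccess[{extra}]'"
        ) from err


def to_geodataframe(features):
    """Hypothetical helper: geopandas is only imported when this is called."""
    gpd = require_optional("geopandas", extra="explore")
    return gpd.GeoDataFrame.from_features(features, crs="EPSG:4326")
```

With this shape, importing the package never fails because geopandas is missing; only calling `to_geodataframe` does.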
andypbarrett commented 1 year ago

I like this idea but we should consider how it would work for all data granules. I'm thinking of ICESat-2 and other track data. Maybe we define these types of data as multi-lines.
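To illustrate the multi-line idea, here is a minimal sketch that turns track-style geometry into a GeoJSON `LineString` or `MultiLineString`. It assumes the granule's UMM-G geometry carries a `Lines` entry whose items hold `Points` with `Longitude`/`Latitude` keys; the function name `track_geometry` and the sample coordinates are made up for illustration, and whether a given track dataset actually populates `Lines` would need to be checked per collection.

```python
from typing import Optional


def track_geometry(umm_geometry: dict) -> Optional[dict]:
    """Convert the 'Lines' entry of a UMM-G Geometry dict to a GeoJSON geometry.

    Returns None when the granule has no line geometry.
    """
    lines = [
        [[pt["Longitude"], pt["Latitude"]] for pt in line["Points"]]
        for line in umm_geometry.get("Lines", [])
    ]
    if not lines:
        return None
    if len(lines) == 1:
        return {"type": "LineString", "coordinates": lines[0]}
    return {"type": "MultiLineString", "coordinates": lines}


# Made-up geometry with two track segments.
example_geometry = {
    "Lines": [
        {"Points": [{"Longitude": -50.0, "Latitude": 68.0},
                    {"Longitude": -50.5, "Latitude": 70.0}]},
        {"Points": [{"Longitude": -48.0, "Latitude": 68.0},
                    {"Longitude": -48.5, "Latitude": 70.0}]},
    ]
}
track = track_geometry(example_geometry)
```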

scottyhq commented 11 months ago

This would be really useful! Is there currently a recommended approach to go from earthaccess search results to a geodataframe?

For example:

gf = gpd.GeoDataFrame.from_features(results.geojson(), crs='EPSG:4326')

Note: results.to_dict() is the syntax Microsoft Planetary Computer uses (https://planetarycomputer.microsoft.com/docs/quickstarts/reading-stac/), but results.geojson() is what asf_search does (https://github.com/asfadmin/Discovery-asf_search/blob/bee8d92631780a988cbb1988266270241ebea23f/asf_search/ASFSearchResults.py#L23).

It seems a straightforward approach could be to include STAC as an output from CMR (https://cmr.earthdata.nasa.gov/search/site/docs/search/api.html), which is a GeoJSON feature collection and can therefore be easily converted to a geodataframe.

jhkennedy commented 6 months ago

gf = gpd.GeoDataFrame.from_features(results.geojson(), crs='EPSG:4326')

:clap: yes, I'd love to be able to do this line and use it frequently with asf_search. @iamdonovan also recently asked about this [edited for clarity]:

should I suggest/request a "to_dataframe" method?

ebolch commented 6 months ago

This somewhat exists in the explore branch that @betolink was working on, although it's part of the SearchWidget class. I think it would be valuable even separate from the integration of an explore feature. There's a slightly modified version of that code in this module that LP DAAC used for a tutorial.

scottyhq commented 1 month ago

I took a stab at directly integrating @ebolch's code into earthaccess here, and could open a PR if this seems like a useful feature and approach: https://github.com/nsidc/earthaccess/compare/main...scottyhq:earthaccess:togeopandas

But I have a feeling that supporting STAC-formatted returns is a more robust approach (https://github.com/nsidc/earthaccess/discussions/221)? Specifically, if the search returns a FeatureCollection in the first place there is no need for coercing CMR Polygons to shapely geometries.
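For reference, the coercion step in question looks roughly like the sketch below, which builds a GeoJSON FeatureCollection from granule records using only the standard library. It assumes each record is a dict with a `umm` key following the UMM-G layout (`SpatialExtent` > `HorizontalSpatialDomain` > `Geometry` > `GPolygons`); the function names and the sample granule are made up, and exclusion zones (holes), bounding rectangles, and points are ignored for brevity.

```python
def granule_to_feature(granule: dict) -> dict:
    """Build a GeoJSON Feature from the GPolygons of one UMM-G style record."""
    umm = granule["umm"]
    geometry = umm["SpatialExtent"]["HorizontalSpatialDomain"]["Geometry"]
    polygons = []
    for gpoly in geometry.get("GPolygons", []):
        ring = [[p["Longitude"], p["Latitude"]] for p in gpoly["Boundary"]["Points"]]
        if ring[0] != ring[-1]:
            ring.append(list(ring[0]))  # GeoJSON rings must be closed
        polygons.append([ring])  # exterior ring only; holes are not handled here
    if not polygons:
        geom = None
    elif len(polygons) == 1:
        geom = {"type": "Polygon", "coordinates": polygons[0]}
    else:
        geom = {"type": "MultiPolygon", "coordinates": polygons}
    return {
        "type": "Feature",
        "geometry": geom,
        "properties": {"GranuleUR": umm.get("GranuleUR")},
    }


def to_feature_collection(granules) -> dict:
    """Wrap the per-granule features in a GeoJSON FeatureCollection."""
    return {"type": "FeatureCollection",
            "features": [granule_to_feature(g) for g in granules]}


# Made-up record with a single, unclosed four-point boundary.
sample = {"umm": {"GranuleUR": "EXAMPLE_GRANULE", "SpatialExtent": {
    "HorizontalSpatialDomain": {"Geometry": {"GPolygons": [{"Boundary": {"Points": [
        {"Longitude": -120.0, "Latitude": 34.0},
        {"Longitude": -119.0, "Latitude": 34.0},
        {"Longitude": -119.0, "Latitude": 35.0},
        {"Longitude": -120.0, "Latitude": 35.0},
    ]}}]}}}}}
fc = to_feature_collection([sample])
```

If geopandas is installed, `gpd.GeoDataFrame.from_features(fc["features"], crs="EPSG:4326")` should accept this output. A STAC or GeoJSON response from the search itself would make this conversion unnecessary.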

ebolch commented 4 weeks ago

I'm not sure that the implementation of an explore function that I originally had in mind is within the scope of earthaccess. But I agree with @scottyhq that the geopandas output is really useful. It gets users a step closer to more complex filtering and organizes the data in a readable structure for processing. Currently, for LP datasets with multiple assets per granule, filtering and opening only the desired assets often requires convoluted string matching and list comprehensions to pull additional fields from the UMM metadata and winnow the granule links. Doing this stuff is tough for newer users and I think a geodataframe would be easier for them to work with.
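A rough sketch of the tabular structure this points toward: one row per asset, so that filtering becomes a simple column test instead of nested string matching. It assumes UMM-G style records whose `RelatedUrls` entries carry `URL` and `Type` keys; the function name `asset_rows`, the granule name, and the example.com URLs are made up for illustration.

```python
def asset_rows(granules, link_type: str = "GET DATA") -> list:
    """Flatten granule records into one row (dict) per downloadable asset."""
    rows = []
    for granule in granules:
        umm = granule["umm"]
        for related in umm.get("RelatedUrls", []):
            if related.get("Type") != link_type:
                continue  # skip browse images, documentation links, etc.
            url = related["URL"]
            rows.append({
                "granule": umm.get("GranuleUR"),
                "asset": url.rsplit("/", 1)[-1],  # file name part of the URL
                "url": url,
            })
    return rows


# Made-up multi-asset granule.
granules = [{"umm": {"GranuleUR": "EXAMPLE.T10SEG.2021001", "RelatedUrls": [
    {"URL": "https://example.com/data/EXAMPLE.T10SEG.2021001.B04.tif", "Type": "GET DATA"},
    {"URL": "https://example.com/data/EXAMPLE.T10SEG.2021001.B05.tif", "Type": "GET DATA"},
    {"URL": "https://example.com/data/EXAMPLE.T10SEG.2021001.Fmask.tif", "Type": "GET DATA"},
    {"URL": "https://example.com/browse/EXAMPLE.T10SEG.2021001.jpg",
     "Type": "GET RELATED VISUALIZATION"},
]}}]

rows = asset_rows(granules)
# Keep only the assets of interest with a single suffix test.
wanted = [r for r in rows if r["asset"].endswith((".B04.tif", ".Fmask.tif"))]
```

The same rows could be handed to `pandas.DataFrame(rows)`, or joined with the granule geometry in a geodataframe, for further filtering.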