nsidc / earthaccess

Python Library for NASA Earthdata APIs
https://earthaccess.readthedocs.io/
MIT License

Eliminate list of lists in data_links call #493

Open alexishunzinger opened 3 months ago

alexishunzinger commented 3 months ago

[granule.data_links(access="direct") for granule in results] returns a list of 1-item lists that contain the S3 URLs.

Result:

[['s3://gesdisc-cumulus-prod-protected/OCO2_DATA/OCO2_L2_Lite_FP.11.1r/2020/oco2_LtCO2_200704_B11100Ar_230603215457s.nc4'],
 ['s3://gesdisc-cumulus-prod-protected/OCO2_DATA/OCO2_L2_Lite_FP.11.1r/2020/oco2_LtCO2_200705_B11100Ar_230603215543s.nc4']]

I want the URLs as plain strings in a single flat list. Indexing the first element of each call's result gives the desired output: [granule.data_links(access="direct")[0] for granule in results]

Result:

['s3://gesdisc-cumulus-prod-protected/OCO2_DATA/OCO2_L2_Lite_FP.11.1r/2020/oco2_LtCO2_200704_B11100Ar_230603215457s.nc4',
 's3://gesdisc-cumulus-prod-protected/OCO2_DATA/OCO2_L2_Lite_FP.11.1r/2020/oco2_LtCO2_200705_B11100Ar_230603215543s.nc4']

This is a workaround any user can apply, but is it kosher to make this change in the earthaccess sample code? Would it cause any issues?

mfisher87 commented 3 months ago

I believe they are returned as a list because a granule can have many data links. For the data I've used, I believe it's always been 1 data link, and I'm not sure how common it is to have multiple. Maybe itertools.chain.from_iterable for this?

import itertools
links = list(itertools.chain.from_iterable(granule.data_links(access="direct") for granule in results))
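
A minimal sketch comparing the two approaches discussed in this thread, assuming a granule may carry more than one data link. FakeGranule and the example-bucket URLs are hypothetical stand-ins so the snippet runs without an Earthdata login; only the list return shape of data_links is taken from the output shown above. Note that itertools.chain.from_iterable is the variant that flattens an iterable of lists; passing a single generator to itertools.chain would yield the inner lists unchanged.

```python
from itertools import chain


class FakeGranule:
    """Hypothetical stand-in for an earthaccess granule; not part of the library."""

    def __init__(self, links):
        self._links = links

    def data_links(self, access="direct"):
        # Mimics the list-of-URLs return shape shown in this thread.
        return list(self._links)


# Hypothetical search results: one single-link granule, one with two links.
results = [
    FakeGranule(["s3://example-bucket/a.nc4"]),
    FakeGranule(["s3://example-bucket/b.nc4", "s3://example-bucket/b.xml"]),
]

# Indexing with [0] keeps only the first link of each granule.
first_only = [g.data_links(access="direct")[0] for g in results]

# chain.from_iterable flattens one level and keeps every link.
all_links = list(chain.from_iterable(g.data_links(access="direct") for g in results))

print(first_only)
print(all_links)
```

With single-link granules both forms give the same flat list. They differ only when a granule has several links, where [0] silently drops the extras, or none, where [0] raises an IndexError.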