yannforget / landsatxplore

Search and download Landsat scenes from EarthExplorer.
MIT License

acquisition_date seems to not exist in metadata #35

Closed veroandreo closed 3 years ago

veroandreo commented 3 years ago

While updating the GRASS GIS addon i.landsat.download to support the recent changes in landsatxplore (Thanks much!!!), I found out that the date metadata has changed for both Collection 1 and Collection 2 Landsat data. There's no such thing as acquisitionDate or acquisition_date anymore; instead there is a dictionary temporalCoverage with start and end date keys, or a different key called publishDate. See https://github.com/OSGeo/grass-addons/pull/450.

Where did you find the acquisition_date variable when updating the README in https://github.com/yannforget/landsatxplore/commit/f44f12232360b8cea5d4057c331b30245cb4db06 ? If I perform the search and print the scenes, I do not see it.

yannforget commented 3 years ago

Thanks for the bug report!

That is weird. It works for me:

scenes = api.search(
    dataset='landsat_ot_c2_l2',
    longitude=-78.77428134,
    latitude=35.68792712,
    start_date='2015-01-01',
    end_date='2015-07-01',
    max_cloud_cover=80
)

for scene in scenes:
    print(scene["acquisition_date"])

Output:

2015/06/30
2015/06/21
2015/06/14
2015/05/29
2015/05/20
2015/05/13
2015/05/04
2015/04/27
2015/04/18
2015/04/02
2015/03/26
2015/03/17
2015/03/01
2015/02/13
2015/02/06
2015/01/28
2015/01/21
2015/01/05

EarthExplorer provides two types of metadata for scenes: summary and full. temporalCoverage is always available. acquisition_date is only available when requesting metadata with metadataType=full. Maybe the response differs depending on the account permissions...
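For code that cannot rely on acquisition_date being present, a defensive lookup is one option. This is only a minimal sketch and not part of landsatxplore: the helper name scene_date is hypothetical, and the startDate key inside temporalCoverage is an assumption about the summary metadata layout.

```python
def scene_date(scene):
    """Return a date value for a scene dict, or None if nothing usable is found."""
    # Prefer the explicit acquisition date when the full metadata is available.
    for key in ("acquisition_date", "acquisitionDate"):
        value = scene.get(key)
        if value:
            return value
    # Fall back to the start of the temporal coverage.
    # Assumption: the dictionary stores it under "startDate".
    coverage = scene.get("temporalCoverage") or {}
    return coverage.get("startDate")


# Hand-written sample dicts, not real API responses.
full = {"acquisition_date": "2015/06/30"}
summary = {"temporalCoverage": {"startDate": "2015-06-30", "endDate": "2015-06-30"}}
print(scene_date(full), scene_date(summary))
```

If the key names differ for a given account or dataset, only the tuple and the fallback key need to change.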

yannforget commented 3 years ago

v0.12.1 should fix it

veroandreo commented 3 years ago

> Thanks for the bug report!
>
> That is weird. It works for me:
>
> scenes = api.search(
>     dataset='landsat_ot_c2_l2',
>     longitude=-78.77428134,
>     latitude=35.68792712,
>     start_date='2015-01-01',
>     end_date='2015-07-01',
>     max_cloud_cover=80
> )
>
> for scene in scenes:
>     print(scene["acquisition_date"])
>
> Output:
>
> 2015/06/30
> 2015/06/21
> ...
>
> EarthExplorer provides two types of metadata for scenes: summary and full. temporalCoverage is always available. acquisition_date is only available when requesting metadata with metadataType=full. Maybe the response differs depending on the account permissions...

mhm... I guess I have a basic/normal account, I can't reproduce the above print()... Where is this metadataType setting? Is it something I can set with landsatxplore?

veroandreo commented 3 years ago

That was fast!! Thanks!

veroandreo commented 3 years ago

I have just discovered that acquisition_date is not there for Landsat 8 Collection 2 Level 1 data, and it yields the value 'NADIR' for Landsat 8 Collection 1 data... so it doesn't seem very reliable...

Try this, for example:

scenes = api.search(
    dataset='landsat_8_c1',
    longitude=-78.77428134,
    latitude=35.68792712,
    start_date='2018-08-24',
    end_date='2018-12-21',
    max_cloud_cover=15)

for scene in scenes:
    print(scene["acquisition_date"])

and this:

scenes = api.search(
    dataset='landsat_ot_c2_l1',
    longitude=-78.77428134,
    latitude=35.68792712,
    start_date='2018-08-24',
    end_date='2018-12-21',
    max_cloud_cover=15)

for scene in scenes:
    print(scene["acquisition_date"])

Not to mention that, when present, it's not properly formatted as a date, hence it cannot be used for sorting directly... sniff :(
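As a stopgap on the caller's side, the strings could be parsed before sorting. A minimal sketch, assuming the YYYY/MM/DD format seen in the output earlier in this thread; both helper names are hypothetical and the sample dicts are hand-written, not real search results.

```python
from datetime import datetime


def parse_acquisition_date(value, fmt="%Y/%m/%d"):
    """Parse a string such as '2015/06/30'; return None for anything else."""
    try:
        return datetime.strptime(value, fmt)
    except (TypeError, ValueError):
        # TypeError: the key was missing (None); ValueError: e.g. 'NADIR'.
        return None


def sort_by_acquisition_date(scenes):
    """Return the scenes that have a parseable date, oldest first."""
    dated = [(parse_acquisition_date(s.get("acquisition_date")), s) for s in scenes]
    usable = [(d, s) for d, s in dated if d is not None]
    return [s for d, s in sorted(usable, key=lambda pair: pair[0])]


scenes = [
    {"acquisition_date": "2015/06/30"},
    {"acquisition_date": "NADIR"},
    {"acquisition_date": "2015/01/05"},
    {},
]
ordered = sort_by_acquisition_date(scenes)
print([s["acquisition_date"] for s in ordered])
```

Scenes without a usable date are dropped here; keeping them at the end of the list would be an equally reasonable choice.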

griembauer commented 3 years ago

Hi all! I am trying to update i.sentinel.download with the new landsatxplore version, similarly to you @veroandreo. For Sentinel-2 I have the same problem: acquisition_date is no longer in the metadata. However, there are now acquisition_date_start and acquisition_date_end. Since they are pretty close, I think either of them would do the trick. Maybe this value exists for the collection you are searching for?

yannforget commented 3 years ago

Hi all and thanks for the inputs,

Yes, metadata field names and date formats are all over the place for now. I'm currently working on an update that harmonizes metadata field names and data types across the datasets, with automatic conversion to shapely Polygons for spatial objects and python datetimes for temporal values.

So in the next release (should be today) acquisition_date will always return a valid python datetime.

veroandreo commented 3 years ago

> Hi all and thanks for the inputs,
>
> Yes, metadata field names and date formats are all over the place for now. I'm currently working on an update that harmonizes metadata field names and data types across the datasets, with automatic conversion to shapely Polygons for spatial objects and python datetimes for temporal values.
>
> So in the next release (should be today) acquisition_date will always return a valid python datetime.

This is great news, @yannforget! We appreciate it!! Please be aware of https://github.com/yannforget/landsatxplore/issues/35#issuecomment-786117117: acquisition_date is not present for Landsat 8 Collection 2 Level 1 data, and it yields the value 'NADIR' for Landsat 8 Collection 1 data.

yannforget commented 3 years ago

The issue should be fixed in v0.13.0 available on pypi. acquisition_date will now always return a datetime regardless of the dataset. temporal_coverage is a list of 2 datetimes (start and end). Note that the acquisitionDate and temporalCoverage keys do not exist anymore.
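With datetimes in place, sorting and unpacking the coverage should need no extra parsing. A minimal sketch on hand-written sample dicts that only mimic the described v0.13.0 layout; the display_id values are placeholders, not real scene identifiers.

```python
from datetime import datetime

# Sample data shaped like the description above, not real search results.
scenes = [
    {
        "display_id": "scene_a",
        "acquisition_date": datetime(2015, 6, 14),
        "temporal_coverage": [datetime(2015, 6, 14), datetime(2015, 6, 14)],
    },
    {
        "display_id": "scene_b",
        "acquisition_date": datetime(2015, 6, 30),
        "temporal_coverage": [datetime(2015, 6, 30), datetime(2015, 6, 30)],
    },
]

# datetime objects compare directly, so they can serve as the sort key.
latest_first = sorted(scenes, key=lambda s: s["acquisition_date"], reverse=True)

# temporal_coverage is a two-element list: start and end.
start, end = latest_first[0]["temporal_coverage"]
print(latest_first[0]["display_id"], start.date(), end.date())
```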

veroandreo commented 3 years ago

Thanks @yannforget :)

Before updating i.landsat.download once again:

@griembauer these changes might affect i.sentinel.download proposed changes https://github.com/OSGeo/grass-addons/pull/419 as well

yannforget commented 3 years ago

Yes, the NADIR bug is fixed now

Regarding i.landsat.download, you would have to update:

  1. The metadata dictionary keys: entityId -> entity_id, displayId -> display_id, acquisitionDate -> acquisition_date, cloudCover -> cloud_cover, etc.
  2. The calls to ee.download as the scene_id= argument has been replaced by identifier=
  3. The post-processing of the metadata values. Cloud cover is now returned as a float and acquisition date as a python datetime
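
Points 1 and 3 above could look roughly like this on the addon side. This is only a sketch on hand-written dicts: the helpers to_new_keys and describe are hypothetical, displayId is assumed to be the old name of display_id, and the mapping covers only the keys listed above.

```python
from datetime import datetime

# Old (camelCase) -> new (snake_case) key names; displayId is an assumption.
KEY_RENAMES = {
    "entityId": "entity_id",
    "displayId": "display_id",
    "acquisitionDate": "acquisition_date",
    "cloudCover": "cloud_cover",
}


def to_new_keys(old_scene):
    """Rename known old keys; leave unknown keys untouched."""
    return {KEY_RENAMES.get(key, key): value for key, value in old_scene.items()}


def describe(scene):
    """Format a new-style scene: datetime and float need no string parsing."""
    date = scene["acquisition_date"].strftime("%Y-%m-%d")
    return f"{scene['display_id']} {date} cloud={scene['cloud_cover']:.1f}%"


scene = {
    "display_id": "scene_a",
    "acquisition_date": datetime(2018, 9, 1),
    "cloud_cover": 3.0,
}
print(describe(scene))
```

Note that to_new_keys only renames keys; it does not convert old string values into datetimes or floats.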

veroandreo commented 3 years ago

Thanks a lot @yannforget !