Element84 / earth-search

Earth Search information and issue tracking
https://earth-search.aws.element84.com/v1

Invalid / partial XML granule_metadata.xml files #6

Open piyushrpt opened 1 year ago

piyushrpt commented 1 year ago

We are noticing an increasing number of partial, corrupted, or otherwise invalid XML metadata files over the last few days. The imagery is fine, but when we try to retrieve information from the granule_metadata.xml files, they are usually truncated or corrupted compared to the ones in the original scihub granules. This is possibly an issue with copying them out in chunks from the original source.

Example: S2A_19VDG_20230604_0_L2A
Item: https://earth-search.aws.element84.com/v1/collections/sentinel-2-l2a/items/S2A_19VDG_20230604_0_L2A
Metadata file: https://sentinel-cogs.s3.us-west-2.amazonaws.com/sentinel-s2-l2a-cogs/19/V/DG/2023/6/S2A_19VDG_20230604_0_L2A/granule_metadata.xml
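
For anyone who wants to screen their own granules, a truncated file can usually be detected by checking whether it parses as a complete XML document. The following is a minimal sketch using only the Python standard library; the function name `is_well_formed_xml` is made up for illustration, and the check only establishes well-formedness, not that every expected metadata field is present.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(data: bytes) -> bool:
    """Return True if `data` parses as a complete XML document.

    A file cut off partway through (missing closing tags, or empty)
    makes the parser raise ParseError, which is reported as False.
    """
    try:
        ET.fromstring(data)
    except ET.ParseError:
        return False
    return True
```

The bytes to check could come from, for example, `urllib.request.urlopen(url).read()` on the granule_metadata.xml URL of an item.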

piyushrpt commented 1 year ago

At current count, more than 3500 granules have been impacted over the last week.

piyushrpt commented 1 year ago

Looks like another 750-800 scenes have this issue since the last time I reported. Is there an alternative way to handle this, other than going back to the scihub granules for the XML metadata? This is holding up analytics pipelines that rely on detailed metadata from the XML file.

piyushrpt commented 1 year ago

Any updates on this?

This had also been reported earlier here: https://github.com/cirrus-geo/cirrus-earth-search/issues/39

matthewhanson commented 1 year ago

@piyushrpt Currently resolving the other issues but will be looking at this problem this week.

tonykgill commented 12 months ago

Hi folks,

I'm doing some backprocessing for tiles over Australia and am seeing a few truncated metadata files too. List attached. I'll add more if I find any. The problem seems to be constrained to early June.

extract-2023-09-01T00_15_01.758Z.csv

This is not holding us up. We fall back to using the metadata in the Frankfurt s3 bucket, s3://sentinel-s2-l2a/tiles, which is fine for the corresponding files.

Tony
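
The fallback described above can be sketched as a mapping from an Earth Search item ID to a metadata location under s3://sentinel-s2-l2a/tiles. This is a rough sketch, not a confirmed recipe: the key layout beneath `tiles/` (zone/latitude band/grid square/year/month/day/sequence), the `metadata.xml` file name, and the assumption that the sequence number in the item ID matches the one in that bucket are not stated in this thread and should be verified against the bucket. The function name `fallback_metadata_uri` is hypothetical.

```python
import re

# Item IDs in this thread look like S2A_19VDG_20230604_0_L2A:
# platform, MGRS tile (zone + latitude band + grid square), date, sequence.
_ITEM_ID = re.compile(
    r"^S2[A-Z]_(\d{1,2})([A-Z])([A-Z]{2})_(\d{4})(\d{2})(\d{2})_(\d+)_L2A$"
)


def fallback_metadata_uri(item_id: str) -> str:
    """Build an assumed fallback URI for the tile metadata of `item_id`.

    ASSUMPTION: the layout tiles/{zone}/{band}/{square}/{year}/{month}/
    {day}/{sequence}/metadata.xml with unpadded numbers is not confirmed
    by this thread.
    """
    match = _ITEM_ID.match(item_id)
    if match is None:
        raise ValueError(f"unrecognised item ID: {item_id!r}")
    zone, band, square, year, month, day, seq = match.groups()
    # int() strips zero padding, e.g. month "06" becomes 6.
    return (
        f"s3://sentinel-s2-l2a/tiles/{int(zone)}/{band}/{square}/"
        f"{int(year)}/{int(month)}/{int(day)}/{seq}/metadata.xml"
    )
```

A pipeline could try granule_metadata.xml first and only request the fallback URI when the primary file fails to parse.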