sentinel-hub / eo-learn

Earth observation processing framework for machine learning in Python
https://eo-learn.readthedocs.io/en/latest/
MIT License

Ingesting metadata as part of eo-learn #33

Closed Spiruel closed 5 years ago

Spiruel commented 5 years ago

When creating EOPatches and filling them with Sentinel-2 data, I would also like to access the corresponding metadata for each capture date (e.g. Solar Irradiance List, U). Can this be achieved through eo-learn?

Thanks

gmilcinski commented 5 years ago

You should be able to access all information available through Sentinel Hub. For example, access to some meta-data is described here: https://www.sentinel-hub.com/faq/how-can-i-access-meta-data-information-sentinel-2-l2a (most of these are available for L1C as well). For meta-data that is not pixel-based, a bit of trickery might be required, as it could be loaded from AWS, e.g.: https://roda.sentinel-hub.com/sentinel-s2-l1c/tiles/1/K/AT/2018/7/20/0/metadata.xml
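
As a rough sketch of how such a URL could be assembled for other tiles: the helper below is hypothetical, and the path layout (zone / latitude band / grid square / year / month / day / index) is only inferred from the single example link above, so treat it as an assumption rather than a documented scheme.

```python
from datetime import date

# Base URL copied from the example link above; the path layout is inferred from it.
RODA_L1C_TILES = "https://roda.sentinel-hub.com/sentinel-s2-l1c/tiles"


def tile_metadata_url(tile_name: str, sensing_date: date, aws_index: int = 0) -> str:
    """Build the metadata.xml URL for one tile (hypothetical helper)."""
    # A tile name such as '1KAT' or '01KAT' is: UTM zone, latitude band, grid square.
    zone, band, square = int(tile_name[:-3]), tile_name[-3], tile_name[-2:]
    return (f"{RODA_L1C_TILES}/{zone}/{band}/{square}/"
            f"{sensing_date.year}/{sensing_date.month}/{sensing_date.day}/"
            f"{aws_index}/metadata.xml")


url = tile_metadata_url("1KAT", date(2018, 7, 20), 0)
print(url)
```

The resulting string can then be fetched with any HTTP client.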

I am not sure whether any of the above is already available in eo-learn (others may comment on it), but if not, it should not be too difficult to add. A pull request would certainly be welcome.

devisperessutti commented 5 years ago

You can have a look at the AddSen2CorClassificationFeature task for an example of how to retrieve meta-data using an EVALSCRIPT request.

Something like the following should work.

# Assumes bbox and time_interval are already defined.
from eolearn.io import S2L2AWCSInput
from sentinelhub import CustomUrlParam

# Evalscript that returns the sun azimuth angles instead of band values
evalscript = 'return [sunAzimuthAngles]'
custom_url_params = {CustomUrlParam.EVALSCRIPT: evalscript}
eop_l2a = S2L2AWCSInput('BANDS-S2-L2A',
                        feature='SUN-AZIMUTH-ANGLES',
                        resx='10m', resy='10m',
                        maxcc=0.8,
                        custom_url_params=custom_url_params).execute(time_interval=time_interval,
                                                                     bbox=bbox)

Spiruel commented 5 years ago

Thank you for your replies.

@devisperessutti, using an EVALSCRIPT request has been successful in getting hold of sunZenithAngles.

However, I am also interested in non-pixel-based metadata, e.g. from MTD_MSIL1C.xml. Following @gmilcinski's suggestion, I managed to obtain these values using AwsProductRequest in sentinelhub-py. However, this required a product ID in advance, and it needlessly downloaded all the other non-metadata files along with it.

Is there a way to easily grab 'U' and 'SOLAR_IRRADIANCE' metadata using eo-learn?

devisperessutti commented 5 years ago

Hello.

At the moment these features are not exposed by the SH service, so the process to get them is a bit more convoluted. However, the following code should help you set up a task that retrieves those values by querying only the metadata (no other files are downloaded):

from sentinelhub import AwsProductRequest, AwsTileRequest, DataSource, WebFeatureService

# Assumes bbox and time_interval are already defined.
# get list of tiles for bbox and time interval
wfs_iterator = WebFeatureService(bbox, time_interval, data_source=DataSource.SENTINEL2_L1C,
                                 maxcc=1.0)
# for each tile retrieve product name and query metadata to retrieve tags of interest
for tile in wfs_iterator.get_tiles():
    product_id = AwsTileRequest(tile=tile[0], time=tile[1], aws_index=tile[2], bands=[],
                                metafiles=['productInfo'],
                                data_source=DataSource.SENTINEL2_L1C).get_data()[0]['name']
    metadata = AwsProductRequest(product_id=product_id, bands=[], metafiles=['metadata']).get_data()
    # positional lookup into the parsed XML tree; depends on the element order in the file
    u_branch = metadata[0][0][1][4][0]
    solar_irradiance_list = metadata[0][0][1][4][1]
    print(f'tile is {tile[0]} {tile[1]} {tile[2]}')
    print(f'{u_branch.tag} is {u_branch.text}')
    print(f'{solar_irradiance_list.tag} is {solar_irradiance_list[0].text}\n\n')
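
The positional indices used for u_branch and solar_irradiance_list depend on the exact element order in the XML. A possible alternative, sketched below, is to look elements up by tag name. The XML fragment is made up for illustration and only assumes that the tags are called U and SOLAR_IRRADIANCE with a bandId attribute; `local_name` and `find_all` are hypothetical helpers, not part of sentinelhub-py.

```python
import xml.etree.ElementTree as ET

# Made-up fragment for illustration; real product metadata is much larger.
SAMPLE_XML = """<root>
  <General_Info>
    <Product_Image_Characteristics>
      <Reflectance_Conversion>
        <U>1.03</U>
        <Solar_Irradiance_List>
          <SOLAR_IRRADIANCE bandId="0">1900.0</SOLAR_IRRADIANCE>
          <SOLAR_IRRADIANCE bandId="1">1950.0</SOLAR_IRRADIANCE>
        </Solar_Irradiance_List>
      </Reflectance_Conversion>
    </Product_Image_Characteristics>
  </General_Info>
</root>"""


def local_name(tag: str) -> str:
    """Strip an optional '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def find_all(root: ET.Element, name: str) -> list:
    """Return every element (root included) whose local tag equals `name`."""
    return [el for el in root.iter() if local_name(el.tag) == name]


root = ET.fromstring(SAMPLE_XML)
u_value = float(find_all(root, "U")[0].text)
irradiance = {el.get("bandId"): float(el.text)
              for el in find_all(root, "SOLAR_IRRADIANCE")}
print(u_value, irradiance)
```

If the object returned by get_data() is a parsed XML element, as the indexing above suggests, it could presumably be passed in place of `root`.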

You would need to be careful, though, about how you extract and combine values for patches at the intersection of different products (this is normally handled by the service).
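
One possible policy for that case, sketched here with made-up numbers: collect one value per tile, group by acquisition date, and fail loudly if tiles from the same date disagree. `combine_by_date` is a hypothetical helper, and this is not necessarily how the service itself resolves overlaps.

```python
from collections import defaultdict
from datetime import date


def combine_by_date(records, tolerance=1e-6):
    """Collapse (date, value) pairs to one value per date.

    Raises ValueError when values from the same date differ by more than
    `tolerance`, so inconsistencies are surfaced rather than silently averaged.
    """
    grouped = defaultdict(list)
    for sensing_date, value in records:
        grouped[sensing_date].append(value)

    combined = {}
    for sensing_date, values in grouped.items():
        if max(values) - min(values) > tolerance:
            raise ValueError(f"Conflicting values on {sensing_date}: {values}")
        combined[sensing_date] = sum(values) / len(values)
    return combined


# Illustrative numbers only, not real metadata values.
records = [(date(2018, 7, 20), 0.97), (date(2018, 7, 20), 0.97), (date(2018, 7, 25), 0.98)]
per_date = combine_by_date(records)
print(per_date)
```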

Hope this helps.

Johannes-R-Schmid commented 5 years ago

@devisperessutti what about Landsat 8 imagery? I would like to obtain, for example, "K1_CONSTANT_BAND_10". The basic code above does not seem to work there after adjusting the DataSource.

devisperessutti commented 5 years ago

Hi @Johannes-R-Schmid ,

At the moment WebFeatureService, AwsTileRequest and AwsProductRequest work with Sentinel-2 only (WebFeatureService._parse_tile_url() fails due to different name formatting).

A workaround for now would be the following:

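import requests
from sentinelhub import DataSource, WebFeatureService

# Assumes bbox and time_interval are already defined.
wfs_iterator_l8 = WebFeatureService(bbox, time_interval, data_source=DataSource.LANDSAT8, maxcc=1.0)
wfs_iterator_l8.get_dates()  # run the query first so that tile_list is available below
print(wfs_iterator_l8.tile_list)

# 'path' is used here as the URL prefix of the tile's files
path_to_a_tile = wfs_iterator_l8.tile_list[0]['properties']['path']
response = requests.get(path_to_a_tile + '_MTL.txt')
text = response.text

# print everything from K1_CONSTANT_BAND_10 up to (not including) K1_CONSTANT_BAND_11
print(text[text.find('K1_CONSTANT_BAND_10'):text.find('K1_CONSTANT_BAND_11')])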
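
If more than one value is needed, slicing with text.find gets awkward. A minimal sketch of an alternative is to parse the 'KEY = VALUE' lines into a dictionary. `parse_mtl` is a hypothetical helper, the sample text is a shortened illustration rather than a real file, and the flat dictionary assumes key names are unique across groups.

```python
def parse_mtl(text: str) -> dict:
    """Parse the 'KEY = VALUE' lines of an MTL file into a flat dict of strings."""
    values = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        key, value = key.strip(), value.strip()
        # Skip blank lines, the final END marker and GROUP/END_GROUP delimiters.
        if not sep or key in ("GROUP", "END_GROUP"):
            continue
        values[key] = value.strip('"')
    return values


# Shortened, illustrative sample; a real _MTL.txt has many more groups and keys.
SAMPLE_MTL = """GROUP = L1_METADATA_FILE
  GROUP = TIRS_THERMAL_CONSTANTS
    K1_CONSTANT_BAND_10 = 774.8853
    K1_CONSTANT_BAND_11 = 480.8883
  END_GROUP = TIRS_THERMAL_CONSTANTS
END_GROUP = L1_METADATA_FILE
END
"""

mtl = parse_mtl(SAMPLE_MTL)
k1_band_10 = float(mtl["K1_CONSTANT_BAND_10"])
print(k1_band_10)
```

With the workaround above, `parse_mtl(response.text)` would take the place of the sample string.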

Alternatively, you could download the metadata from the S3 bucket (more info here) using boto3. Refer to this post for more info.

We'll let you know if we find a better way.

Johannes-R-Schmid commented 5 years ago

Hi @devisperessutti

this works perfectly, thanks!

What about the "pixel_qa" layer? There is no corresponding predefined script in the WMS configurator, and I would like to use that layer to mask out clouds in LS8 data.

devisperessutti commented 5 years ago

To retrieve data from the QA layer of Landsat-8 please refer to this FAQ.

You would need to add an evalscript definition to the custom_url_params parameter when making the request.
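
Once the QA values have been retrieved, turning them into a cloud mask comes down to testing individual bits. The sketch below shows the generic bit test only. `flag_is_set` is a hypothetical helper and the bit position is an assumption for illustration: QA bands differ in their bit layout, so check the documentation of the band you actually request.

```python
def flag_is_set(qa, bit: int):
    """Return True where bit number `bit` (0 = least significant) is set in `qa`."""
    return ((qa >> bit) & 1) == 1


# Assumed bit position, for illustration only; look up the real layout of the
# QA band you request before relying on it.
EXAMPLE_CLOUD_BIT = 4

qa_values = [0b00000, 0b10000, 0b10001]
cloud_flags = [flag_is_set(value, EXAMPLE_CLOUD_BIT) for value in qa_values]
print(cloud_flags)
```

The same expression should also work elementwise on an integer NumPy array, which is how the mask would typically be applied to a whole patch.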