Closed backeb closed 2 years ago
@Jaapel I was finally able to add an experimental feature to our backend so that overviews are used when you work at lower resolutions. This is a very basic example; note the line where I set the 'experimental' feature flag, which is required to make it work. Everything runs against openeo-dev.vito.be:
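import openeo
from openeo.processes import eq

# Connect to the development backend named above (authentication, if the backend requires it, is not shown)
connection = openeo.connect("openeo-dev.vito.be")
rgb = connection.load_collection(
    "TERRASCOPE_S2_TOC_V2",
    spatial_extent={"west": 3.758216409030558, "east": 4.087806252, "south": 51.291835566, "north": 51.3927399},
    temporal_extent=["2020-03-11", "2020-03-15"],
    bands=["B04"],
    properties={"eo:cloud_cover": lambda cc: eq(cc, 50)},
)
# Enable the experimental feature flag so that overviews are used
rgb._pg.arguments["featureflags"] = {"experimental": True}
# Specify the process graph: reduce over time, resample to 80 m in EPSG:3857, then download
download = rgb.min_time().resample_spatial(resolution=80, projection=3857).download("/tmp/openeo-rgb-sen2cor-manyclouds-resampled.tiff")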
Can you integrate this in your code and do a test run on a larger scale?
By the way, can you confirm that the full processing of Spain will also work at a lower resolution? This is quite important for getting a view on data needs.
@jdries I can try it tomorrow. Today I worked on an example with all the data using the .resample method.
Do you know how both resampling and this new experimental feature work with masks or missing data? When upsampling, do NaN values affect the result?
Also, caching DataCubes causes missing-metadata errors, as described here, which makes quick iteration on larger datasets difficult. Today's run took ~4 hours to complete. If you have some time this sprint, I can guide you through how I set it up!
@Jaapel upsampling can indeed have various approaches to NaN values, but when we speed things up by using the overviews in the native products, we can't control that anymore. Also, for the scene classification, I don't really know what was used to generate them. It will perhaps be interesting to compare results.
We clearly need to work on this load_result to simplify the caching, but this experimental use of overviews also has the potential to drastically reduce that 4-hour job duration.
@jdries is there any place where I can find information about how resampling / upsampling methods work with NaN / filtered values?
It seems that both openEO and GDAL explicitly mention how NODATA/valid pixels are treated, per resampling method: https://gdal.org/programs/gdalwarp.html#cmdoption-gdalwarp-r https://processes.openeo.org/#resample_spatial
I have been searching through Sentinel-2 docs, but unfortunately cannot find which resampling method is used to generate overviews.
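To make the NaN question above concrete, here is a minimal sketch of block-average downsampling with and without NaN-aware handling. It is an illustration only: it does not reproduce what openEO, GDAL, or the Sentinel-2 overview generation actually do, and the function name `downsample_mean` and the sample grid are made up for this example.

```python
import math

def downsample_mean(grid, factor, skip_nan=True):
    """Average non-overlapping factor x factor blocks of a 2D list of floats."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            block = [grid[i][j]
                     for i in range(r, min(r + factor, rows))
                     for j in range(c, min(c + factor, cols))]
            if skip_nan:
                # NaN-aware: average only the valid pixels in the block
                block = [v for v in block if not math.isnan(v)]
            # A block with no values left stays NaN; a plain mean propagates
            # NaN whenever invalid pixels were not filtered out beforehand
            out_row.append(sum(block) / len(block) if block else math.nan)
        out.append(out_row)
    return out

nan = math.nan
grid = [
    [1.0, 3.0, nan, nan],
    [5.0, nan, nan, nan],
]
aware = downsample_mean(grid, 2, skip_nan=True)   # valid pixels only
naive = downsample_mean(grid, 2, skip_nan=False)  # NaN poisons the block
print(aware, naive)
```

With NaN-aware averaging a single masked pixel only reduces the sample count of its block, whereas with the naive mean it invalidates the whole output pixel. Which of the two behaviours applies to precomputed overviews is exactly what remains unknown in this thread.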
This is great @jdries ! Let me try to see if I can improve the masking in the algorithm.
The overarching goal is to work towards running the Aquamonitor workflow on
18 Feb 15h00 CET
We have recreated the STAC server to use an NFS-mounted PVC. We have also recreated the spark-executor/driver to mount the same NFS-enabled PVC at the /opt/workdir/ path. The Python script we created to download the data is running on the STAC server and is currently downloading the data into that NFS-enabled PVC. We believe this is the best way to give the spark-executor/driver access to the downloaded products. The remaining work is to expose the data inside the spark-executor/driver pod as an existing collection, so that @Jaapel can use his Jupyter notebook for processing the data.
Update https://github.com/c-scale-community/use-case-aquamonitor/issues/23#issuecomment-1044437607
Next steps
See above dependency
See above dependency
9 March 4-5pm
Hi all,
I have been able to test the remote S3 access to CreoDIAS in openEO, and gotten it to work.
The main next step is for INCD to get an S3 access key and secret key for use in the use case, but I guess we need to wait for the amendment to the VA?
After that, to go further: INCD (Zacarias) will have to update openEO to latest versions. Quite a lot has changed since we did the initial deploy, and I still needed a small change to get it working. Then we'll need to add a few environment variables for the connection to CreoDIAS:
AWS_S3_ENDPOINT: "s3.cloudferro.com"
AWS_DIRECT: "TRUE"
AWS_ACCESS_KEY_ID: "THE KEY ID"
AWS_SECRET_ACCESS_KEY: "SECRET"
AWS_DEFAULT_REGION: "RegionOne"
AWS_REGION: "RegionOne"
AWS_HTTPS: "YES"
AWS_VIRTUAL_HOSTING: "FALSE"
This will need to happen in a yaml file similar to this one: https://github.com/Open-EO/openeo-geotrellis-kubernetes/blob/master/kubernetes/openeo.yaml
After that, we should be able to use layers from CreoDIAS on INCD.
best regards, Jeroen
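As a small aid for the deployment step described in the comment above, the following sketch checks that all of the listed environment variables are set before openEO is started. The variable names are taken from that comment; the helper `missing_vars` and the example environment are hypothetical and not part of openEO.

```python
import os

# Variable names copied from the comment above; values are deployment-specific.
REQUIRED_VARS = [
    "AWS_S3_ENDPOINT",
    "AWS_DIRECT",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_DEFAULT_REGION",
    "AWS_REGION",
    "AWS_HTTPS",
    "AWS_VIRTUAL_HOSTING",
]

def missing_vars(env=None):
    """Return the required variable names that are unset or empty in env."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]

# Hypothetical, partially configured environment used for illustration
example_env = {
    "AWS_S3_ENDPOINT": "s3.cloudferro.com",
    "AWS_DIRECT": "TRUE",
    "AWS_HTTPS": "YES",
}
print(missing_vars(example_env))
```

Calling `missing_vars()` without arguments inspects the real process environment, which could serve as a quick sanity check inside the pod once the yaml file has been updated.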
Follow up meeting: 25 March, 12h00 CET
Requires an indexing job to be run locally to index data at provider and create the STAC metadata
I know we don't get a fresh start, but shouldn't the data be registered in STAC by the downloader? I guess it knows what it just downloaded, right? We can think of a one-time solution to register what has already been downloaded before, but that would be a one-time hack.
please update on progress related to centralised STAC catalogue service
User/Access management seems to be the greatest issue now. Would it be possible to set up an IP filter before we have proper access control? That would mean someone (INCD?) specifying IP addresses (ranges) that can access the catalogue. Just asking: It may not be needed in the end.
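The suggestion above, that the downloader registers each product as it is downloaded, could look roughly like the sketch below. It only builds a minimal STAC Item as a plain dictionary using the core STAC fields; the function name `make_stac_item`, the file path, the bounding box, and the date are hypothetical, and submitting the item to the catalogue is deliberately left out because the server's API is not described in this thread.

```python
import json
from datetime import datetime, timezone
from pathlib import PurePosixPath

def make_stac_item(product_path, bbox, acquired):
    """Build a minimal STAC Item dict for one downloaded product."""
    west, south, east, north = bbox
    path = PurePosixPath(product_path)
    return {
        "type": "Feature",
        "stac_version": "1.0.0",
        "id": path.stem,  # file name without extension as item id
        "bbox": [west, south, east, north],
        "geometry": {
            "type": "Polygon",
            # Closed ring: the first and last coordinates are identical
            "coordinates": [[[west, south], [east, south], [east, north],
                             [west, north], [west, south]]],
        },
        "properties": {"datetime": acquired.isoformat()},
        "links": [],
        "assets": {"data": {"href": str(path)}},
    }

item = make_stac_item(
    "/opt/workdir/example_product.tif",  # hypothetical file on the shared mount
    (3.75, 51.29, 4.09, 51.39),          # hypothetical bounding box
    datetime(2020, 3, 11, tzinfo=timezone.utc),
)
print(json.dumps(item, indent=2))
```

Running the same function over files that are already on disk would cover the one-time registration of earlier downloads mentioned above.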
hi Zdenek
for INCD that would be 194.210.120.0/23
best
Mario
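For reference, a minimal sketch of the kind of IP filter being discussed, using Python's standard `ipaddress` module and the INCD range given above. Where such a check would actually live (reverse proxy, ingress, or the catalogue service itself) is not decided in this thread, and the helper `is_allowed` is hypothetical.

```python
import ipaddress

# Range given in the reply above for INCD
ALLOWED_NETWORKS = [ipaddress.ip_network("194.210.120.0/23")]

def is_allowed(client_ip):
    """Return True if client_ip falls inside one of the allowed networks."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in ALLOWED_NETWORKS)

# A /23 covers 194.210.120.0 through 194.210.121.255
print(is_allowed("194.210.121.17"))  # inside the range
print(is_allowed("194.210.122.1"))   # outside the range
```

Further providers could be supported by appending their ranges to `ALLOWED_NETWORKS`.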
Should we close this one?
Yes!
From: Sebastian Luna-Valero. Sent: Tuesday, October 25, 2022 5:24:29 PM. To: c-scale-community/use-case-aquamonitor. Cc: Björn Backeberg. Subject: Re: [c-scale-community/use-case-aquamonitor] Sprint 3: 31 Jan - 4 Feb (Issue #23)
Thanks!
Notes from sprint planning meeting can be found here: https://confluence.egi.eu/display/CSCALE/2022-01-20+Planning+the+next+Aquamonitor+sprint
Objectives
The overarching goal is to work towards running the Aquamonitor workflow on
Make the data available for the use case
Progress on Notebook (MVP)
cc @gena @Jaapel @gdonvito @mariojmdavid @jopina @jorge-lip