c-scale-community / use-case-waterwatch

Apache License 2.0

Real time data #1

Open sustr4 opened 2 years ago

sustr4 commented 2 years ago

Hi! In the kick-off meeting we discussed the need for (near-)real-time data. I offered to use the DataHub Relay (DHR) that we operate at CESNET for ESA. It is a short-term global cache focused on fast redistribution. I surveyed redistribution delays in the pipeline over the past two weeks, and these are the median results:

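| Platform | Sensing -> Ingestion (hh:mm:ss) | Ingestion -> Availability at DHR (hh:mm:ss) |
|----------|---------------------------------|---------------------------------------------|
| S1       | 03:02:34                        | 00:06:05                                    |
| S2       | 03:41:21                        | 00:23:32                                    |
| S3       | 28:09:09                        | 00:05:20                                    |
| S5p      | 36:59:30                        | 00:11:09                                    |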

In this table, the first delay in each row occurs at ESA. The second occurs on the way from ESA to CESNET and shows how far availability at CESNET lags behind availability at ESA.
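To get a rough idea of the end-to-end delay from sensing to availability at the DHR, the two columns can simply be added per platform. The snippet below is a minimal sketch of that arithmetic using the median values from the table above. The helper names (`parse_hms`, `format_hms`, `total_delays`) are made up for illustration, and note that a sum of medians only approximates the true end-to-end median.

```python
from datetime import timedelta

# Median delays copied from the table above, as (sensing -> ingestion,
# ingestion -> availability at DHR), both in hh:mm:ss.
MEDIAN_DELAYS = {
    "S1": ("03:02:34", "00:06:05"),
    "S2": ("03:41:21", "00:23:32"),
    "S3": ("28:09:09", "00:05:20"),
    "S5p": ("36:59:30", "00:11:09"),
}


def parse_hms(text):
    """Parse an 'hh:mm:ss' string (hours may exceed 24) into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def format_hms(delta):
    """Format a timedelta as 'hh:mm:ss' without rolling hours over into days."""
    total = int(delta.total_seconds())
    hours, rest = divmod(total, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def total_delays(delays):
    """Add both delay columns per platform (approximate end-to-end delay)."""
    return {
        platform: format_hms(parse_hms(at_esa) + parse_hms(to_dhr))
        for platform, (at_esa, to_dhr) in delays.items()
    }


if __name__ == "__main__":
    for platform, total in total_delays(MEDIAN_DELAYS).items():
        print(f"{platform}: {total}")
```

In every row the ESA-side delay dominates; the hop from ESA to CESNET adds less than half an hour.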

Data are sourced from ESA collaborative nodes, except S5p, which is sourced from the S5 preview datahub.

Note that this applies to fresh (near-real-time) data. For archive data, we can either use what CESNET already holds (mainly the Czech Republic), build an additional archive for a selected region at CESNET, or use an additional source.

sustr4 commented 2 years ago

We discussed this histogram in today's meeting:

image

ArjenHaag commented 2 years ago

I did a quick check of latency in Google Earth Engine (GEE), and it seems to have improved substantially recently. It used to be on the order of 2-4 days, but now it is usually less than a day. See the charts below, which show latency (in hours) for Sentinel-1 and Sentinel-2 for the last 100 images (at the time of writing) available over a region in the Mekong river basin.

Sentinel-1 (difference between 'GRD_Post_Processing_stop' and GEE ingestion time) [hours]: S1

Sentinel-2 (difference between 'GENERATION_TIME' and GEE ingestion time) [hours]: S2
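For reference, the latency values in these charts come down to a simple timestamp difference. Here is a minimal sketch of that calculation, assuming both the product timestamp (e.g. 'GRD_Post_Processing_stop' or 'GENERATION_TIME') and the GEE ingestion time have already been retrieved as milliseconds since the Unix epoch. How they are fetched from GEE is not shown here, and the function names and example numbers are purely hypothetical, not real measurements.

```python
from statistics import median

MS_PER_HOUR = 3_600_000  # milliseconds in one hour


def latency_hours(product_time_ms, ingestion_time_ms):
    """Latency in hours between a product timestamp and its ingestion time.

    Both arguments are assumed to be milliseconds since the Unix epoch.
    """
    return (ingestion_time_ms - product_time_ms) / MS_PER_HOUR


def median_latency_hours(pairs):
    """Median latency in hours over (product_time_ms, ingestion_time_ms) pairs."""
    return median(latency_hours(product, ingested) for product, ingested in pairs)


if __name__ == "__main__":
    # Made-up timestamps for illustration only (not taken from the charts).
    example_pairs = [
        (0, 6 * MS_PER_HOUR),
        (0, 12 * MS_PER_HOUR),
        (0, 30 * MS_PER_HOUR),
    ]
    print(median_latency_hours(example_pairs))
```

Using the median rather than the mean keeps a few very late images from dominating the summary, which matches how the CESNET delays earlier in this thread were reported.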