ESA-PhiLab / Major-TOM

Expandable Datasets for Earth Observation
https://huggingface.co/Major-TOM

Sentinel-1 and Sentinel-2 Match #6

Open danielz02 opened 2 months ago

danielz02 commented 2 months ago

Hi! Thanks a lot for releasing the dataset! I have a question about matching S1 and S2 data.

"Based on the current processing, we estimate around 75% of the grid cells which have Sentinel-2 data will be successfully sampled with Sentinel-1 data."

I wonder how this number is calculated. By my calculation, around 50% of the S2 grid cells can be matched with an S1 sample from the same grid cell within an offset of four days. Here is how I did it:

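import pandas as pd

# merge_asof requires both frames to be sorted by the "on" key
gdf_s2.sort_values("timestamp", inplace=True)
gdf_s1.sort_values("timestamp", inplace=True)

# Nearest-in-time S1 sample per S2 sample within the same grid cell, at most 4 days apart;
# S2 rows without an S1 match are dropped
gdf_matched = pd.merge_asof(
    left=gdf_s2, right=gdf_s1, on="timestamp", by="grid_cell",
    suffixes=("_s2", "_s1"), direction="nearest",
    tolerance=pd.Timedelta(days=4), allow_exact_matches=False,
).dropna(subset=["parquet_url_s1"])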
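
To make the check reproducible without the full metadata, here is a minimal self-contained sketch of the same idea on made-up data. The grid cells, dates and URLs are hypothetical and only illustrate how the match rate falls out of `pd.merge_asof`; `match_rate` is a helper name I introduce here, not something from the Major TOM code.

```python
import pandas as pd

# Hypothetical toy metadata, not real Major TOM records.
s2 = pd.DataFrame({
    "grid_cell": ["A", "B", "C", "D"],
    "timestamp": pd.to_datetime(["2021-06-01", "2021-06-10", "2021-07-01", "2021-07-15"]),
    "parquet_url": ["s2/a", "s2/b", "s2/c", "s2/d"],
})
s1 = pd.DataFrame({
    "grid_cell": ["A", "B", "C"],
    "timestamp": pd.to_datetime(["2021-06-03", "2021-06-19", "2021-07-04"]),
    "parquet_url": ["s1/a", "s1/b", "s1/c"],
})


def match_rate(s2, s1, days):
    """Fraction of S2 rows with an S1 row in the same grid cell within `days` days."""
    matched = pd.merge_asof(
        s2.sort_values("timestamp"),
        s1.sort_values("timestamp"),
        on="timestamp",
        by="grid_cell",
        suffixes=("_s2", "_s1"),
        direction="nearest",
        tolerance=pd.Timedelta(days=days),
    )
    # Unmatched S2 rows keep NaN in the S1 columns.
    return matched["parquet_url_s1"].notna().mean()


rate_4d = match_rate(s2, s1, 4)
rate_10d = match_rate(s2, s1, 10)
```

In this toy case cell D has no S1 sample at all, so it can never match regardless of the tolerance.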

Any clarification would be much appreciated!

mikonvergence commented 2 months ago

Hi @danielz02, thanks!

This should be correct. The average time shift between Core-S2 and Core-S1 is actually about 7 days.
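
For anyone who wants to check a figure like this on their own copy of the metadata, a rough sketch of one way to measure the time shift follows. It pairs each S2 sample with the nearest S1 sample in the same grid cell, with no tolerance, and averages the absolute gap. The data is made up, the `timestamp_s1` column is a name introduced for illustration, and this is not necessarily how the 7-day figure was obtained.

```python
import pandas as pd

# Hypothetical toy metadata, not real Major TOM records.
s2 = pd.DataFrame({
    "grid_cell": ["A", "B", "C", "D"],
    "timestamp": pd.to_datetime(["2021-06-01", "2021-06-20", "2021-07-01", "2021-07-15"]),
})
s1 = pd.DataFrame({
    "grid_cell": ["A", "B", "C"],
    "timestamp": pd.to_datetime(["2021-06-03", "2021-06-10", "2021-07-04"]),
})

# merge_asof keeps only the left "timestamp", so copy the S1 one into its own column first.
s1 = s1.assign(timestamp_s1=s1["timestamp"])

nearest = pd.merge_asof(
    s2.sort_values("timestamp"),
    s1.sort_values("timestamp"),
    on="timestamp",
    by="grid_cell",
    direction="nearest",
)

# Absolute gap per S2 sample; NaT where the grid cell has no S1 sample.
shift = (nearest["timestamp_s1"] - nearest["timestamp"]).abs()

# Mean gap in days, skipping cells without any S1 sample.
mean_shift_days = shift.mean() / pd.Timedelta(days=1)
```

Note that the mean only covers cells that have an S1 sample at all; cells without one are skipped rather than counted as a large shift.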

danielz02 commented 2 months ago

Thanks for your prompt response! How was the 75% figure calculated? With an offset of 7 days, I can match 60% of the S2 data.
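
To see how the match rate depends on the offset, one option is to compute the gap to the nearest S1 sample once and then sweep the threshold, instead of re-running the merge for each tolerance. A small sketch on made-up gaps (the values below are hypothetical, not Major TOM statistics):

```python
import pandas as pd

# Hypothetical gap in days between each S2 sample and its nearest S1 sample
# in the same grid cell; NaN means the cell has no S1 sample at all.
gap_days = pd.Series([1.0, 3.0, 5.0, 6.5, 9.0, 12.0, 20.0, None, None, None])

tolerances = [4, 7, 14, 30]

# NaN <= t evaluates to False, so cells without S1 data count as unmatched.
rates = {t: float((gap_days <= t).mean()) for t in tolerances}

for t, r in rates.items():
    print(f"tolerance {t:>2} days: {r:.0%} of S2 samples matched")
```

Samples with no S1 counterpart at all put a ceiling on this curve, so the rate can plateau below 100% however large the tolerance gets.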