opera-adt / DSWX-SAR

Dynamic Surface Water Extent from Synthetic Aperture Radar
Apache License 2.0

Documentation about Precise DSWx-S1 Inputs #63

Open cmarshak opened 5 months ago

cmarshak commented 5 months ago

From my conversations with @oberonia78: the DSWx-S1 product is chopped up into MGRS tiles after it has been generated. To match the production version, exactly the bursts overlapping a given MGRS tile must be used (no more, no fewer), as the choice of bursts affects the distribution of backscatter that is analyzed.

The relevant repository required for this accounting is here: https://github.com/opera-adt/mgrs_tiles_database

It would be nice to have documentation about how to generate production-grade DSWx-S1 products locally. Specifically, how to generate a run-config for a given MGRS tile that will mirror what is generated on the OPERA cloud system.

🙏

cmarshak commented 5 months ago

To clarify:

@oberonia78 warned me of an important difference: there is the official “OPERA” product, and there is everything else you might get out of the DSWx-S1 software. This is because the final output of DSWx-S1 depends crucially on the input you provide (at some point, a bimodal distribution is fit and a threshold is extracted). A DSWx-S1 product generated from a single burst will differ from one generated from many: the inputs will have different distributions, and hence different thresholds will be extracted. The thresholds are likely not too different, but this could be problematic if a burst is coastal and the MGRS tile contains much more land. It would be nice to be able to produce “official” OPERA products, both historical and on-demand, on our local machines, and to have instructions for doing so. This requires instructions for calling the software precisely. Maybe there could be a workflow that simply takes a date/time and an MGRS tile and outputs the product. That would require the software to localize the RTC data. At a minimum, there should be clear instructions for generating the official products.
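
To illustrate why the input footprint matters, here is a minimal sketch that uses Otsu's method as a stand-in for the thresholding step. This is not the DSWx-S1 algorithm. The class means (-20 dB for water, -10 dB for land), the 2 dB spread, and the water fractions are made-up values chosen only for illustration.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def mixture_histogram(centers, water_fraction):
    """Idealized backscatter histogram: water mode plus land mode (assumed values)."""
    return [
        water_fraction * gaussian_pdf(x, -20.0, 2.0)
        + (1.0 - water_fraction) * gaussian_pdf(x, -10.0, 2.0)
        for x in centers
    ]


def otsu_threshold(centers, weights):
    """Return the split that maximizes between-class variance (Otsu's method)."""
    total = sum(weights)
    total_sum = sum(w * x for w, x in zip(weights, centers))
    best_score, best_threshold = -1.0, centers[0]
    w0 = 0.0  # accumulated weight of the lower class
    s0 = 0.0  # accumulated weighted sum of the lower class
    for i in range(len(centers) - 1):
        w0 += weights[i]
        s0 += weights[i] * centers[i]
        w1 = total - w0
        if w0 <= 0.0 or w1 <= 0.0:
            continue
        mu0 = s0 / w0
        mu1 = (total_sum - s0) / w1
        score = w0 * w1 * (mu0 - mu1) ** 2
        if score > best_score:
            best_score = score
            best_threshold = 0.5 * (centers[i] + centers[i + 1])
    return best_threshold


# Backscatter grid in dB, from -30 to 0 in 0.1 dB steps.
centers = [-30.0 + 0.1 * i for i in range(301)]

# Hypothetical coastal burst: half water, half land.
t_burst = otsu_threshold(centers, mixture_histogram(centers, 0.50))
# Hypothetical larger footprint: mostly land, little water.
t_tile = otsu_threshold(centers, mixture_histogram(centers, 0.05))

print(f"coastal burst threshold: {t_burst:.2f} dB")
print(f"land-heavy threshold:    {t_tile:.2f} dB")
```

Even though both inputs share the same water and land modes, the extracted thresholds differ because the class proportions differ, with the land-heavy threshold shifted toward the land mode. The same effect is why the set of input bursts has to match production.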

As part of this, the auxiliary dataset that joins bursts and MGRS tiles is a crucial piece of the equation. Jungkyo said that using the database generated by this software rather than the default MGRS PyPI library will produce different outputs. I don't know precisely why this is the case.

The distinction of creating the official product is what motivated this comment as @oberonia78 said the products we (some members of PST) were generating with this notebook might be slightly different than the official ones: https://github.com/OPERA-Cal-Val/dswx-s1-workflow-pst. Hope this explanation helps.

cmarshak commented 3 months ago

I wanted to update my issue ticket, as I misunderstood an important point. The inputs for an official DSWx-S1 product are not the bursts over a single MGRS tile, but rather the bursts over a collection of MGRS tiles grouped as a "frame". This accounting is done in the database repository linked above. There is also the issue of adequate spatial coverage across this frame for a DSWx-S1 product to be triggered. I look forward to understanding what both of these entail, and more.
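
The accounting described here can be sketched roughly as below. Everything in the sketch is hypothetical: the record layout, the identifiers (`SET_001`, `TILE_A`, `BURST_1`), and the helper names are placeholders, not the actual schema of the mgrs_tiles_database repository. The coverage criterion that triggers a product is not described in this thread, so the sketch only reports what fraction of the expected bursts is available.

```python
# Hypothetical tile-set records; the real database layout may differ.
TILE_SETS = {
    "SET_001": {
        "mgrs_tiles": ["TILE_A", "TILE_B"],
        "burst_ids": ["BURST_1", "BURST_2", "BURST_3", "BURST_4"],
    },
    "SET_002": {
        "mgrs_tiles": ["TILE_B", "TILE_C"],
        "burst_ids": ["BURST_4", "BURST_5"],
    },
}


def tile_sets_for_tile(tile_id, tile_sets):
    """Return the names of all tile sets (frames) that contain the given tile."""
    return sorted(
        name for name, record in tile_sets.items()
        if tile_id in record["mgrs_tiles"]
    )


def check_inputs(set_id, available_bursts, tile_sets):
    """Compare the bursts on hand with the bursts the tile set expects."""
    expected = set(tile_sets[set_id]["burst_ids"])
    available = set(available_bursts)
    usable = expected & available
    return {
        "use": sorted(usable),                      # bursts to feed to the run
        "missing": sorted(expected - available),    # expected but not on hand
        "ignored": sorted(available - expected),    # on hand but not part of the set
        "fraction_available": len(usable) / len(expected),
    }


report = check_inputs("SET_001", ["BURST_1", "BURST_2", "BURST_9"], TILE_SETS)
print(tile_sets_for_tile("TILE_B", TILE_SETS))
print(report)
```

Keeping `missing` and `ignored` separate makes both failure modes visible: too few bursts changes the backscatter distribution, and so does including bursts that are not part of the set.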

Thank you @niarenaw and Luca for your explanation and patience!