opera-adt / COMPASS

COregistered Multi-temPorAl Sar Slc
Apache License 2.0
38 stars 16 forks source link

[New Feature]: Clarification on running workflow locally #216

Open scottyhq opened 5 months ago

scottyhq commented 5 months ago

Checked for duplicates

Yes - I've already checked

Alternatives considered

Yes - and alternatives don't suffice

Related problems

Thanks for making this code public! We're interested in creating CLSCs at different global locations back to 2014.

As a test I tried to recreate this file from ASF locally: OPERA_L2_CSLC-S1_T013-026571-IW1_20231007T142223Z_20231009T193106Z_S1A_VV_v1.0.h5

Each HDF contains its own runconfig, but the paths to ancillary data are hardcoded and it's not clear how to get those files in place :

runconfig:
  groups:
    dynamic_ancillary_file_group:
      dem_description: Digital Elevation Model (DEM) for the NASA OPERA project version
        1.1 (v1.1) based on the Copernicus DEM 30-m and Copernicus 90-m referenced
        to the WGS84 ellipsoid
      dem_file: /home/compass_user/input_dir/dem.vrt
      tec_file: /home/compass_user/input_dir/JPL0OPSRAP_20232800000_01D_02H_GIM.INX
      weather_model_file: null
    input_file_group:
      burst_id: null
      orbit_file_path:
      - /home/compass_user/input_dir/S1A_OPER_AUX_RESORB_OPOD_20231007T154954_V20231007T120659_20231007T152429.EOF
      safe_file_path:
      - /home/compass_user/input_dir/S1A_IW_SLC__1SDV_20231007T142221_20231007T142248_050660_061A8B_76E7.zip
    output:
      chunk_size:
      - 128
      - 128
      compression_enabled: true
      compression_level: 4
      cslc_data_type: complex64_zero_mantissa
      shuffle: true
    pge_name_group:
      pge_name: CSLC_S1_PGE
    primary_executable:
      product_type: CSLC_S1
    processing:
      correction_luts:
        azimuth_spacing: 0.028
        enabled: true
        range_spacing: 120
        troposphere:
          delay_type: wet_dry
      geo2rdr:
        lines_per_block: 1000
        numiter: 25
        threshold: 1.0e-08
      geocoding:
        flatten: true
        x_posting: 5
        x_snap: null
        y_posting: 10
        y_snap: null
      polarization: co-pol
      rdr2geo:
        compute_ground_to_sat_east: true
        compute_ground_to_sat_north: true
        compute_height: false
        compute_latitude: false
        compute_layover_shadow_mask: true
        compute_local_incidence_angle: true
        compute_longitude: false
        extraiter: 10
        lines_per_block: 1000
        numiter: 25
        threshold: 1.0e-08
    product_path_group:
      product_path: /home/compass_user/output_dir
      product_specification_version: 0.1.7
      product_version: '1.0'
      sas_output_file: /home/compass_user/output_dir
      scratch_path: /home/compass_user/scratch
    quality_assurance:
      browse_image:
        complex_to_real: amplitude
        enabled: true
        equalize: false
        gamma: 0.5
        percent_high: 95
        percent_low: 0
      output_to_json: false
      perform_qa: true
    static_ancillary_file_group:
      burst_database_file: /home/compass_user/input_dir/opera_burst_database.sqlite3
    worker:
      gpu_enabled: false
      gpu_id: 0
      internet_access: false
  name: cslc_s1_workflow_default

Describe the feature request

Add documentation on full process of creating a CSLC, in particular how to get ancillary files:

      tec_file: /home/compass_user/input_dir/JPL0OPSRAP_20232800000_01D_02H_GIM.INX
...
      burst_database_file: /home/compass_user/input_dir/opera_burst_database.sqlite3
cmarshak commented 5 months ago

@scottyhq - I don't know anything about this repository... I heard about this issue ticket via @alhandwerger.

I have created workflows for other OPERA packages to generate the products locally. They are not pretty or elegant. Most are WIP.

https://github.com/OPERA-Cal-Val/dswx-hls-pst-workflow/blob/dev/0-Run-DSWX-HLS.ipynb https://github.com/OPERA-Cal-Val/dswx-s1-workflow-pst/blob/dev/0_DSWx-SAR-Workflow.ipynb

Basically, I have created software to pull data from various aws public buckets and stitch the tiles together including:

https://github.com/ACCESS-Cloud-Based-InSAR/dem-stitcher https://github.com/OPERA-Cal-Val/tile-mate

I don't know about the TEC data.

I have never created the official burst database... Know this is the relevant software: https://github.com/opera-adt/burst_db

Btw - I use your gists like once a month or more. Last week - I used this to help write a proposal. We still don't have historical OPERA RTC data either :(

scottyhq commented 5 months ago

Thanks for the pointers @cmarshak! With @scottstanie 's updates to the opera-db repository I'm now able to run s1_cslc.py and generate a CLSC, but I'm still unable to match the standard product through ASF. I suspect there may be a difference in either the DEM i'm using, or some software version difference (I've at least matched isce3=0.15.0 and compass=0.5.4)?

For opera_burst_database.sqlite3 I've used opera-burst-bbox-only.sqlite3 which has the following structure with UTM coordinates:

FID Column = rowid
burst_id_jpl: String (0.0)
epsg: Integer (0.0)
xmin: Real (0.0)
ymin: Real (0.0)
xmax: Real (0.0)
ymax: Real (0.0)

You can get the needed TEC file for the above example from NASA CDDIS wget --auth-no-challenge https://cddis.nasa.gov/archive/gnss/products/ionex//2023/280/JPL0OPSRAP_20232800000_01D_02H_GIM.INX.gz

Finally, for the dem, I've just used sardem --bbox -123.0 45.0 -120 48.0 --data COP

With all the auxiliary files now specified, the workflow completes and the file structure is as expected but the array values do not quite match the standard product. linear amplitude difference plotted below along with the full log

Screenshot 2024-01-29 at 8 20 36 AM

I did notice the log reports ERROR 6: IH5:::ID=360287970189640723: Dataset does not support the SetSpatialRef() method. a number of times. I think that is coming from ISCE3 but not sure...

full log ``` journal: Starting s1_geocode_slc burst journal: Starting geocoding of t013_026571_iw1 for 20231007 journal: -- DEM EPSG: 4326 -- Output EPSG: 4326 journal: -- Processing block: 1 -- - line start: 0 -- - line end : 110 -- - dopplers near mid far: 0 0 0 journal: -- Actual DEM bounds used: -- Top Left: -122.842 47.1037 -- Bottom Right: -121.158 46.4476 -- Spacing: 0.000277778 -0.000277778 -- Dimensions: 6062 2363 Topo progress (block 1/1): 100% journal: Total convergence: 31715 out of 45870 PYSOLID: ---------------------------------------- PYSOLID: datetime: 2023-10-07T14:22:23.125602 PYSOLID: SNWE: (47.0975958781724, 46.522595878172396, -122.6425044092228, -120.3425044092228) SOLID : calculate solid Earth tides in east/north/up direction SOLID : shape: (25, 100), step size: 0.0230 by 0.0230 deg Chi squared: 0.000360 journal: -- Actual DEM bounds used: -- Top Left: -122.604 47.0068 -- Bottom Right: -121.267 46.5432 -- Spacing: 0.000277778 -0.000277778 -- Dimensions: 4815 1670 /home/jovyan/.local/envs/compass_v1/lib/python3.11/site-packages/compass/utils/browse_image.py:165: RuntimeWarning: invalid value encountered in cast image = np.uint8(image * (254)) + 1 journal: -- start X: 531420.000000 -- end X: 630720.000000 -- start Y: 5204460.000000 -- end Y: 5157360.000000 -- spacing X: 100.000000 -- spacing Y: -50.000000 -- width: 993 -- length: 942 -- epsg: 32610 journal: -- geo2rdr threshold: 1e-08 -- geo2rdr numiter: 25 -- baseband azimuth spectrum (0:false, 1:true): 0 -- flatten phase (0:false, 1:true): 0 -- remove phase screen (0: false, 1: true): 0 -- apply azimuth offset (0: false, 1: true): 0 -- apply range offset (0: false, 1: true): 0 -- nbands: 1 -- flag_apply_rtc (0:false, 1:true): 0 -- geocode memory mode: auto -- min. block size: 32MB -- max. block size: 256MB journal: -- array length: 942 -- array width: 993 -- number of block(s): 1 -- block length: 942 -- block width: 993 -- block size: 4MB journal: -- number of blocks: 1 -- block length: 942 -- -- starting geocoding journal: block: 0 journal: -- Actual DEM bounds used: -- Top Left: -122.604 47.0068 -- Bottom Right: -121.267 46.5437 -- Spacing: 0.000277778 -0.000277778 -- Dimensions: 4813 1668 ERROR 6: IH5:::ID=360287970189640696: Dataset does not support the SetSpatialRef() method. journal: elapsed time (GEO-IN) [s]: 1.842 ERROR 6: IH5:::ID=360287970189640696: Dataset does not support the SetSpatialRef() method. journal: -- start X: 531420.000000 -- end X: 630720.000000 -- start Y: 5204460.000000 -- end Y: 5157360.000000 -- spacing X: 100.000000 -- spacing Y: -50.000000 -- width: 993 -- length: 942 -- epsg: 32610 journal: -- geo2rdr threshold: 1e-08 -- geo2rdr numiter: 25 -- baseband azimuth spectrum (0:false, 1:true): 0 -- flatten phase (0:false, 1:true): 0 -- remove phase screen (0: false, 1: true): 0 -- apply azimuth offset (0: false, 1: true): 0 -- apply range offset (0: false, 1: true): 0 -- nbands: 1 -- flag_apply_rtc (0:false, 1:true): 0 -- geocode memory mode: auto -- min. block size: 32MB -- max. block size: 256MB journal: -- array length: 942 -- array width: 993 -- number of block(s): 1 -- block length: 942 -- block width: 993 -- block size: 4MB journal: -- number of blocks: 1 -- block length: 942 -- -- starting geocoding journal: block: 0 journal: -- Actual DEM bounds used: -- Top Left: -122.604 47.0068 -- Bottom Right: -121.267 46.5437 -- Spacing: 0.000277778 -0.000277778 -- Dimensions: 4813 1668 ERROR 6: IH5:::ID=360287970189640705: Dataset does not support the SetSpatialRef() method. journal: elapsed time (GEO-IN) [s]: 1.659 ERROR 6: IH5:::ID=360287970189640705: Dataset does not support the SetSpatialRef() method. journal: -- start X: 531420.000000 -- end X: 630720.000000 -- start Y: 5204460.000000 -- end Y: 5157360.000000 -- spacing X: 100.000000 -- spacing Y: -50.000000 -- width: 993 -- length: 942 -- epsg: 32610 journal: -- geo2rdr threshold: 1e-08 -- geo2rdr numiter: 25 -- baseband azimuth spectrum (0:false, 1:true): 0 -- flatten phase (0:false, 1:true): 0 -- remove phase screen (0: false, 1: true): 0 -- apply azimuth offset (0: false, 1: true): 0 -- apply range offset (0: false, 1: true): 0 -- nbands: 1 -- flag_apply_rtc (0:false, 1:true): 0 -- geocode memory mode: auto -- min. block size: 32MB -- max. block size: 256MB journal: -- array length: 942 -- array width: 993 -- number of block(s): 1 -- block length: 942 -- block width: 993 -- block size: 4MB journal: -- number of blocks: 1 -- block length: 942 -- -- starting geocoding journal: block: 0 journal: -- Actual DEM bounds used: -- Top Left: -122.604 47.0068 -- Bottom Right: -121.267 46.5437 -- Spacing: 0.000277778 -0.000277778 -- Dimensions: 4813 1668 ERROR 6: IH5:::ID=360287970189640714: Dataset does not support the SetSpatialRef() method. journal: elapsed time (GEO-IN) [s]: 1.683 ERROR 6: IH5:::ID=360287970189640714: Dataset does not support the SetSpatialRef() method. journal: -- start X: 531420.000000 -- end X: 630720.000000 -- start Y: 5204460.000000 -- end Y: 5157360.000000 -- spacing X: 100.000000 -- spacing Y: -50.000000 -- width: 993 -- length: 942 -- epsg: 32610 journal: -- geo2rdr threshold: 1e-08 -- geo2rdr numiter: 25 -- baseband azimuth spectrum (0:false, 1:true): 0 -- flatten phase (0:false, 1:true): 0 -- remove phase screen (0: false, 1: true): 0 -- apply azimuth offset (0: false, 1: true): 0 -- apply range offset (0: false, 1: true): 0 -- nbands: 1 -- flag_apply_rtc (0:false, 1:true): 0 -- geocode memory mode: auto -- min. block size: 32MB -- max. block size: 256MB journal: -- array length: 942 -- array width: 993 -- number of block(s): 1 -- block length: 942 -- block width: 993 -- block size: 4MB journal: -- number of blocks: 1 -- block length: 942 -- -- starting geocoding journal: block: 0 journal: -- Actual DEM bounds used: -- Top Left: -122.604 47.0068 -- Bottom Right: -121.267 46.5437 -- Spacing: 0.000277778 -0.000277778 -- Dimensions: 4813 1668 ERROR 6: IH5:::ID=360287970189640723: Dataset does not support the SetSpatialRef() method. journal: elapsed time (GEO-IN) [s]: 1.684 ERROR 6: IH5:::ID=360287970189640723: Dataset does not support the SetSpatialRef() method. journal: s1_geocode_slc corrections computation time 0:00:00 (hr:min:sec) journal: s1_geocode_slc geocode prep time 0:00:01 (hr:min:sec) journal: s1_geocode_slc geocoding time 0:03:19 (hr:min:sec) journal: s1_geocode_slc QA meta processing time 0:00:22 (hr:min:sec) journal: s1_geocode_slc burst successfully ran in 0:03:44 (hr:min:sec) ```
cmarshak commented 5 months ago

Hey Scott,

Two Scott's are better than one. Especially two highly capable ones!

In regards to the error, my best guess is gdal incompatibility with the minor version 3.7--> 3.8 (though this is highly speculative; I have seen this issue in both DSWx-HLS and DSWx-S1 software). I saw bizarre errors documented in Proteus here. @gshiroma figured out that it had to with how the python gdal modified some of the write API (tiff driver in particular). The quick fix in the linked case was putting a cap on the environment.

Happy adventures!

scottstanie commented 5 months ago

You're right that the DEM could be different:

As far as the Dataset does not support the SetSpatialRef(), that's an annoyance that is on the internal github's issue list to remove, but is not indicative of a real workflow error.

Just checking- you have enabled: true for the "corrections" section of the YAML? Not sure if it would be a different of applying one of the corrections like "azimuth FM rate", which looks DEM-ish visually in the correction it applies

scottyhq commented 5 months ago

Thanks both for the follow up! It seems like a mismatch of DEM is the primary suspect for different outputs currently, so it would be great to get clarity on the DEM used for standard products from whoever is familiar with it.

I know that opera was using the "NISAR DEM", which was a version of the Copernicus DEM updated for use in NISAR processing. I'm not actually sure if/how different it is

Interesting, I've not yet heard of NISAR DEM... It would be great to document how it was updated, or even better just make the /home/compass_user/input_dir/dem.vrt public? If it's pointing at public COP30 data anyways, I can't think of why it shouldn't be?

If it's quick to check, I'd be curious if adding --output-format GTiff --output-type float32 would do. I don't think it would change the prominent mountain-ish features though

Just very small differences in the output above, as expected

Just checking- you have enabled: true for the "corrections" section of the YAML?

Yes, I just copied the config file from the ['metadata']['processing_information']['runconfig'] (see first comment in issue) which has the following:

    processing:
      correction_luts:
        azimuth_spacing: 0.028
        enabled: true
dbekaert commented 5 months ago

Hi @scottyhq, OPERA and ASFDAAC are in conversation to make the ancillaries available where possible (depends on licensing), or provide instructions where to find them, or instructions to re-create them. We will need to walk through the process and keep you posted once it's completed.