ASFHyP3 / hyp3

A processing environment for HyP3 Plugins in AWS.
BSD 3-Clause "New" or "Revised" License

Option to include tropospheric correction data in INSAR_ISCE output #1416

Open asjohnston-asf opened 1 year ago

asjohnston-asf commented 1 year ago

https://github.com/dbekaert/RAiDER contains tools to calculate tropospheric corrections for radar. A calcDelaysGUNW utility was recently added to write additional tropospheric data layers into an existing GUNW netcdf product:

raider.py ++process calcDelaysGUNW --model GMAO S1-GUNW-D-R-013-tops-20221223_20221129-142432-00127W_00038N-PP-8387-v2_0_6.nc

Possible values for the --model parameter are documented at https://github.com/dbekaert/RAiDER/blob/dev/docs/WeatherModels.md

This feature request is to add a new tropospheric_model parameter to INSAR_ISCE jobs with possible values of none (the default), HRRR, and GMAO:

{
  "job_type": "INSAR_ISCE",
  "job_parameters": {
    "granules": ["S1B_IW_SLC__1SDV_20210723T014947_20210723T015014_027915_0354B4_B3A9"],
    "secondary_granules": ["S1B_IW_SLC__1SDV_20210711T014947_20210711T015013_027740_034F80_D404"],
    "tropospheric_model": "HRRR"
  }
}

A job requested without specifying a model, or with a model of none, should produce the same output as currently provided. If a model is specified, the output netcdf product should be the output from running calcDelaysGUNW using that model, including the two new tropospheric data layers:

/science/grids/corrections/external/troposphere/troposphereWet
/science/grids/corrections/external/troposphere/troposphereHydrostatic
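A quick way to confirm that a product carries the two layers listed above is to test for their paths. This is only a sketch: the helper name is hypothetical, and it assumes the product is passed in as a container that supports `path in product`. An open `h5py.File` should behave that way for a netCDF4 (HDF5-based) file, but that has not been verified against a real GUNW product here.

```python
# Layer paths copied from the issue text above.
TROPOSPHERE_LAYERS = (
    "/science/grids/corrections/external/troposphere/troposphereWet",
    "/science/grids/corrections/external/troposphere/troposphereHydrostatic",
)


def missing_troposphere_layers(product):
    """Return the expected troposphere layer paths that `product` lacks.

    `product` is any container supporting `path in product` (hypothetical
    interface; an open h5py.File is one candidate).
    """
    return [path for path in TROPOSPHERE_LAYERS if path not in product]
```

An empty list would mean both layers are present; a job run without a model would be expected to return both paths.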

asjohnston-asf commented 1 year ago

@cmarshak @jhkennedy does this match your understanding of adding the tropospheric correction data to GUNW products generated with HyP3?

I'm open to other ways to specify "don't include tropospheric correction", but this seems like the straightforward approach.

cmarshak commented 1 year ago

This is great.

asjohnston-asf commented 1 year ago

Are HRRR and GMAO the right model options to expose? Any others you'd like to see available?

Notably, HRRR is only available over the continental US, and only since July 2016. I imagine we'll ultimately want HyP3 to reject HRRR jobs that lack HRRR coverage as soon as a user tries to submit them? For an initial implementation, we'll probably skip that check and let the job fail when it reaches the calcDelaysGUNW step.
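The temporal half of such a check could be done from the granule names alone. The sketch below is an illustration, not HyP3's validation code: the function names are hypothetical, and the exact cutoff day is an assumption, since the thread only says "since July 2016".

```python
from datetime import datetime, timezone

# Assumption: placeholder for "since July 2016"; the exact first day is not given.
HRRR_START = datetime(2016, 7, 1, tzinfo=timezone.utc)


def granule_start_time(granule_name):
    """Parse the acquisition start time out of a Sentinel-1 granule name."""
    # Names are underscore-separated; the first YYYYMMDDTHHMMSS field is the start time.
    for field in granule_name.split("_"):
        if len(field) == 15 and field[8] == "T":
            parsed = datetime.strptime(field, "%Y%m%dT%H%M%S")
            return parsed.replace(tzinfo=timezone.utc)
    raise ValueError(f"no timestamp found in granule name: {granule_name}")


def hrrr_temporally_available(granules):
    """True if every granule was acquired on or after the assumed HRRR start."""
    return all(granule_start_time(name) >= HRRR_START for name in granules)
```

Both the reference and secondary granules would need to pass, since a delay is computed for each acquisition date.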

cmarshak commented 1 year ago

This is a @dbekaert question.

I believe we also want HRES.

A nice to have would be some type of initial validation since the first step is so long running (generating a GUNW).

asjohnston-asf commented 1 year ago

@cmarshak We're close to releasing the first iteration of a new weather_model option for INSAR_ISCE jobs. I expect this will be sufficient for your team to iterate on RAiDER development, letting you run on any interferograms of your choosing.

We picked 200 GUNW interferograms from the ASF archive at random and attempted to regenerate them with each weather model. Here are the results:

cmarshak commented 1 year ago

GMAO works for recent scenes, but fails for older scenes with errors such as FileNotFoundError: [Errno 2] No such file or directory: b'/home/raider/weather_files/GMAO_2017_02_20_T00_42_52.nc'. We're seeing scenes from 2016-17 generally fail, scenes from 2019-present generally succeed, and scenes from 2018 are about 50/50. Is there a known availability issue for older GEOS FP data? We could consider adding an up-front check that the input scenes have coverage.

@jlmaurer @bbuzz31 thoughts on this?

If there are temporal coverage issues, we should add a check as that's easy (no GIS).

HRRR works for scenes inside the US, and fails with ValueError: not enough values to unpack for scenes outside the US. We could consider adding up-front validation that the input scenes have coverage.

Yes - this is a bit annoying(?) for hyp3 in terms of validation. But since RAiDER doesn't run until the ISCE2 step is complete, this check would be very valuable to make sure we are not waiting around for HRRR to fail. That said, it seems like a nice to have, unless it's relatively easy for you all to implement.
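For the spatial side, the simplest up-front test is a bounding-box containment check. This is a sketch under loud assumptions: the CONUS box below is a rough illustrative rectangle, not the actual HRRR domain, and how HyP3 would obtain a scene's footprint bounds is not shown.

```python
# Assumption: rough lon/lat rectangle around the continental US, for illustration only.
# Order is (west, south, east, north) in degrees.
CONUS_BOUNDS = (-125.0, 24.0, -66.5, 50.0)


def bbox_within(inner, outer):
    """True if the (west, south, east, north) box `inner` lies entirely inside `outer`."""
    west, south, east, north = inner
    outer_west, outer_south, outer_east, outer_north = outer
    return (
        outer_west <= west
        and east <= outer_east
        and outer_south <= south
        and north <= outer_north
    )
```

A scene that only partly overlaps the box fails this test; whether partial coverage should be accepted is a design decision the thread leaves open.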

HRES always fails with User 'anonymous' has no access to services/mars. Would you prefer we remove this as an option until the underlying access issues are resolved?

@dbekaert wants to have this working - not sure if this is a RAiDER issue, a permission issue, or a hyp3 issue. Please clarify.

jacquelynsmale commented 1 year ago

HRES always fails with User 'anonymous' has no access to services/mars. Would you prefer we remove this as an option until the underlying access issues are resolved?

@dbekaert wants to have this working - not sure if this is a RAiDER issue, a permission issue, or a hyp3 issue. Please clarify.

@cmarshak HRES requires a special license agreement, and without it we do not have access to services/mars, so we currently cannot access the dataset. I guess this is somewhat a hyp3 issue, since we don't have the proper permissions.

jhkennedy commented 1 year ago

@dbekaert is working on getting the license, and I believe we won't be able to redistribute products for the first 24h after HRES data becomes available.

jacquelynsmale commented 1 year ago

@cmarshak We've released the first iteration of a new weather_model option for INSAR_ISCE jobs with hyp3 v2.25.0. Tropospheric corrections can now be included in GUNW netcdf products by setting the weather_model parameter, which determines the model used for delay estimation. This parameter accepts values of None, HRRR, HRES, or GMAO, with each value corresponding to its respective weather model. These jobs can be submitted as:

import hyp3_sdk

# Connect to the HyP3 deployment at this URL
hyp3 = hyp3_sdk.HyP3('https://hyp3-a19-jpl.asf.alaska.edu')

# One prepared job: an INSAR_ISCE pair with HRRR tropospheric corrections
jobs = [
    {
        'job_type': 'INSAR_ISCE',
        'job_parameters': {
            'granules': ['S1B_IW_SLC__1SDV_20210723T014947_20210723T015014_027915_0354B4_B3A9'],
            'secondary_granules': ['S1B_IW_SLC__1SDV_20210711T014947_20210711T015013_027740_034F80_D404'],
            'weather_model': 'HRRR',
        },
    },
]
hyp3.submit_prepared_jobs(jobs)

If weather_model=None or weather_model is not specified, no additional tropospheric data layers will be included in the final netcdf product.

Final output netcdf products include layers created by running RAiDER's calcDelaysGUNW using the specified weather model:

/science/grids/corrections/external/troposphere/troposphereWet
/science/grids/corrections/external/troposphere/troposphereHydrostatic

cmarshak commented 1 year ago

This looks fantastic. Unbelievable work!

Will this also work with INSAR_ISCE_TEST jobs? This would be ideal since we have a lot of testing and the production pipeline still triggers delivery to ASF.

jhkennedy commented 1 year ago

@cmarshak yes, this will work for both INSAR_ISCE and INSAR_ISCE_TEST. :-D

cmarshak commented 1 year ago

Out of curiosity - suppose for testing we are trying to submit 2 GUNWs but with different weather models within the same job - they will come out with the same ID presumably.

Not sure about implications for ingestion, but flagging this here temporarily.

@dbekaert @jhkennedy @asjohnston-asf
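One way to keep such test runs apart on the HyP3 side is to submit one prepared job per weather model for the same granule pair and give each a distinct job name. This is a sketch only: the helper and the "wm-" naming scheme are hypothetical, the `name` field is assumed to be accepted on prepared jobs, and naming the jobs does not change the ID of the GUNW product itself.

```python
def prepare_jobs_per_model(reference, secondary, weather_models, job_type="INSAR_ISCE"):
    """Build one prepared job dict per weather model for a single granule pair."""
    jobs = []
    for model in weather_models:
        jobs.append({
            # Hypothetical naming scheme so jobs can be told apart by model.
            "name": f"wm-{model}",
            "job_type": job_type,
            "job_parameters": {
                "granules": [reference],
                "secondary_granules": [secondary],
                "weather_model": model,
            },
        })
    return jobs
```

The returned list has the same shape as the prepared jobs passed to `submit_prepared_jobs` elsewhere in this thread.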

cmarshak commented 1 year ago

@jacquelynsmale - can we also include ERA5? This will be good for inspecting outputs as this is the most common weather model.

On this note - is there a good way to cancel jobs (via hyp3-sdk) in case the jobs spin for 24+ hours due to data latency?

More on this note - is there any idea of what kind of instances / costs per hour are related to the RAiDER step?

jacquelynsmale commented 1 year ago

@cmarshak, we can definitely add that as an option. @forrestfwilliams and I have started adding that as an option for HyP3 and RAiDER as pull requests, and we'll update you when we get those merged. As for canceling jobs, that currently is not an option via the hyp3-sdk. I'm unsure about the instances/costs per hour related to the RAiDER step. @asjohnston-asf or @jhkennedy, any insight there?

jhkennedy commented 1 year ago

@cmarshak we've not done a large-scale test, but for the small number of jobs we've run (HRRR and GMAO), it looks like the RAiDER step increases the overall runtime by 2-4%.

Since we're giving the same resources to DockerizedTopsApp and RAiDER, that directly translates to a 2-4% increase in cost.

We could do a more aggressive cost optimization for the RAiDER step, but with that little increase in cost it's not worth the developer time right now.

cmarshak commented 1 year ago

I do think the raider step would benefit from some instance optimization in the not too distant future since raider requires significantly less compute and disk. Particularly since there are some risks with latency, this could help a lot with cost when we run things en masse. I think a coarse instance optimization would be an easier short-term, cost-saving measure than downloading a weather corpus and making that gel with the existing raider step function, though balancing these prospective paths is something we should discuss at the next hyp3 meeting.

I would be happy to help with this in the coming months.