insarlab / PySolid

A Python wrapper for solid Earth tides
GNU General Public License v3.0

Inconsistency between overlapping frames #54

Open sssangha opened 1 year ago

sssangha commented 1 year ago

When using calc_solid_earth_tides_grid through MintPy for two separate runs involving overlapping frames, I notice unexpected behavior.

Specifically, I would expect the overlapping region between the two to be consistent after referencing epochs in the stack to the same reference date and point, but I instead find a non-zero, sub-mm level mean offset (~0.1 mm, with a range of ~0.5 mm) and stdev (~0.33 mm): [figure: mean offset and standard deviation maps]

Here is the residual within the overlapping region: [figure: residual map]

Please let me know how I can help debug, whether I should clarify anything, and whether it would be best to share the input files I used on the MintPy end. Many thanks.

yunjunz commented 1 year ago

Hmm, I would expect the overlapping region to have the same SET as well. Your 2nd plot shows the residual is a ramp along the range direction, almost centered around zero. The 0.5 mm is really small; I am wondering if the two frames have geometry files from the same date as well?

And similarly, could you try to concatenate the incidence and azimuth angles, and check their overlapping regions too?

sssangha commented 1 year ago

Hello @yunjunz, upon closer inspection it looks like the geometry files are indeed slightly off between the two frames, so I wonder if this may be associated with resampling of the geometry files themselves: [screenshots comparing the geometry files of the two frames]

Regardless of this though, as a rough sanity check I hacked the solid_earth_tide.py script through MintPy to directly inspect the outputs of calc_solid_earth_tides_grid (i.e. the direct E, N, and U outputs). That is, I took this section of the code and overrode it to iteratively pass the E, N, and U components into separate cubes. For this test I hacked the section like so:

        # skip the projection into LOS and store the east component directly
        #ts_set[i,:,:] = (  set_e * unit_vec[0]
        #                 + set_n * unit_vec[1]
        #                 + set_u * unit_vec[2])
        ts_set[i,:,:] = set_e

What I found after doing so was still a shift between the frames (see the E component below). I understand this is not expected behavior, and I am struggling to pinpoint the source of the discrepancy: [screenshot of the E-component comparison]

yunjunz commented 1 year ago

I guess this could be due to the step_size used in the SET calculation for speedup. Could you test the set_e again by setting step_size = 1 here?

If confirmed (i.e. consistent between overlapping frames with a step of 1), it means we should adjust the step/grid selection and resize code.
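
For reference, a minimal sketch of such a test, assuming step_size is accepted as a keyword of calc_solid_earth_tides_grid as in the MintPy call linked above; the date/time and metadata values below are placeholders:

    import datetime as dt
    import pysolid

    # placeholder acquisition time and geo metadata for one frame
    dt_obj = dt.datetime(2023, 3, 22, 1, 30, 0)
    atr = {
        'LENGTH'  : 1033,          # number of rows
        'WIDTH'   : 2526,          # number of columns
        'Y_FIRST' : 34.0,          # UL corner latitude  [deg]
        'X_FIRST' : -118.0,        # UL corner longitude [deg]
        'Y_STEP'  : -0.000833334,  # latitude  pixel size [deg]
        'X_STEP'  :  0.000833334,  # longitude pixel size [deg]
    }

    # force the finest step so no coarse-grid resampling is involved
    tide_e, tide_n, tide_u = pysolid.calc_solid_earth_tides_grid(
        dt_obj, atr, step_size=1, verbose=True)

Running this for both frames and differencing the overlap would isolate the coarse-grid resampling effect from everything else.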

sssangha commented 1 year ago

@yunjunz, your suggested test works! [screenshot of the consistent overlap]

So looks like we would have to adjust the step as you advised. Thanks again for helping to debug!

sssangha commented 1 year ago

Hello @yunjunz, as an amendment to my previous comment: I manually re-referenced the two frames to the same coordinate after the fields were generated. To clarify though, is this necessary or should the outputs be agnostic to a reference point? I see the output SET.h5 cubes themselves had different recorded reference points, but it isn't clear whether or not they were applied to the output array itself.

Because if I don't re-reference the grids manually, I get a constant offset like so, even after setting the step_size to 1: [screenshot of the constant offset]

sssangha commented 1 year ago

Is subtracting 1 from the dimensions like this intentional?

yunjunz commented 1 year ago

Is subtracting 1 from the dimensions like this intentional?

Yes, this was intentional, to ensure the SET generated by the Fortran program has the size of length x width.

yunjunz commented 1 year ago

To clarify though, is this necessary or should the outputs be agnostic to a reference point? I see the output SET.h5 cubes themselves had different recorded reference points, but it isn't clear whether or not they were applied to the output array itself.

Assuming this is still set_e (not the SET projected into the LOS direction), it should not need a reference point, as everything is absolute. In mintpy, the referencing in time/space while correcting for SET is done on the fly.

Because if I don't re-reference the grids manually, I get a constant offset like so, even after setting the step_size to 1:

Hmm... Is the magnitude of this offset equivalent to shifting one or a few pixels?

yunjunz commented 1 year ago

[Thoughts In Progress] until we figure out the offset without spatial referencing.

With these tests you have done here, which are super helpful, I think we should adjust the step as below, e.g. given a shape of 1033 x 2526 with a step size of 20:

  1. coarse grid starting coord should be (10, 10), instead of (0, 0).
  2. number of steps should be ceil(1033 / 20) = 52 and ceil(2526 / 20) = 127, instead of using rint() (see the example after this list)
  3. resize the coarse resolution to the original resolution with a shape of (52 x 20, 127 x 20)
  4. crop the original resolution to the original shape of (1033, 2526)
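
A quick check of point 2, showing why rint() loses coverage at the far edge:

    import numpy as np

    width, step = 2526, 20
    print(np.rint(width / step))  # 126.0 -> 126 * 20 = 2520 < 2526, last 6 pixels uncovered
    print(np.ceil(width / step))  # 127.0 -> 127 * 20 = 2540 >= 2526, full width covered
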
sssangha commented 1 year ago

Hmm... Is the magnitude of this offset equivalent to shifting one or a few pixels?

Apologies, not sure I understand. It's a vertical offset that doesn't seem to translate laterally. And yes, I did manually hack it to still use set_e.

Thanks for the advice here! I'm working to test out your suggestions. Quick point of clarification though: how would you define the starting coordinate using the step_size in this case? Would it be like:

# shift the UL corner coordinate by one coarse step (step_size pixels)
lat0 = float(atr['Y_FIRST'])
lat0 += float(atr['Y_STEP']) * step_size
lon0 = float(atr['X_FIRST'])
lon0 += float(atr['X_STEP']) * step_size

And also, how would this work if the specified step_size = 1? You wouldn't have a coarse grid, but would still shift the starting lat/lon by a pixel, unless you introduce an if/else statement to prevent this?

number of steps should be ceil(1033 / 20) = 52 and ceil(2526 / 20) = 127, instead of using rint()

So I would adjust this, e.g., like so?:

    # number of coarse-grid steps along each dimension
    length = np.ceil(int(atr['LENGTH']) / num_step).astype(int)
    width  = np.ceil(int(atr['WIDTH'])  / num_step).astype(int)

crop the original resolution to the original shape of (1033, 2526)

This would already be captured by this section, right?

sssangha commented 1 year ago

@yunjunz, I just realized something. The fact that we don't use the same datetime info between these frames should play a role here, no? E.g. refer to this section where we feed this info into the function.

The reported center time will be different for adjacent frames.

sssangha commented 1 year ago

Ok, so it looks like I may be onto something with my hunch above. For context, we are working to integrate these SET estimates into our ARIA GUNW products, as we had discussed a few months ago. With that in mind, after fixing the times between adjacent frames, I'm finally able to resolve this apparent discrepancy between frames: [screenshot of the consistent overlap after fixing the times]

So in other words, it looks like the specified timing somehow introduces a vertical offset. It's not obvious to me why this is so.

yunjunz commented 1 year ago

I agree with you, that's the timing difference along the orbit! If it's not accounted for, this timing variation will introduce a small ramp along the azimuth direction for a long track (I described this in section III.E.2 of my 2022 TGRS paper, after Dr. Dennis Milbert brought it up in our email communications); and for multiple-frame stitching, as in your case here, it shows up as a jump in our current implementation!

I have been intending to write a swath mode, in addition to the current grid and point modes, for this scenario. The implementation will be much easier if we have a true API-like interface between the Python and Fortran code (https://github.com/insarlab/PySolid/issues/2), instead of the current communication via text file. Therefore, I have not done it yet.

Note that adding a constant offset, or setting the same UTC time, fixes the "look" (to be continuous), but the ramp along azimuth still exists, which we have been ignoring so far.
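
For a rough sense of scale, here is a sketch using the point mode from the PySolid README; the point coordinates and the ~15 s center-time difference between adjacent frames are assumed values for illustration only:

    import datetime as dt
    import pysolid

    lat, lon = 34.0, -118.0                  # assumed point in the overlap region
    t0 = dt.datetime(2023, 3, 22, 1, 30, 0)  # assumed center time of frame 1
    t1 = t0 + dt.timedelta(minutes=30)

    # SET time series at the point between t0 and t1
    dt_out, tide_e, tide_n, tide_u = pysolid.calc_solid_earth_tides_point(
        lat, lon, t0, t1, verbose=False)

    # approximate rate of change of the vertical SET, then the apparent jump
    # for an assumed ~15 s difference in center time between adjacent frames
    rate_u = (tide_u[-1] - tide_u[0]) / (t1 - t0).total_seconds()   # m/s
    print(f'~{rate_u * 15 * 1e3:.3f} mm vertical offset for a 15 s timing difference')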

yunjunz commented 1 year ago

And also, how would this work if the specified step_size = 1? You wouldn't have a coarse grid, but would still shift the starting lat/lon by a pixel, unless you introduce an if/else statement to prevent this?

Good point. Then we should still use the precise corner coordinates of the corner pixels, i.e. the UL pixel's UL corner, UR pixel's UR corner, LL pixel's LL corner and LR pixel's LR corner. The adjusted procedure should be (a rough sketch follows the list), e.g. given a shape of 1033 x 2526 with a step size of 20:

  1. coarse grid starting coordinates: Y_FIRST and X_FIRST
  2. number of steps should be ceil(1033 / 20) + 1 = 53 and ceil(2526 / 20) + 1 = 128
  3. resize the coarse resolution to the original resolution with a shape of (52 x 20, 127 x 20), not (53 x 20, 128 x 20).
  4. crop the original resolution to the original shape of (1033, 2526)
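
A rough numpy/skimage sketch of steps 2-4 above; the function name, argument names, and the use of skimage.transform.resize are illustrative assumptions, not the existing PySolid code:

    import numpy as np
    from skimage.transform import resize

    def coarse_to_full(tide_coarse, length, width, step):
        """Resize a coarse-grid SET field to full resolution and crop.

        tide_coarse: 2D array on the coarse grid with shape
        (ceil(length/step) + 1, ceil(width/step) + 1), whose outermost nodes
        sit on the corner coordinates of the corner pixels.
        """
        num_lat = int(np.ceil(length / step))  # e.g. ceil(1033 / 20) = 52
        num_lon = int(np.ceil(width  / step))  # e.g. ceil(2526 / 20) = 127

        # step 3: resize to (52 * 20, 127 * 20), not (53 * 20, 128 * 20)
        tide_full = resize(tide_coarse, (num_lat * step, num_lon * step),
                           order=1, mode='edge', anti_aliasing=False)

        # step 4: crop to the original shape, e.g. (1033, 2526)
        return tide_full[:length, :width]

E.g. coarse_to_full(tide_coarse, 1033, 2526, 20) returns a (1033, 2526) array.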


crop the original resolution to the original shape of (1033, 2526). This would already be captured by this section, right?

That does not capture the cropping; at least, it does not properly account for the extra pixels falling into the last partial coarse step.

sssangha commented 1 year ago

I have been intending to write a swath mode, in addition to the current grid and point modes, for this scenario. The implementation will be much easier if we have a true API-like interface between the Python and Fortran code (#2), instead of the current communication via text file. Therefore, I have not done it yet.

To clarify though, how do you envision this swath mode? Would you increment through multiple frames individually? From what I understand, merged products passed either through ISCE or ARIA only report one time corresponding to the center of the scene, so how would you account for this change in timing along track other than making some assumptions or perhaps recording/passing sensing start/stop times?

Note that adding a constant offset, or setting the same UTC time, fixes the "look" (to be continuous), but the ramp along azimuth still exists, which we have been ignoring so far.

So for the aforementioned application involving embedding SET within the GUNW, could you please advise on my current plan of action, specifically on whether or not this sounds good at a high level? We compute SET as downsampled metadata layers for 4 height levels within the GUNW for each individual frame, for which of course different sensing times for the centers of the scenes are captured and passed. This leads to this apparent offset, which I "correct" by sequentially stitching the downsampled layers (before interpolation) in cases where multiple frames are involved (i.e. estimate the integer offset in areas of overlap and incrementally apply it). This is captured in this ARIA-tools PR. Also note that, following our discussions, we are not passing reference and secondary estimates of SET directly, as opposed to inverting the differential fields. Marin and I have made adjustments to prep_aria to reflect this through this branch.
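
For illustration, a rough sketch of that offset-and-stitch step; the array names and the simple mean-difference offset in the overlap are my assumptions, not the actual ARIA-tools implementation:

    import numpy as np

    def stitch_frames(set_ref, set_new):
        """Stitch a new frame onto a reference frame on a common grid.

        Both inputs are 2D arrays on the same grid, with np.nan outside their
        footprints. The constant offset between them is estimated in the
        overlap region and removed from the new frame before merging.
        """
        overlap = ~np.isnan(set_ref) & ~np.isnan(set_new)
        offset = np.mean(set_new[overlap] - set_ref[overlap]) if overlap.any() else 0.0

        merged = np.where(np.isnan(set_ref), set_new - offset, set_ref)
        return merged, offset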

As far as commits to PySolid, I will issue a PR capturing your most recent suggestions regarding adjustments to the number of steps. Perhaps from there we could game-plan how a swath mode would work, depending of course on our bandwidth! What do you think?

Again, thanks for your ongoing help and advice; the team and I really appreciate it!

yunjunz commented 1 year ago

To clarify though, how do you envision this swath mode? Would you increment through multiple frames individually? From what I understand, merged products passed either through ISCE or ARIA only report one time corresponding to the center of the scene, so how would you account for this change in timing along track other than making some assumptions or perhaps recording/passing sensing start/stop times?

The start/stop sensing time information is available from ISCE (I am not sure about this detail in ARIA products), which can be used to generate a pixel-wise 2D matrix for the timing, similar to what we have for incidence angle. We could have a function in pysolid or mintpy to generate this.
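
A minimal sketch of such a timing matrix; the function and variable names are illustrative, and it assumes the sensing time varies linearly along the azimuth (row) direction and is constant along range:

    import datetime as dt
    import numpy as np

    def sensing_time_matrix(t_start, t_stop, length, width):
        """Per-pixel acquisition time as UTC seconds of day."""
        sod = lambda t: (t - t.replace(hour=0, minute=0, second=0, microsecond=0)).total_seconds()
        az_time = np.linspace(sod(t_start), sod(t_stop), num=length)  # one value per row
        return np.tile(az_time.reshape(-1, 1), (1, width))            # constant along range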

On the implementation side, I would imagine we have a Python API from the Fortran code, e.g. solid(dt_array, lat_array, lon_array), where the inputs are arrays of datetime and lat/lon and the outputs are arrays of SET_E/N/U. The current solid_grid/point() would be special cases of this.
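
To pin down that envisioned interface, a stub sketch (this API does not exist yet; the shapes and names are the ones imagined above):

    import numpy as np

    def solid(dt_array, lat_array, lon_array):
        """Envisioned array-based API (not implemented yet, see issue #2).

        Inputs:  per-pixel datetime, latitude and longitude arrays of the same
                 shape, e.g. the 2D timing matrix sketched above plus lat/lon grids.
        Outputs: (set_e, set_n, set_u) arrays of the same shape, in meters.
        """
        raise NotImplementedError('https://github.com/insarlab/PySolid/issues/2')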

yunjunz commented 1 year ago

As far as commits to PySolid, I will issue a PR capturing your most recent suggestions regarding adjustments to the number of steps. Perhaps from there we could game-plan how a swath mode would work, depending of course on our bandwidth! What do you think?

I agree.

Your plan of action looks good to me, except for one detail: the output of pysolid is already interpolated, so I would leverage that, and stitch the interpolated SET in ARIA-tools, instead of the downsampled one.

sssangha commented 1 year ago

Your plan of action looks good to me, except for one detail: the output of pysolid is already interpolated, so I would leverage that, and stitch the interpolated SET in ARIA-tools, instead of the downsampled one.

Great, but to clarify, I meant to stitch the downsampled products intentionally for the sake of efficiency/speed. I.e. the current interpolation procedure takes a few seconds or so depending on the footprint of the image, so it would save a great deal of time to interpolate after stitching. Unless you would recommend otherwise?

The start/stop sensing time information is available from ISCE (I am not sure about this detail in ARIA products), which can be used to generate a pixel-wise 2D matrix for the timing, similar to what we have for incidence angle. We could have a function in pysolid or mintpy to generate this.

Are you referring to this function?

yunjunz commented 1 year ago

Great, but to clarify, I meant to stitch the downsampled products intentionally for the sake of efficiency/speed. I.e. the current interpolation procedure takes a few seconds or so depending on the footprint of the image, so it would save a great deal of time to interpolate after stitching. Unless you would recommend otherwise?

I don't have a strong opinion on this: code simplicity and efficiency are both important. Your call.

Are you referring to this function?

Yes.