Bugfix phase_data unit conversion in correct step

mcgarth commented 3 years ago

This PR addresses the issue of phase_data unit conversion during the correct step. It is important that every correction algorithm receives the phase_data in the correct units (usually millimetres, but radians for the phase closure algorithm). By using the debug level log statements, I found that the conversion was not being consistently applied during the various correction steps and/or saved afterwards.

This PR introduces the following changes:

ensure nan_and_mm_convert is applied whenever an shared.Ifg object is opened.
ensure corrected phase data is written to disk at the end of each correction step (using Ifg.write_modified_phase) and before shared.Ifg object is closed.
update the tiled phase_data that is saved to numpy files after each correction step.

I have also:

made some modification to the correction file plotting script: it now makes use of the shared.Ifg class for reading data and mm conversion.
made some minor changes/improvements to the other plotting scripts
refactored the APS module, based on what @adeane-ga started. Now, the interferogram correction layer is saved to disk and subtracted from interferograms directly (previously, the correction was applied to incremental timeseries and then back-converted to interferograms). This high-level module structure could be copied for the Orbital and DEM Error modules.

adeane-ga commented 3 years ago

I've also just noticed some inconsistent values for phase closure between the documentation and the default config file in branch:

From input_parameters.conf:

https://github.com/GeoscienceAustralia/PyRate/blob/dde59c67bc8eb637012dfa5a80e3a535896c8ec2/input_parameters.conf#L144-L158

From documentation:

richardt94 commented 3 years ago

I have run this on the ARD stack for T147D_F32S_S1A. The output (timeseries linear_rate estimate) before this PR looks like this: And after looks like this:

The masking that is present before the PR (from unwrapping errors picked up by phase closure) is not present after the PR changes - interestingly, the masking is present in the interferograms in temp_mlooked_dir, but not in the timeseries output. Al and I are continuing to investigate the cause of this, and I think we should hold off on merging the PR until we get to the bottom of it.

mcgarth commented 3 years ago

Does calling nan_and_mm_convert always do nothing if the ifg is already in the correct units? (i.e. is metadata on conversion status stored to the geotiff and used by this function)

see the function shared.convert_to_mm; it applies the conversion depending on the DATA_UNITS tiff tag of the ifg. That tiff tag was added during prepifg step.

Are there specific examples besides the orbit fit where the inconsistent conversion will result in incorrect output?

The issue that I found was that the temp_mlooked_dir ifgs were on disk in units of radians. Some corrections were converting to millimetres, but not all - not good when all (except phase_closure) are expecting millimetres. And none were writing the new data after the unit conversion. I think that is the major change in this PR.

It seems that the sequence of calls is now almost always ifg.open() followed by nan_and_mm_convert(ifg, params). Would it make more sense to just include the conversion in the open method, or do the conversion and save it in prepifg and just refer to the converted ifg from then on?

Doing the unit conversion once during prepifg makes sense to me. Then just converting data back to radians once just for phase_closure correction (but remembering to only save after re-converting to millimetres. I think this is a separate ticket for future work.

adeane-ga commented 3 years ago

To supplement @richardt94 's detailed look into the code, I can here summarise how PyRate is performing on the test data set (Mexico) for a broad qualitative assessment.

I am happy now to merge this PR once we discuss the results of what @richardt94 finds in regards increased masking when the corrections were not being applied to Numpy files.

Summary:

In reference to the Mexico data set, we are seeing no visual degradation in PyRate performance and potentially subtle improvement in comparison to GPS time series.
The quantity of masking due to phase closure is generally making sense (no large scale over or under masking).
At a fine scale, there seems to be some under masking resulting from the inclusion of cells that should be removed (not calculated) in the time series step based on ts_pthr.
As discussed in above comment, the magnitude of the DEM error makes more sense now due to the unit conversion. Though work needs to be done to understand if our IFG corrections are working appropriately. This needs to be done with a synthetic DEM dataset to be sure.

Future work for dedicated PRs:

Investigate the linear samples issue and why our results include cells with less than our defined threshold in ts_pthr.
Confirm our DEM error calculation and correction is working appropriately by implementing a synthetic data set approach.

The next two figures compare PyRate validation between an older develop branch before recent changes and the current PR branch. This is the Crop-A data.

TOP: old develop branch from early 2021 BOTTOM: current PR branch

oldDEVbranch

PRbranch_PhsClosOn_PNGs

Next figure we see the same dataset with the Network method used for orbital fitting instead of the independent method. Although we wouldn't normally try and correct for orbit with such a small area and Sentinel-1 data, this still indicates that we should be cautious of determining the best method for particular data sets. Here we can see the Network method overfitting to express a ramp in the resulting displacement map. There is potentially a subtle ramp seen in the above images produced from the Independent method of orbit fitting.

PRbranch

The Next set of figures are full frame Mexico PyRate processing with changes to the closure_thr parameter to inform us on how the Phase Closure Masking is operating. Click on the images to get a gray background to improve visual inspection.

Phase closure settings: closure_thr: 0.1 (TOP), 0.5 (MIDDLE), 1.0 (BOTTOM) ifg_drop_thr: 0.5 min_loops_per_ifg: 2 max_loop_length: 4 max_loop_redundancy: 5

linear_rate_PhsClosThr01

linear_rate_PhsClosThr05

linear_rate_PhsClosThr10

Next figure is the GPS-PyRate comparison for the full frame of Mexico data.

full_frame

Next figure shows the linear samples with range on left. Even with a time series setting to only operate on cells with 6 or more samples (ts_pthr: 6), we see there are still cells included that are below 6 (e.g. red=2 samples).

richardt94 commented 3 years ago

I am now convinced the PR is doing the right thing, after isolating the change in masking pattern down to line 192 in correct.py, where save_numpy_phase is now being called (it was not before). By checking the numpy phase tiles, I was able to confirm that without this call, none of the masking effects from the phase closure step propagate into the tiles. An example is below: PR_figure

The only change that phase closure should make to an ifg is to mask pixels, so the two tiles in the example differ only by the masking pattern.

If the APS is enabled, the lack of masking has downstream effects - the APS uses the tiles as input, so it is receiving the wrong input data and fitting a possibly incorrect atmospheric model. This is then subtracted from the complete ifg (sourced from the tiff file which was updated after phase closure, and so is masked), which is why we were seeing masking in the output before this PR.

Approved to merge once we figure out why the test is failing in the Py3.8 CI environment.

mcgarth commented 3 years ago

The final commit here (0a5363e) is needed for the mpi/serial/multiprocess tests to pass. It is unknown why the changes in this PR cause these tests to fail with Mexico cropA data and requires further investigation. See Issue https://github.com/GeoscienceAustralia/PyRate/issues/355

GeoscienceAustralia / PyRate

Bugfix phase_data unit conversion in correct step #347