mcgarth commented 4 years ago

This PR implements new functionality to calculate the rate for each pixel as the linear regression of the cumulative displacement time series.

The functionality is based on the scipy.stats.linregress function: https://docs.scipy.org/doc/scipy-1.4.1/reference/generated/scipy.stats.linregress.html?highlight=linregress#scipy.stats.linregress

Input: cumulative displacement time series tscuml

Output:

linrate = gradient of best fitting line
intercept = y-intercept of best fitting line at t = 0
rsquared = R^2 goodness-of-fit statistic, in the range [0, 1]
error = standard error of the best fitting line w.r.t the observations
samples = number of observations used to constrain the best fit line
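
For reviewers who want to see how the five outputs map onto `scipy.stats.linregress`, here is a minimal sketch. It is not the code in this PR: the function names `linear_rate_pixel` and `linear_rate_array`, the NaN handling, and the `(rows, cols, epochs)` layout of `tscuml` are assumptions made for illustration. Note that SciPy documents `stderr` as the standard error of the estimated slope.

```python
import numpy as np
from scipy.stats import linregress


def linear_rate_pixel(t, y):
    """Fit y = linrate * t + intercept to one pixel, ignoring NaN epochs.

    Hypothetical helper, not the PyRate implementation.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    valid = ~np.isnan(y)
    samples = int(valid.sum())
    if samples < 2:
        # A line cannot be fitted to fewer than two observations.
        return np.nan, np.nan, np.nan, np.nan, samples
    res = linregress(t[valid], y[valid])
    # rvalue is the correlation coefficient; square it to get R^2.
    return res.slope, res.intercept, res.rvalue ** 2, res.stderr, samples


def linear_rate_array(t, tscuml):
    """Apply the pixel fit to a (rows, cols, epochs) cube (layout assumed)."""
    rows, cols, _ = tscuml.shape
    out = np.full((5, rows, cols), np.nan)
    for i in range(rows):
        for j in range(cols):
            out[:, i, j] = linear_rate_pixel(t, tscuml[i, j, :])
    # Layer order: linrate, intercept, rsquared, error, samples
    return out
```

A pixel with no valid epochs comes back as NaN in the first four layers and 0 in the samples layer, so the samples layer can be used later to identify poorly constrained pixels.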

All five outputs are saved as linear_*.tif files after the merge step and optionally as linear_*.npy files if savenpy = 1. Additionally linrate, rsquared and error are saved as png and kml files.

The Linrate functionality is executed as a non-optional part of the time series algorithm (it is calculated whenever tscal = 1).

This PR implements unit tests at the pixel and array level (added to test_timeseries.py) and regression tests in test_mpi_vs_multiprocess_vs_single_process.py

codecov-commenter commented 4 years ago

Codecov Report

Merging #285 into develop will increase coverage by 0.66%. The diff coverage is 62.85%.

@@             Coverage Diff             @@
##           develop     #285      +/-   ##
===========================================
+ Coverage    84.92%   85.58%   +0.66%     
===========================================
  Files           26       26              
  Lines         3336     3628     +292     
  Branches       516      620     +104     
===========================================
+ Hits          2833     3105     +272     
- Misses         404      417      +13     
- Partials        99      106       +7

| Impacted Files | Coverage Δ | |
|---|---|---|
| pyrate/merge.py | `16.56% <3.70%> (-1.94%)` | :arrow_down: |
| pyrate/core/timeseries.py | `90.73% <100.00%> (+2.38%)` | :arrow_up: |
| pyrate/core/config.py | `91.87% <0.00%> (-1.61%)` | :arrow_down: |
| pyrate/configuration.py | `97.05% <0.00%> (+0.96%)` | :arrow_up: |
| pyrate/core/prepifg_helper.py | `96.33% <0.00%> (+1.15%)` | :arrow_up: |
| pyrate/core/orbital.py | `94.18% <0.00%> (+1.59%)` | :arrow_up: |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update cb73276...f99a038.

mcgarth commented 4 years ago

I reviewed this PR with a new data stack and here are my observations:

The program produced and saved all five expected outputs (linearrate, error, intercept, samples, rsquared) and also saved the optional linear_*.npy files to disk.

My concern is with how the linear rate and stack rate results compare. This needs further discussion so that we understand the relative performance of the two approaches.

At the moment the linear_rate result shows a wider deformation range than stack_rate. Similarly, linear_error shows a larger error range than stack_error. I would expect differences between the two approaches, but differences greater than one order of magnitude in the linear rate and error calculations do not sound reasonable. This may need further investigation of the algorithm's performance.

But I approve the PR, as it generates all the required files as expected and identifies structures similar to those in the stack calculation.

Thanks for the comments @chandra2ga. I think we can do some really in-depth performance evaluation of both stack_rate and linear_rate once we have some GPS data available for comparison

mcgarth commented 4 years ago

I've performed a full run-through of PyRate using the mg/linrate branch and can confirm that everything works as expected. I used Camden RSAT2 data for this test.

I can also confirm that the linear_rate result is noisier than the stack_rate result, as mentioned by Chandra (though the difference is not as large as in Chandra's test case). I also notice that the linear_rate result includes more pixels than the stack_rate result (see images below; colour scales are +/- 273 and +/- 248, respectively). The difference may be related to the masking of pixels (r-value) and the handling of uncertainties of the input data (i.e. the cumulative displacements).

Future improvement: The scipy function 'linregress' is used to calculate the velocity and its uncertainty for each pixel. As I understand it, 'linregress' does not accept uncertainties of the y-data (i.e. the cumulative displacements), let alone temporal correlations expressed through a full varcov matrix. You may want to add the consideration of uncertainties and temporal correlations in the linrate algorithm to the list of tickets.
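
To make the suggestion concrete, one possible approach is a generalised least squares fit, sketched below. This is an illustration only and not part of the PR: the function name `gls_line_fit` is hypothetical, and it assumes a variance-covariance matrix of the cumulative displacements is available for each pixel. The estimator is the standard one, `params = (A^T C^-1 A)^-1 A^T C^-1 y`.

```python
import numpy as np


def gls_line_fit(t, y, cov):
    """Generalised least squares fit of y = rate * t + intercept.

    cov is the (n, n) variance-covariance matrix of y (assumed known).
    Returns the rate, the intercept and the standard error of the rate.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([t, np.ones_like(t)])  # design matrix [t, 1]
    cov_inv_A = np.linalg.solve(cov, A)        # C^-1 A
    cov_inv_y = np.linalg.solve(cov, y)        # C^-1 y
    normal = A.T @ cov_inv_A                   # A^T C^-1 A
    rate, intercept = np.linalg.solve(normal, A.T @ cov_inv_y)
    param_cov = np.linalg.inv(normal)          # covariance of [rate, intercept]
    return rate, intercept, np.sqrt(param_cov[0, 0])
```

With a diagonal `cov` this reduces to ordinary weighted least squares; off-diagonal terms carry the temporal correlations between epochs.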

stack_rate.png:

linear_rate.png:

Thanks for this review @tfuhrmann . Your visual comparison gives me more confidence in the software - we are able to resolve known deformation signals. Interesting that linear_rate appears to resolve the deformation zones better than stack_rate. But yes there seems to be more noise in the linear_rate result. No masking has been applied in the algorithm; this could be done during post-processing by using the linear_samples or linear_rsquared layers as a mask. Your idea to add time series uncertainties to the linear regression is a good one. The only problem is we don't currently compute time series uncertainties!
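
The post-processing mask mentioned above could look something like the sketch below. The function name `mask_linear_rate` and both threshold defaults are hypothetical values chosen for illustration, not recommendations; the inputs are assumed to be the linear_rate, linear_samples and linear_rsquared layers already loaded as NumPy arrays of the same shape.

```python
import numpy as np


def mask_linear_rate(rate, samples, rsquared, min_samples=5, min_rsquared=0.5):
    """Return a copy of rate with poorly constrained pixels set to NaN.

    Thresholds are illustrative placeholders only.
    """
    # Pixels with NaN rsquared fail the comparison and are masked as well.
    keep = (samples >= min_samples) & (rsquared >= min_rsquared)
    return np.where(keep, rate, np.nan)
```

Because the input arrays are not modified, different thresholds can be tried against the same saved layers without rerunning the time series step.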

GeoscienceAustralia / PyRate

Linrate: Linear regression of cumulative displacement time series #285
