Closed basaks closed 4 years ago
Merging #288 into develop will decrease coverage by
1.33%
. The diff coverage is65.69%
.
@@ Coverage Diff @@
## develop #288 +/- ##
===========================================
- Coverage 84.50% 83.16% -1.34%
===========================================
Files 26 26
Lines 3413 3498 +85
Branches 534 547 +13
===========================================
+ Hits 2884 2909 +25
- Misses 429 491 +62
+ Partials 100 98 -2
Impacted Files | Coverage Δ | |
---|---|---|
pyrate/default_parameters.py | 100.00% <ø> (ø) |
|
pyrate/merge.py | 15.25% <9.52%> (-0.14%) |
:arrow_down: |
pyrate/main.py | 23.25% <20.00%> (+23.25%) |
:arrow_up: |
pyrate/core/logger.py | 36.36% <33.33%> (+1.06%) |
:arrow_up: |
pyrate/core/stack.py | 75.00% <40.00%> (-22.81%) |
:arrow_down: |
pyrate/prepifg.py | 50.78% <40.00%> (-0.81%) |
:arrow_down: |
pyrate/core/timeseries.py | 77.24% <43.18%> (-13.49%) |
:arrow_down: |
pyrate/conv2tif.py | 94.44% <66.66%> (ø) |
|
pyrate/core/shared.py | 93.88% <76.47%> (-0.43%) |
:arrow_down: |
pyrate/configuration.py | 91.87% <78.94%> (-4.25%) |
:arrow_down: |
... and 13 more |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update 4b8ae96...986e594. Read the comment docs.
At the user level, I am happy with the changes in this PR and it implements the user functionality requested by the team. I have pushed a few commits on top of the branch that make some minor house-keeping changes. The biggest change in these commits is deprecating the
tscal
variable, which this PR makes redundant.
I can see this is done.
I did some numeric comparison of this PR against
develop
, by comparing summary statistics of output geotiffs (gdalinfo -stats
) after running the fullworkflow
against the unit test data. The results are documented in this word doc. In summary, there are differences in the mean and std-dev statistics. The worst is a difference in the 6th decimal place (of a millimetre), which I think is acceptable given that the measurement precision of InSAR is no where near 0.000001 millimetres! However, an investigation in to why these differences are occurring for this PR is recommended please @basaks .
These minor differences are introduced due to us splitting the correct
and timeseries
and stack
steps. At the end of the correct
step we save the params on disc, and reuse the saved params during timeseries
and stack
. This introduces some floating point errors, changing the outputs slightly.
I also ran a benchmark to test the improvements made by parallelising the spatial and temporal filters. For this benchmark on
gadi
I used 153 Mexico ifgs, 128Gb RAM andmpirun -n 16 pyrate workflow ...
. Withapsest=0
, runtimes were 5m 33s and 5m 24s fordevelop
andsb/step4-refactor-process
respectively. Withapsest=1
, runtimes were 9m 40s and 8m 14s fordevelop
andsb/step4-refactor-process
respectively. This represents an improvement in efficiency of ~15%. Well done @basaks!
Allow me to elaborate on this. Using 16 cpus, the improvement in the APS correction step alone is from (9.40-5.33)m = 247s to (8.14-5.24)m = 170s, thus making an improvement of 247s/170s ~ 45%.
There is more scope to improve by using numpy arrays instead of dicts to gather outputs of each row.
Minor comments:
- when running using MPI, every process writes every log message. Is there a way to suppress this, so that only the control process writes log messages?
The MPI logging used to log only the master process in 2017, and it was later changed to log messages from all processes. I will add that change back in for the on screen log. Is it desirable to log every process log in the file logger?
- It is not obvious to the user that VCMT is saved to disk (it is buried in
correction.params
). Suggest saving the VCMT to a numpy array filevcmt.npy
Done!
All other issues raised addressed!
The additional changes by @basaks have addressed all the issue I raised!
I re-did my numeric comparison tests for one run (apsest=1
, cohmask=1
) - the results were identical. So saving VCMT to disk as vcmt.npy
indeed did not change a thing as suspected by @basaks. I am happy with the explanation for the changes at 6 d.p. level given by @basaks above, and accept them. They are only fractions of a millimetre in magnitude and won't trouble us
process
renamed tocorrect
, all references and tests updated.correct
does not computetimeseries
orstack
ingtimeseries
andstack
commands.cohfilelist
in the config file. If the list is present, multilooked/cropped coherence files are saved to disk, regardless of whether coherence masking is on or off (parametercohmask
in config file).tscal
config parameter, that was used previously to control time series execution. This is not needed anymore now there is atimeseries
step invokable on the command line.