GlacioHack / xdem

Analysis of digital elevation models (DEMs)
https://xdem.readthedocs.io/
MIT License
143 stars 41 forks source link

[POC]: Nan padding for statistics when the blockwise coregistration fails on certain subdivisions #604

Open quinto-clara opened 2 weeks ago

quinto-clara commented 2 weeks ago

Context

The purpose of this ticket is to implement a failure indication in the returns of the blockwise function regarding the statistics. However, data storage does not seem to be intuitive, which is why we will include a code exploration phase in this ticket.

Study

Implementation

  1. Code Modification:

    We will modify the section of the code where the statistics are collected. For points that fail (i.e., those not present in chunk_meta), we will store NaN values for the relevant statistics.

    Here is the modified code:

            for i in range(self.subdivision):
            if i not in chunk_meta:
                statistics.append(
                    {
                        "center_x": np.nan,
                        "center_y": np.nan,
                        "center_z": np.nan,
                        "x_off": np.nan,
                        "y_off": np.nan,
                        "z_off": np.nan,
                        "inlier_count": np.nan,
                        "nmad": np.nan,
                        "median": np.nan,
                    }
                )
            else:
                statistics.append(
                    {
                        "center_x": points[cpt_in_chunk, 0, 0],
                        "center_y": points[cpt_in_chunk, 1, 0],
                        "center_z": points[cpt_in_chunk, 2, 0],
                        "x_off": points[cpt_in_chunk, 0, 1] - points[cpt_in_chunk, 0, 0],
                        "y_off": points[cpt_in_chunk, 1, 1] - points[cpt_in_chunk, 1, 0],
                        "z_off": points[cpt_in_chunk, 2, 1] - points[cpt_in_chunk, 2, 0],
                        "inlier_count": chunk_meta[i]["inlier_count"],
                        "nmad": chunk_meta[i]["nmad"],
                        "median": chunk_meta[i]["median"],
                    }
                )
                cpt_in_chunk +=1

Tests

  1. Unit Tests:

    • Write tests to verify that, for a point that fails (i.e., not present in chunk_meta), the values inlier_count, nmad, and median return NaN.
    • Test the case where all points succeed to ensure that values are correctly retrieved from chunk_meta.
  2. Integration Tests:

    • Test the functionality of the blockwise function as a whole to ensure that there are no regressions in other parts of the code.
    • Verify that the overall statistics are calculated correctly and that it is possible to handle NaN values without causing errors.

Documentation

/ estimation 5d

rhugonnet commented 5 days ago

OK! For the "Tests" section: we could also verify that apply() running on a chunk with NaNs saved in fit() indeed does not apply any transformation.

Another point: I am not sure that the current implementation of BlockwiseCoreg runs for a coregistration method other than a translation (as its implementation precedes those). A rotation like ICP or any BiasCorr method might make it fail. It would be nice to quickly double-check that (which is a simple as adding other methods to this list in the tests here: https://github.com/GlacioHack/xdem/blob/595acb9bf2ba150232cf161fc09f8747c84b5602/tests/test_coreg/test_base.py#L865

If it does not work, we could verify if the new structure or a slight change in structure would allow all types of methods to run (it's a slightly different topic, it may be independent if we only modify the statistics here, but could be slightly entangled too)... :thinking:

And, in any case, if these methods are not supported, we should temporarily add a NotImplementedError during BlockwiseCoreg.__init__ for those methods. I might do that quick test + error raising in a separate PR if I find the time later this week.