ACCESS-Cloud-Based-InSAR / DockerizedTopsApp

Apache License 2.0
21 stars 2 forks source link

Update to python 3.10+ #162

Open cmarshak opened 7 months ago

cmarshak commented 7 months ago

Use the latest release of ISCE2 and allow for environments with python 3.9 - 3.11 (inclusive) in order to help make package management slightly easier in the long run. EOL for a python version is scheduled for 5 years after its initial release.

Note: when I resolve the environment on a linux machine, python 3.11 is used.

Pysolid does not yet support 3.12 (issue ticket). Below is the the command line output trying to resolve the environment with python 3.12:

create env topsapp_env
  /home/runner/micromamba-bin/micromamba create -n topsapp_env -y --log-level warning python=3.12 -f /home/runner/work/DockerizedTopsApp/DockerizedTopsApp/environment.yml
  error    libmamba Could not solve for environment specs
      The following packages are incompatible
      ├─ pysolid is installable with the potential options
      │  ├─ pysolid 0.1.2 would require
      │  │  └─ python_abi 3.6.* *_cp36m, which can be installed;
      │  ├─ pysolid [0.1.2|0.2.0|0.2.1|0.2.2|0.2.3] would require
      │  │  └─ python_abi 3.7.* *_cp37m, which can be installed;
      │  ├─ pysolid [0.1.2|0.2.0|...|0.3.2] would require
      │  │  └─ python_abi 3.8.* *_cp38, which can be installed;
      │  ├─ pysolid [0.1.2|0.2.0|...|0.3.2] would require
      │  │  └─ python_abi 3.9.* *_cp39, which can be installed;
      │  ├─ pysolid [0.2.1|0.2.2|...|0.3.2] would require
      │  │  └─ python_abi 3.10.* *_cp310, which can be installed;
      │  ├─ pysolid [0.2.2|0.2.3] would require
      │  │  └─ python_abi 3.8 *_pypy38_pp73, which can be installed;
      │  ├─ pysolid [0.2.2|0.2.3] would require
      │  │  └─ python_abi 3.9 *_pypy39_pp73, which can be installed;
      │  ├─ pysolid [0.2.3|0.3.0|0.3.1|0.3.2] would require
      │  │  └─ python_abi 3.11.* *_cp311, which can be installed;
      │  ├─ pysolid [0.3.0|0.3.1] would require
      │  │  └─ pypy3.8 >=7.3.11 , which can be installed;
      │  ├─ pysolid [0.3.0|0.3.1] would require
      │  │  └─ pypy3.9 >=7.3.11 , which can be installed;
      │  └─ pysolid 0.3.2 would require
      │     └─ pypy3.9 >=7.3.13 , which can be installed;
      └─ python 3.12**  is not installable because there are no viable options
         ├─ python [3.12.0|3.12.1] would require
         │  └─ python_abi 3.12.* *_cp312, which conflicts with any installable versions previously reported;
         └─ python 3.12.0rc3 would require
            └─ _python_rc, which does not exist (perhaps a missing channel).
  critical libmamba Could not solve for environment specs
  Error: Failed to execute ["/home/runner/micromamba-bin/micromamba create -n topsapp_env -y --log-level warning \"python=3.12\" -f /home/runner/work/DockerizedTopsApp/DockerizedTopsApp/environment.yml"]: Error: The process '/home/runner/micromamba-bin/micromamba' failed with exit code 1
  /home/runner/work/_actions/mamba-org/provision-with-micromamba/v15/dist/main/index.js:[66](https://github.com/ACCESS-Cloud-Based-InSAR/DockerizedTopsApp/actions/runs/7745636931/job/21122092723#step:4:68)[69](https://github.com/ACCESS-Cloud-Based-InSAR/DockerizedTopsApp/actions/runs/7745636931/job/21122092723#step:4:71)3
cmarshak commented 7 months ago

It appears that with newer netcdf4-python versions, we are running into the error documented here within packaging (i.e. writing isce2 outputs to netcdf).

Here is the exact error message:

2024-02-01 13:19:59,748 - standard_product_packaging - INFO - dataset name = ionosphere
2024-02-01 13:19:59,749 - standard_product_packaging - ERROR - NetCDF: Filter error: bad id or parameters or duplicate filter
2024-02-01 13:19:59,763 - standard_product_packaging - ERROR - Traceback (most recent call last):
  File "/mnt/leffe-data2/cmarshak/DockerizedTopsApp/isce2_topsapp/packaging_utils/nc_packaging.py", line 601, in main
    create_group(fid, group, fid_parent=fid)
  File "/mnt/leffe-data2/cmarshak/DockerizedTopsApp/isce2_topsapp/packaging_utils/nc_packaging.py", line 96, in create_group
    create_group(grp_id,subgroup,fid_parent)
  File "/mnt/leffe-data2/cmarshak/DockerizedTopsApp/isce2_topsapp/packaging_utils/nc_packaging.py", line 96, in create_group
    create_group(grp_id,subgroup,fid_parent)
  File "/mnt/leffe-data2/cmarshak/DockerizedTopsApp/isce2_topsapp/packaging_utils/nc_packaging.py", line 96, in create_group
    create_group(grp_id,subgroup,fid_parent)
  [Previous line repeated 1 more time]
  File "/mnt/leffe-data2/cmarshak/DockerizedTopsApp/isce2_topsapp/packaging_utils/nc_packaging.py", line 93, in create_group
    create_dataset(grp_id,dataset,fid_parent)
  File "/mnt/leffe-data2/cmarshak/DockerizedTopsApp/isce2_topsapp/packaging_utils/nc_packaging.py", line 417, in create_dataset
    write_dataset(fid,data,properties_data)
  File "/mnt/leffe-data2/cmarshak/DockerizedTopsApp/isce2_topsapp/packaging_utils/nc_packaging.py", line 136, in write_dataset
    dset = fid.createVariable(properties_data.name, str, ('matchup',), zlib=True)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "src/netCDF4/_netCDF4.pyx", line 2945, in netCDF4._netCDF4.Dataset.createVariable
  File "src/netCDF4/_netCDF4.pyx", line 4184, in netCDF4._netCDF4.Variable.__init__
  File "src/netCDF4/_netCDF4.pyx", line 2014, in netCDF4._netCDF4._ensure_nc_success
RuntimeError: NetCDF: Filter error: bad id or parameters or duplicate filter

I got similar error messages related to the variables that were encoded as lists.

Fixing that in this part of @dbekaert's code fixed the issue.

cmarshak commented 7 months ago

@mgovorcin

It appears the 3.11 only affects ionosphere computation.

This looks very annoying.

Here is what I did: https://github.com/ACCESS-Cloud-Based-InSAR/DockerizedTopsapp-Debugging-NBs/blob/dev/2_Compare_GUNWs.ipynb

If I swap out the "new" GUNW with a GUNW with the SRTM water mask, the difference is much bigger:

##########
/science/grids/corrections/derived/ionosphere/ionosphere
##########
The variable /science/grids/corrections/derived/ionosphere/ionosphere has error:  21839.172
The origin of the new dataset is:  -71.180933661 48.669267199000004
The origin of the g(old)en dataset is:  -71.180933661 48.669267199000004

##########
/science/grids/corrections/derived/ionosphereBurstRamps/ionosphereBurstRamps
##########
The variable /science/grids/corrections/derived/ionosphereBurstRamps/ionosphereBurstRamps has error:  4483.2397
The origin of the new dataset is:  -71.180933661 48.669267199000004
The origin of the g(old)en dataset is:  -71.180933661 48.669267199000004

This will need your attention.

mgovorcin commented 7 months ago

@mgovorcin

It appears the 3.11 only affects ionosphere computation.

This looks very annoying.

Here is what I did: https://github.com/ACCESS-Cloud-Based-InSAR/DockerizedTopsapp-Debugging-NBs/blob/dev/2_Compare_GUNWs.ipynb

If I swap out the "new" GUNW with a GUNW with the SRTM water mask, the difference is much bigger:

##########
/science/grids/corrections/derived/ionosphere/ionosphere
##########
The variable /science/grids/corrections/derived/ionosphere/ionosphere has error:  21839.172
The origin of the new dataset is:  -71.180933661 48.669267199000004
The origin of the g(old)en dataset is:  -71.180933661 48.669267199000004

##########
/science/grids/corrections/derived/ionosphereBurstRamps/ionosphereBurstRamps
##########
The variable /science/grids/corrections/derived/ionosphereBurstRamps/ionosphereBurstRamps has error:  4483.2397
The origin of the new dataset is:  -71.180933661 48.669267199000004
The origin of the g(old)en dataset is:  -71.180933661 48.669267199000004

This will need your attention.

@cmarshak :

  1. is the "new" GUNW iono processed with SRTM water mask? Did you compare iono results with SRTM vs new mask to exclude that as source of this difference?
  2. I suggest to do comparison over the land as part of python 3.11 version testing, the example you picked is pretty tricky one (southern side of river is completely decorrelated) and different masking could affect the iono output
cmarshak commented 7 months ago

Marin - sorry - it's a wall of text that's hard to sift through. The best source of what I did is the notebook I shared above. Please carefully look at that.

is the "new" GUNW iono processed with SRTM water mask? Did you compare iono results with SRTM vs new mask to exclude that as source of this difference?

The last PR that was merged introduced the ESA water mask. I am comparing output from that PR and this one. Here is junky output:

##########
/science/grids/corrections/derived/ionosphere/ionosphere
##########
The variable /science/grids/corrections/derived/ionosphere/ionosphere has error:  197.23857
The origin of the new dataset is:  -71.180933661 48.669267199000004
The origin of the g(old)en dataset is:  -71.180933661 48.669267199000004

##########
/science/grids/corrections/derived/ionosphereBurstRamps/ionosphereBurstRamps
##########
The variable /science/grids/corrections/derived/ionosphereBurstRamps/ionosphereBurstRamps has error:  95.959854
The origin of the new dataset is:  -71.180933661 48.669267199000004
The origin of the g(old)en dataset is:  -71.180933661 48.669267199000004

Then, I shared some additional output comparing output from the previous PR and this one that still had the SRTM mask. Again, the links have the GUNWs, so that made provenance easier.

I suggest to do comparison over the land as part of python 3.11 version testing, the example you picked is pretty tricky one (southern side of river is completely decorrelated) and different masking could affect the iono output.

We can do such studies possibly after we merge. I would ask you to considering taking a lead on this - it would simply involve running the Production and Test jobs and comparing.

The point for this PR is that we want to clarify now is why when updating from 3.9 to 3.11 with the exact same inputs we get two different outputs? It could be gdal. It could be something else.

This might be a good starting point: https://github.com/isce-framework/isce2/compare/v2.6.1...v2.6.3

This is the difference between the previous ISCE2 version we were using and the one now. Looks like runIon.py has some changes...

Hope to talk soon.

cmarshak commented 7 months ago

@mgovorcin - it appears your code frequently calls runIon and doesn't reproduce it.

Not sure what the implications are in the update but this seems like a pretty annoying issue: https://github.com/ACCESS-Cloud-Based-InSAR/DockerizedTopsApp/blob/dev/isce2_topsapp/iono_proc.py

Appreciate your help with this.

jhkennedy commented 1 month ago

Just a note: PySolid does support Python 3.12 now.