geoschem / integrated_methane_inversion

Integrated Methane Inversion workflow repository.
https://imi.readthedocs.org
MIT License
26 stars 22 forks source link

Remove `calc_sensi.py` and calculate sensitivities on XCH4 #252

Closed eastjames closed 3 weeks ago

eastjames commented 1 month ago

Name and Institution (Required)

Name: James East Institution: Harvard ACMG

Describe the update

This PR removes calc_sensi.py. The Jacobian calculation is moved completely to the operator and activated if build_jacobian == True. The operator is applied to the Jacobian runs, then sensitivities and the K matrix is calculated from XCH4 (columns) instead of CH4 (3D).

This results in a fairly large speed up in the inversion step since the I/O bottleneck in calc_sensi is skipped. For large state vectors, giving large memory and/or balancing cores with memory may be required.

This has been tested for the out-of-box case (optBC on, optOH off, normal errors), lognormal errors, optOH, and the Kalman filter out of box. For the Kalman filter, I get this error which I may need help with. @djvaron @laestrada @megan-he have you encountered this?

Traceback (most recent call last):
  File "/n/holylfs05/LABS/jacob_lab/Users/jeast/proj/imi/bugfixes/20240703_2dsensitivities/test_kf/integrated_methane_inversion/src/components/kalman_component/prepare_sf.py", line 176, in <module>
    prepare_sf(config_path, period_number, base_directory, nudge_factor)
  File "/n/holylfs05/LABS/jacob_lab/Users/jeast/proj/imi/bugfixes/20240703_2dsensitivities/test_kf/integrated_methane_inversion/src/components/kalman_component/prepare_sf.py", line 119, in prepare_sf
    lambda_scaler = current_total / nudged_total
                    ~~~~~~~~~~~~~~^~~~~~~~~~~~~~
ZeroDivisionError: float division by zero

All other cases are working and results match dev (for the out of box case, I haven't compared the others).

eastjames commented 1 month ago

Error mentioned in the initial PR is now fixed and this is ready to go once approved. It is working for Permian test case for normal errors, lognormal errors, lognormal errors+OH, and Kalman filter+normal errors.

eastjames commented 1 month ago

I changed all complevels from 9 to 1. 1 turns on compression, but is faster than 9. This will decrease I/O time throughout but still use compression to write files. https://unidata.github.io/netcdf4-python/#netCDF4.Dataset.createVariable

eastjames commented 1 month ago

This PR addresses #233

eastjames commented 1 month ago

Thanks for the review! Made the changes @laestrada, ready to go