spacetelescope / stpipe

https://stpipe.readthedocs.io
Other
3 stars 25 forks source link

conditionally format log_records #171

Closed braingram closed 3 months ago

braingram commented 3 months ago

Instead of storing every log message as a record, only store the formatted logs (if a formatter was provided). A formatter can be supplied by setting _log_records_formatter on the Step class/instance.

This prevents the Step._log_records from keeping references to args provided for logging calls. An example of a particularly problematic one is: https://github.com/spacetelescope/stpipe/blob/9952c365458fc3f01839a912b62ab6c637dc9179/src/stpipe/step.py#L429 This will (with stpipe main) store a reference to the input datamodel within the log record. This creates a chain of references that prevents garbage collecting args (and the input datamodel) because:

This means (on main) any datamodel used as an input to a Step will be kept in memory until the Pipeline is garbage collected.

This PR fixes the issue by formatting the LogRecord to a string before storing it in _log_records. This changes the API for Step.log_records so it now returns a list of strings (instead of a list of LogRecord instances). The only known user of log_records is romancal (which will require a minor update for this PR, more on that below).

The romancal CI failure here is expected as a minor change will be needed in romancal (see https://github.com/spacetelescope/romancal/pull/1327).

Fixes https://github.com/spacetelescope/stpipe/issues/169

jwst regtests with this PR all pass: https://plwishmaster.stsci.edu:8081/blue/organizations/jenkins/RT%2FJWST-Developers-Pull-Requests/detail/JWST-Developers-Pull-Requests/1616/pipeline/194

romancal regtests will require https://github.com/spacetelescope/romancal/pull/1327 to pass but pass with that PR: https://github.com/spacetelescope/RegressionTests/actions/runs/10063338609/job/27821316132

Plan for merging

Since romancal currently depends on stpipe main the merging of this PR and the romancal PR https://github.com/spacetelescope/romancal/pull/1327 will require some coordination. One possible option is to update the romancal PR to remove the dependency to my branch and then merge this stpipe PR then immediately merge the romancal PR.

I've opened both PRs for review but please hold off merging until both PRs are ready to go.

codecov[bot] commented 3 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 73.27%. Comparing base (89e8262) to head (a81a412). Report is 34 commits behind head on main.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #171 +/- ## ========================================== + Coverage 73.20% 73.27% +0.06% ========================================== Files 25 25 Lines 1911 1916 +5 ========================================== + Hits 1399 1404 +5 Misses 512 512 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

jdavies-st commented 3 months ago

This PR is good. Perhaps for a future PR, I would go one step further and get rid of any Step attributes that point to the log. There's no reason to store the log as an attribute in a class, as logs are global singletons. This is an anti-pattern to avoid:

https://docs.python.org/3/howto/logging-cookbook.html#using-loggers-as-attributes-in-a-class-or-passing-them-as-parameters

So really, there should not be a self.log in Step, nor should there be self.log_records at all.