spacetelescope / stdatamodels

https://stdatamodels.readthedocs.io
Other
5 stars 25 forks source link

Free references to overwritten attributes #301

Closed braingram closed 5 months ago

braingram commented 5 months ago

While looking into resample memory usage I found a way that repeated assignments to a datamodel attribute can result in accumulation of memory (until the model is written). An example is as follows:

import jwst.datamodels as dm
import numpy as np
m = dm.ImageModel()
sci = m.data
m.data = np.zeros_like(sci)  # assign a "zeros" array to data
m.data = np.ones_like(sci)   # overwrite the "zeros" with a new "ones" array

Running the example the "zeros" array should be garbage collectable by python after the data is overwritten with the "ones" array. However this is not the case on the current main branch with the latest asdf and instead both the "zeros" and "ones" arrays will persist in memory until the model is written to disk (or freed). This is due to the assignment validation calling custom_tree_to_tagged_tree with the DataModel._asdf AsdfFile instance: https://github.com/spacetelescope/stdatamodels/blob/4dfbff0bfb4e507502f793b4d199ec147f483d1c/src/stdatamodels/validate.py#L148 To validate the converted value, asdf has to generate a tree that contains a valid block number. This is accomplished by "queuing" the blocks to be written (even though in this case they are not immediately written, the order is needed). For normal file writes (where validation occurs) asdf uses a "options_context" to manage this write queue: https://github.com/asdf-format/asdf/blob/d46fb6b3e327dfd4318891cf2c1cf39c5dd36222/asdf/_asdf.py#L614 This PR adds a "options_context" to similarly manage the queue which prevents holding onto the overwritten array (a test is added to confirm the behavior). The "options_context" is not currently part of the asdf "public" API (but perhaps it should be considered for inclusion or changes to custom_tree_to_tagged_tree considered).

Regression tests: https://plwishmaster.stsci.edu:8081/job/RT/job/JWST-Developers-Pull-Requests/1426/ show only unrelated failures (same as main: https://plwishmaster.stsci.edu:8081/blue/organizations/jenkins/RT%2FJWST/detail/JWST/2877/tests)

Checklist

nden commented 5 months ago

Needs a rebase

codecov[bot] commented 5 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 66.27%. Comparing base (4d7c3a6) to head (5034026). Report is 26 commits behind head on main.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #301 +/- ## ========================================== + Coverage 64.84% 66.27% +1.43% ========================================== Files 103 101 -2 Lines 5694 5462 -232 ========================================== - Hits 3692 3620 -72 + Misses 2002 1842 -160 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.