Fix pickling - Githubissues

taldcroft commented 1 year ago

Description

It was noted by @jskrist that the guide_count attribute of an ACAReviewTable object was getting lost when using this object with parallel processing. The root cause was the same lost attribute when the object was pickled and unpickled.

This PR fixes that particular problem along with a couple of other similar problems that would have been an issue.

Also:

Adds tests for pickling and copying that were copied almost verbatim from proseco.
__init__() function was slightly re-organized for clarity.
is_OR() property was updated to use a more standard idiom so that the private self._is_OR attribute does not require explicit initialization.

Note:

Sparkles master (and this branch) currently require dev versions of packages, at least chandra_aca and cheta to pass tests.

Interface impacts

None

Testing

Unit tests

[ ] No unit tests
[x] Mac
[ ] Linux
[ ] Windows

Independent check of unit tests by [REVIEWER NAME]

[ ] [PLATFORM]:

Functional tests

From @jskrist :

I also checked out the branch on my Windows machine, and manually ran the checks in the test_pickle() function (on an independent ACACatalog), as the tests don't run correctly in my Windows environment. Additionally, I ran the code I wrote to parallelize the run_aca_review() step with the new changes and can confirm that with this fix, the multiprocessing code is now returning expected results instead of erroring.

taldcroft commented 1 year ago

@jskrist - I checked the multiprocessing docs and by default is uses pickle with the default protocol, which is currently 4 (see https://docs.python.org/3/library/pickle.html#data-stream-format). Is that what you mean by the binary version?

jskrist commented 1 year ago

@jskrist - I checked the multiprocessing docs and by default is uses pickle with the default protocol, which is currently 4 (see https://docs.python.org/3/library/pickle.html#data-stream-format). Is that what you mean by the binary version?

Thanks for looking into this. My question was even simpler than that. essentially, is there any meaningful difference between pickle.dumps and pickle.dump for this test? Do we have confidence that using loads(dumps()) is equivalent to using load(dump())?

taldcroft commented 1 year ago

@jskrist - yes, dumps() is equivalent to dump() : https://docs.python.org/3/library/pickle.html#pickle.dumps : "Return the pickled representation of the object obj as a bytes object, instead of writing it to a file."

taldcroft commented 1 year ago

@jskrist - and thanks for the functional testing, I put that into the top description.

sot / sparkles

Fix pickling #186

Description

Interface impacts

Testing

Unit tests

Functional tests