uqfoundation / dill

serialize all of Python
http://dill.rtfd.io
Other
2.24k stars 178 forks source link

dumps results are not reproducible, roundtrip equality checking fails with closures #481

Open AaronFriel opened 2 years ago

AaronFriel commented 2 years ago

Summary

The Pulumi Python library uses dill to serialize "dynamic providers" to state, and relied upon a deterministic serialization that allowed us to do a comparison of the bytes and determine if two pickled classes were the same or different.

It looks like something between 0.3.4 and 0.3.5.1 changed the serialization and two of our unit tests began failing.

Resolution

I'd appreciate any information on how to continue tracking dill upstream and return to deterministic behavior. I think it may be due to a closure capturing of the provider we're testing:

https://github.com/pulumi/pulumi/blob/48f43906f4af8bd94029737cc4b64e20d8b15a14/tests/integration/enums/python/__main__.py#L18-L25

As well, the topic of supply chain security might be relevant. Non-reproducible "pickle" results mean users cannot rely on hashes or proof-by-construction (equality checking) to verify that a serialized Python class/function is the same as another. It might be helpful to have a reproducible mode or subset in dill.

Reproduction

Expected behavior

Pickling the same class in two runs of the Python program should return the same code each time.

Actual behavior

We see that the base64-encoded pickling is different:

First run:

gASVLgIAAAAAAACMCmRpbGwuX2RpbGyUjAxfY3JlYXRlX3R5cGWUk5QoaACMCl9sb2FkX3R5cGWUk5SM BHR5cGWUhZRSlIwNUGxhbnRQcm92aWRlcpSMFnB1bHVtaS5keW5hbWljLmR5bmFtaWOUjBBSZXNvdXJj ZVByb3ZpZGVylJOUhZR9lCiMB19fZG9jX1+UTowKX19tb2R1bGVfX5SMCF9fbWFpbl9flIwNX19zbG90 bmFtZXNfX5RdlIwGY3JlYXRllGgAjBBfY3JlYXRlX2Z1bmN0aW9ulJOUKGgAjAxfY3JlYXRlX2NvZGWU k5QoSwJLAEsASwJLA0tDQxZ0AGQBNwBhAHQBdAJ0AIMBfAGDAlMAlE5LAYaUjApjdXJyZW50X2lklIwM Q3JlYXRlUmVzdWx0lIwDc3RylIeUjARzZWxmlIwGaW5wdXRzlIaUjDcvdG1wL3AtaXQtemluZy1weXRo b24tNzc3ODVmMzYtMjE1MjQzNjM0MS8uL19fbWFpbl9fLnB5lGgTSxZDBAACCAGUKSl0lFKUfZSMCF9f bmFtZV9flGgQc2gTTk50lFKUfZR9lCiMD19fYW5ub3RhdGlvbnNfX5R9lIwMX19xdWFsbmFtZV9flIwU UGxhbnRQcm92aWRlci5jcmVhdGWUdYaUYmglKIwKY3VycmVudF9pZJRLAIwMQ3JlYXRlUmVzdWx0lGgJ aBuTlIwDc3RylGgEaByFlFKUdTB1dJRSlCmBlC4=

Second run:

gASVLgIAAAAAAACMCmRpbGwuX2RpbGyUjAxfY3JlYXRlX3R5cGWUk5QoaACMCl9sb2FkX3R5cGWUk5SM BHR5cGWUhZRSlIwNUGxhbnRQcm92aWRlcpSMFnB1bHVtaS5keW5hbWljLmR5bmFtaWOUjBBSZXNvdXJj ZVByb3ZpZGVylJOUhZR9lCiMB19fZG9jX1+UTowKX19tb2R1bGVfX5SMCF9fbWFpbl9flIwNX19zbG90 bmFtZXNfX5RdlIwGY3JlYXRllGgAjBBfY3JlYXRlX2Z1bmN0aW9ulJOUKGgAjAxfY3JlYXRlX2NvZGWU k5QoSwJLAEsASwJLA0tDQxZ0AGQBNwBhAHQBdAJ0AIMBfAGDAlMAlE5LAYaUjApjdXJyZW50X2lklIwM Q3JlYXRlUmVzdWx0lIwDc3RylIeUjARzZWxmlIwGaW5wdXRzlIaUjDcvdG1wL3AtaXQtemluZy1weXRo b24tNzc3ODVmMzYtMjE1MjQzNjM0MS8uL19fbWFpbl9fLnB5lGgTSxZDBAACCAGUKSl0lFKUfZSMCF9f bmFtZV9flGgQc2gTTk50lFKUfZR9lCiMD19fYW5ub3RhdGlvbnNfX5R9lIwMX19xdWFsbmFtZV9flIwU UGxhbnRQcm92aWRlci5jcmVhdGWUdYaUYmglKIwMQ3JlYXRlUmVzdWx0lGgJaBuTlIwDc3RylGgEaByF lFKUjApjdXJyZW50X2lklEsAdTB1dJRSlCmBlC4=

Background

In the Pulumi infrastructure as code tool we have a concept of "dynamic" resource providers, wherein the user can declare some code in our SDKs such as Python as code. This allows users to create custom wrappers around cloud infrastructure, RESTful APIs, etc, which are then managed as resources. To enable our engine to determine when the resource needs to be changed, we calculate the diff and "ask" the provider if the state should remain as-is, be updated in place, or replaced (recreated and deleted).

We do have a wrapper around dill & pickle that ensure dictionary keys are serialized in a deterministic order, but it appears that something changed such that our wrapper is no longer reproducible across multiple runs of the program. Our wrapper is below:

https://github.com/pulumi/pulumi/blob/06ba63bb57e90706c1550861b785075ae860144a/sdk/python/lib/pulumi/dynamic/dynamic.py#L212-L239

anivegesana commented 2 years ago

I can see the difference in the two pickles: https://www.diffchecker.com/ivKmWJ8A

It seems to be that the _setitems I used in order to pickle functions that share the same global dictionary is non-deterministic. These are the lines in the code that circumvent your save_dict_sorted function.

https://github.com/uqfoundation/dill/blob/78e3d55e0407eeb127cd3922f32b97e82ed56eaa/dill/_dill.py#L1968-L1981

I am currently working on a deterministic flag for dill that will fix the problem. Follow #19 for progress.

For now, downgrade to dill 0.3.4.

anivegesana commented 2 years ago

I think a solution that will not require me rushing to get this to work but will not break compatibility with other versions of dill or your library, is to overwrite pickle.Pickler. _batch_setitems to sort the items.

    unsorted_batch_setitems = pickle.Pickler._batch_setitems

    def _batch_setitems(self, items):
        unsorted_batch_setitems(sorted(items))

    pickle.Pickler._batch_setitems = _batch_setitems

I hope this is able to fix the problem in pulumi's case while I work on the more general deterministic flag.

AaronFriel commented 2 years ago

That's fantastic! A deterministic mode is exactly what we'd want to use.

We're going to implement a workaround for now and I'll keep an eye on this for determinism. Should I watch this issue or another?

anivegesana commented 2 years ago

I'll post a link to my PR here and in the other issue when I am done. If you want to proactively support it when it comes out, you can add the following line to your code:

dill.settings['deterministic'] = True

I think the choice of name for the setting is not likely to change, but it could, so I'll inform you if it does. The deterministic mode will also have other side effects like dropping names for unnamed functions, like lambdas or functions in IPython notebooks. I'll describe what all of the side effects are in the PR. Continue using the workaround instead of the deterministic mode if those additional side effects are not desired.

mmckerns commented 2 years ago

@anivegesana: where's your PR for deterministic mode? I think that needs some major discussion. Deterministic hashes from pickles are super useful, and we need to be careful that changes to dill aren't moving toward generating less deterministic pickles, unless there's a very good reason. I expect the recent changes have broken all sorts of hashing codes.

anivegesana commented 2 years ago

Don't have any code yet, but am currently discussing it with @davips on #19. The plan is to make set, frozenset, dict, and code deterministic and everything else should be deterministic if that happens.

AaronFriel commented 2 years ago

@mmckerns fwiw, Pickle already behaved in a non-deterministic way that we had to work around by modifying the global pickle object.

If a deterministic flag did that for consumers of Pickle via dill, I think that would be a popular default.