gilch / hissp

It's Python with a Lissp.
https://gitter.im/hissp-lang/community
Apache License 2.0
364 stars 9 forks source link

Dill integration #250

Closed gilch closed 7 months ago

gilch commented 7 months ago

Hissp falls back to pickle when it needs to compile an object Python has no literal notation for, which covers a broad range of objects in a very general way. This is part of Hissp's commitment to homoiconicity. However, there are object types even pickle can't handle.

But most of those can be handled by dill. Unfortunately, pickle is hardcoded in the compiler. I did consider dill when writing the compiler, but that would violate the no-dependency rule, a major project goal of Hissp. That's not going to change. I figured anyone willing to add dependencies who needed it could simply subclass the compiler and override the pickle method. Like most of the compiler, it's a reasonably small method, so this is quite doable.

However, dill was designed as a drop-in replacement for pickle and it's actually much easier to monkeypatch it in. It's a two-liner in Lissp:

(attach hissp.compiler. : pickle dill.)
(attach pickle. dill..loads)

Monkeypatching is notoriously brittle, because it almost by definition depends on implementation details. I control what's considered implementation and what's considered interface in Hissp, so I can totally document the first line as supported. The second line is a potential problem, however. That's monkeypatching the standard library. This might be OK, since dill can load everything pickle can. It's unlikely to break anything else depending on pickle. But it might.

If there was a way to make Hissp emit __import__('dill').loads( instead of __import__('pickle').loads(, then you wouldn't have to monkeypatch the pickle module. It's currently hardcoded in an f-string. You can, of course, monkeypatch the whole method, but that's not nearly as easy as the two-liner.

gilch commented 7 months ago

The simplest approach would be to factor out the pickle into a constant. That would allow monkeypatching to work on the hissp package alone without rewriting the whole method. You still have to patch two places though.

The most correct might be to factor it out into a ContextVar, which might be the closest thing the Python standard library has to dynamic vars. It's arguably not a monkeypatch at that point. But do we really need that? Would we ever want to use dill and pickle in the same session? Maybe, Hissp is not necessarily run where it is compiled. I don't think this buys us much over a constant though. If you need that degree of control, unittest.mock.patch can swap the context for you as long as you're not threading or doing async shenanigans.

The above is arguably still a monkeypatch because it still needs the pickle global to be replaced. Ideally, that would read from the same ContextVar instead, but that would be a bit awkward in any place the compiler is using pickle, which brings us to the third option: infer the import name pickle from the .__name__ attr of the pickle module. This arguably fixes a DRY violation, where the same bit of information is stored in another place, although one too minor to normally bother with considering its greater overhead. However, this would make the first attach line all you need to monkeypatch to enable dill. Let's go with that for now.