lordmauve / chopsticks

Chopsticks is an orchestration library: it lets you execute Python code on remote hosts over SSH.
https://chopsticks.readthedocs.io/
Apache License 2.0
158 stars, 16 forks

Use binary encoding for all data transfer #32

Open lordmauve opened 7 years ago

lordmauve commented 7 years ago

In 21722bafbb8470edfdd24a4cca50a1666eb72386 a binary encoding was added, which is used for sending structured data from the host to the client. The motivation there was to avoid costly base64-in-JSON encoding and decoding for binary data, a cost that is amplified when tunnelling because it would otherwise be paid at each hop.

However, our own encoding gives us the opportunity to safely support all Python primitive types, rather than being limited to JSON. The current encoding can already distinguish between lists and tuples, and between bytes and strings. However, lots of interesting data structures are precluded by being limited to JSON - frozenset-keyed dicts, for example! Meanwhile, we do not get the human-readable benefit of JSON, as it is already very hard to inspect the messages being passed.
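To make the idea concrete, here is a minimal sketch of a type-tagged encoding. It is not the format added in the commit above: the tag bytes, the `encode`/`decode` names and the 4-byte length prefix are all assumptions chosen for illustration, and floats are left out for brevity.

```python
import struct

# Hypothetical one-byte tags; not the tags chopsticks actually uses.
SEQ_TAGS = {list: b"l", tuple: b"t", set: b"S", frozenset: b"f"}
SEQ_TYPES = {tag: typ for typ, tag in SEQ_TAGS.items()}


def _sized(tag, body):
    """Prefix a body with its tag and a 4-byte big-endian length."""
    return tag + struct.pack("!I", len(body)) + body


def encode(obj):
    """Serialise obj to bytes, keeping the exact Python type of each value."""
    t = type(obj)
    if obj is None:
        return b"n"
    if t is bool:
        return b"T" if obj else b"F"
    if t is int:
        return _sized(b"i", str(obj).encode("ascii"))
    if t is bytes:
        return _sized(b"b", obj)
    if t is str:
        return _sized(b"s", obj.encode("utf-8"))
    if t in SEQ_TAGS:
        # For containers the length field is the item count, not a byte count.
        items = b"".join(encode(x) for x in obj)
        return SEQ_TAGS[t] + struct.pack("!I", len(obj)) + items
    if t is dict:
        pairs = b"".join(encode(k) + encode(v) for k, v in obj.items())
        return b"d" + struct.pack("!I", len(obj)) + pairs
    raise TypeError("cannot encode %r" % t)


def _decode(buf, pos):
    """Decode one value starting at pos; return (value, next position)."""
    tag = buf[pos:pos + 1]
    pos += 1
    if tag in (b"n", b"T", b"F"):
        return {b"n": None, b"T": True, b"F": False}[tag], pos
    (n,) = struct.unpack_from("!I", buf, pos)
    pos += 4
    if tag in (b"i", b"b", b"s"):
        raw = buf[pos:pos + n]
        pos += n
        if tag == b"i":
            return int(raw.decode("ascii")), pos
        return (raw if tag == b"b" else raw.decode("utf-8")), pos
    if tag in SEQ_TYPES:
        items = []
        for _ in range(n):
            item, pos = _decode(buf, pos)
            items.append(item)
        return SEQ_TYPES[tag](items), pos
    if tag == b"d":
        result = {}
        for _ in range(n):
            key, pos = _decode(buf, pos)
            value, pos = _decode(buf, pos)
            result[key] = value
        return result, pos
    raise ValueError("unknown tag %r" % tag)


def decode(buf):
    value, _ = _decode(buf, 0)
    return value


# A frozenset-keyed dict holding bytes, a string, a tuple and a list.
value = {frozenset([1, 2]): (b"\x00\xff", "text", [1, 2]), "k": None}
roundtrip = decode(encode(value))
```

Passing the same `value` to `json.dumps` raises `TypeError`, whereas here the tuple comes back as a tuple and the bytes come back as bytes with no base64 step.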

One problem with this proposal is that the encoding code will need to be present both on the orchestration host and in the bubble. We ought to achieve this without copying and pasting. If we put it in a separate file, we may be able to simply prepend it to the bubble.py code.
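The prepending approach could look roughly like the sketch below. The `build_bubble_source` helper and the `encoding.py` file name are hypothetical, and the two files written here are tiny stand-ins rather than the real chopsticks sources.

```python
import os
import tempfile


def build_bubble_source(encoding_path, bubble_path):
    """Return the bubble source with the shared encoding source prepended."""
    with open(encoding_path) as f:
        encoding_src = f.read()
    with open(bubble_path) as f:
        bubble_src = f.read()
    return encoding_src.rstrip("\n") + "\n\n" + bubble_src


# Demonstration with stand-in files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    enc = os.path.join(tmp, "encoding.py")  # hypothetical shared module
    bub = os.path.join(tmp, "bubble.py")    # stand-in for the real bubble.py
    with open(enc, "w") as f:
        f.write("def encode(obj):\n    return repr(obj).encode('utf-8')\n")
    with open(bub, "w") as f:
        # The bubble code uses encode() without importing it.
        f.write("RESULT = encode([1, 2])\n")
    combined = build_bubble_source(enc, bub)

# The combined text runs as one module, so encode() is already defined
# by the time the bubble code executes.
namespace = {}
exec(combined, namespace)
```

The host side would import the shared file as a normal module, so there is a single copy of the code. One caveat with plain concatenation: any `from __future__` import has to sit at the very top of the combined source, so such imports would need to live in the prepended file or be handled separately.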

We should also profile the encoding in comparison to JSON to avoid a possible performance regression.
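A comparison along these lines could start from a small `timeit` harness like the one below. The `encode_binary` function is only a placeholder (a length prefix plus the raw bytes) standing in for the real encoder, and the payload size and repeat counts are arbitrary assumptions.

```python
import base64
import json
import struct
import timeit


def encode_json(data):
    """Baseline: binary data has to be base64-encoded to fit in JSON."""
    text = base64.b64encode(data).decode("ascii")
    return json.dumps({"data": text}).encode("ascii")


def encode_binary(data):
    """Placeholder for the binary encoding: length prefix plus raw bytes."""
    return struct.pack("!I", len(data)) + data


def profile(funcs, payload, number=200, repeat=3):
    """Return the best total time in seconds for each encoder, keyed by name."""
    return {
        func.__name__: min(
            timeit.repeat(lambda: func(payload), number=number, repeat=repeat)
        )
        for func in funcs
    }


timings = profile([encode_json, encode_binary], bytes(100000))
for name, seconds in sorted(timings.items()):
    print("%-14s %.6f s" % (name, seconds))
```

A useful run would swap in the real encoder and decoder and cover small structured messages as well as large binary payloads, since a pure-Python encoder may well lose to the C-accelerated `json` module on the former.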