IDEO-coLAB / pyfi

Quickly make Python functions available in node, or to a frontend client with pyfi-client: https://github.com/ideo-colab/pyfi-client
MIT License

Pyfi fails when handling strings that are too long #3

Open pswoodworth opened 6 years ago

pswoodworth commented 6 years ago

The python JSON parser appears to be running into issues with strings that are too long inside of python. Needs some further investigation.

pswoodworth commented 6 years ago

On further investigation, this seems to be an issue with how the OS handles buffering of stdout pipes. When the stdout buffer overflows, it gets flushed to the pipe, which splits the data at whatever byte the overflow happens to hit. The result is that it breaks either when python/js tries to convert the bytes to a string (if the split landed in the middle of a character) or when parsing JSON from the string (if the split landed at the end of a character).
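To make the two failure modes concrete, here is a minimal sketch that fakes the split by slicing a UTF-8 encoded JSON payload by hand. It doesn't touch pyfi or a real pipe; `try_parse` and the sample payload are made up purely for illustration.

```python
import json

# A JSON payload containing a multi-byte UTF-8 character ("é" is two bytes).
payload = json.dumps({"text": "héllo"}, ensure_ascii=False).encode("utf-8")

def try_parse(chunk: bytes) -> str:
    """Hypothetical helper: treat one received chunk as a whole message."""
    try:
        text = chunk.decode("utf-8")
    except UnicodeDecodeError:
        return "decode error"   # split landed in the middle of a character
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return "json error"     # split landed between characters
    return "ok"

# Byte offset where the two-byte "é" starts.
start = payload.index("é".encode("utf-8"))

whole = try_parse(payload)                   # complete message
mid_char = try_parse(payload[:start + 1])    # cut after the first byte of "é"
between = try_parse(payload[:start])         # cut just before "é"
```

Here `whole` parses fine, `mid_char` fails at the bytes-to-string step, and `between` decodes but fails in the JSON parser, which matches the two kinds of breakage described above.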

There doesn't seem to be a straightforward way to increase the buffer size at the OS level, so we may be stuck with solving it in code.

Options there seem to be:

1) Split long strings before piping them across, and indicate somehow that the string has been split. This has the advantage that it should be pretty explicit, but the disadvantage that we'll essentially be guessing at what point the error occurs, i.e. if the buffer size changes across systems or dynamically on a single system we could still end up hosed.

2) Try to create our own buffering inside js + python. This has the advantage that it should work regardless of system conditions: the system will flush the buffer whenever it wants to, and our code can handle it. The disadvantage is that, since neither the sender nor the receiver knows for sure where a bytestring was split or will be split, the receiver will essentially just need to assume that data is arriving in the order it was sent in, which is potentially a pretty brittle assumption, particularly given that we're multithreading python operations.
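A rough sketch of what option 2 could look like on the receiving side, assuming (this is not something pyfi does today) that the sender writes each JSON message followed by a single newline. `LineBuffer` is a hypothetical name. The receiver keeps raw bytes around until it sees a newline, and only then decodes and parses, so it doesn't matter where the OS split the data.

```python
import json

class LineBuffer:
    """Hypothetical receiver-side buffer for newline-delimited JSON."""

    def __init__(self):
        self._pending = b""  # bytes received so far without a closing newline

    def feed(self, chunk: bytes) -> list:
        """Add a chunk; return any messages that are now complete."""
        self._pending += chunk
        # Everything before the last newline is complete; the rest waits.
        *complete, self._pending = self._pending.split(b"\n")
        return [json.loads(line.decode("utf-8")) for line in complete if line]

# Simulate a long message arriving in arbitrary 7-byte pieces.
original = {"text": "é" * 1000}
wire = json.dumps(original, ensure_ascii=False).encode("utf-8") + b"\n"

buf = LineBuffer()
received = []
for i in range(0, len(wire), 7):
    received.extend(buf.feed(wire[i:i + 7]))
```

This relies on `json.dumps` escaping newlines inside strings, so a raw newline byte only ever shows up as the delimiter. It still assumes that only one message is being written to the pipe at a time; with multiple python threads writing, the writes would presumably need to be serialized (e.g. behind a lock) so that messages don't interleave.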