Use Child process

This branch uses a node child process to manage python services.

It builds on the websocket work, so you can call a service either via POST (in which case you get the result but not the logs) or via websocket from the CLI (in which case you get both the logs and the result).

The server always logs all statements to stdout and stderr (error reporting is pretty good, I think). But if a websocket is connected, print statements do not get sent through it, so print can be reserved for server-side logging.

I may later decide to ignore debug logs, so that we can still use logger.debug on the server. Or I'll add a special logger or something.
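
A rough sketch of how such a filter could look, in case it helps the discussion. It assumes log lines use Python's default logging format (LEVEL:name:message); the actual format and regex in this branch may differ, and lines_for_client and the "job" logger name are made up for illustration.

```python
import re

# Assumption: lines produced by logger.* calls look like "LEVEL:name:message"
# (Python's default logging format). Adjust the pattern to the real format.
LOGGER_LINE = re.compile(r"^(DEBUG|INFO|WARNING|ERROR|CRITICAL):")


def lines_for_client(lines, include_debug=True):
    """Return only the lines that look like logger output."""
    kept = []
    for line in lines:
        match = LOGGER_LINE.match(line)
        if not match:
            continue  # plain print output stays on the server
        if match.group(1) == "DEBUG" and not include_debug:
            continue  # optionally keep debug logs server-side as well
        kept.append(line)
    return kept


output = [
    "INFO:job:starting job",
    "loaded config from disk",  # came from print()
    "DEBUG:job:cache miss",
]
forwarded = lines_for_client(output, include_debug=False)
```

With include_debug=False only the INFO line is forwarded, which would leave logger.debug free for server-only use.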

This ensures that every python file runs in isolation. Over the last week I've come to realise that this is super important - I really, really don't want python devs to worry about whether credentials, state, clients or loggers leak between runs. The only way to ensure this is to run in a pure environment each time, which basically means spawning a python child process directly.

This approach is right "on the metal", calling out to poetry run directly to execute our python scripts.
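
To illustrate why a fresh process per run prevents leaks, here is a small sketch. Note the swap: this branch spawns from node via poetry run, whereas the sketch uses Python's subprocess with sys.executable purely to show the principle. The job script and the EXAMPLE_JOB_TOKEN variable are hypothetical.

```python
import subprocess
import sys

# Hypothetical job: reports whether state from an earlier run is visible,
# then sets some state that would leak if the interpreter were reused.
JOB = """
import os
print("leaked" if os.environ.get("EXAMPLE_JOB_TOKEN") else "clean")
os.environ["EXAMPLE_JOB_TOKEN"] = "secret"
"""


def run_job():
    # A new interpreter per call: nothing set by one run survives to the next.
    done = subprocess.run(
        [sys.executable, "-c", JOB],
        capture_output=True,
        text=True,
        check=True,
    )
    return done.stdout.strip()


first = run_job()
second = run_job()
```

Both runs should report a clean environment, because the variable set by the first child dies with that process.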

The result of the job is written to a file from python, and read back in from bun to be sent back. The file is deleted upon completion.
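
The file handoff could be sketched like this. The reader is bun in this branch; it is written in Python here only to keep the example in one language. The function names, the temp file and the result payload are all assumptions for illustration.

```python
import json
import os
import tempfile


def write_result(path, result):
    # Python side: serialise the job result to the agreed file.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(result, f)


def read_and_delete(path):
    # Server side: read the result back, then always clean up the file.
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    finally:
        os.remove(path)


fd, result_path = tempfile.mkstemp(suffix=".json")
os.close(fd)
write_result(result_path, {"status": "ok", "items": [1, 2, 3]})
result = read_and_delete(result_path)
```

Deleting in a finally block means the file is removed even if the JSON turns out to be malformed.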

Closes #55 #51

Issues

Some of these are aesthetic, but still:

- I have to use node child process, not bun child process, to get line-by-line logging. It doesn't really matter but it bugs me. There should also be a better way to stream the logs from bun, really.
- [Done - simple regex filter will only send logger. lines] ~All python logs get sent back to the CLI. In the older implementation, using logger.info in python would go back to apollo, but using print would only go to stdout. I quite liked this divide: it means I can log some stuff for system devs and some stuff for end users. I suppose we could still do that by namespacing/prefixing logs and ignoring system logs~
- [all sorted] ~entry is now called from main through the server. I've got two problems with this~
  - We're sending JSON through the command argument, which is kinda wild. There must be a limit to how much data we can send this way, and I suspect that big openapi payloads will blow it
  - The server and direct python calls both invoke entry.py through __main__, but they send input data differently (the py command uses a file path). This difference is really irritating.
  - I suppose both issues are reconciled if the server dumps input to file and then python reads it back in
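
A minimal sketch of the "dump input to file, read it back in python" idea from the last point. The --input flag, the load_input helper and the "echo" service name are hypothetical, not the actual entry.py interface.

```python
import argparse
import json
import os
import tempfile


def parse_args(argv):
    # Hypothetical CLI: the server and direct calls both pass a file path,
    # never raw JSON on the command line.
    parser = argparse.ArgumentParser()
    parser.add_argument("service")
    parser.add_argument("--input", required=True, help="path to a JSON input file")
    return parser.parse_args(argv)


def load_input(argv):
    args = parse_args(argv)
    with open(args.input, encoding="utf-8") as f:
        return args.service, json.load(f)


# Caller side: dump the payload to a temp file and pass only its path.
payload = {"spec": "x" * 200_000}  # big enough to be awkward as an argument
with tempfile.NamedTemporaryFile(
    "w", suffix=".json", delete=False, encoding="utf-8"
) as f:
    json.dump(payload, f)

service, data = load_input(["echo", "--input", f.name])
os.remove(f.name)
```

Passing a path keeps the command line short regardless of payload size, and gives both callers the same input convention.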

OpenFn / apollo

Use Child process #61
