Allow `capture_all` to be passed to `stream()`

craigwalton-dsit commented 2 hours ago

What is the feature and why do you need it: I'd like to be able to set `capture_all=False` when making a `stream(client.connect_get_namespaced_pod_exec, ...)` call. `capture_all` is a parameter which your `WSClient` initialiser accepts (source).

I sometimes stream large amounts of data over stdout, something like this:

ws_client = stream(
    client.connect_get_namespaced_pod_exec,
    name="my-pod",
    namespace="my-namespace",
    command=["cat", "my-file.txt"],
    stdin=False,
    stdout=True,
    stderr=True,
    binary=True,
    _preload_content=False,
    _request_timeout=60,
)
while ws_client.is_open():
    ws_client.update(timeout=1)
    if ws_client.peek_stdout():
        frame = ws_client.read_stdout()
        # write frame to file-like object

I want to avoid holding all of the stdout in memory at once. At present, unless I call `ws_client.read_all()` inside my loop, all of the stdout and stderr data builds up in `WSClient._all`.
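To make the build-up concrete, here is a minimal sketch using a toy stand-in rather than the real `WSClient`. The names `ToyClient`, `_DiscardIO`, `feed` and `buffered_after_streaming` are hypothetical, and the two-buffer layout is an assumption based on the behaviour described above, not a copy of the library's code.

```python
import io

STDOUT_CHANNEL = 1  # channel number chosen for this toy model


class _DiscardIO:
    """Hypothetical write sink that throws data away."""

    def write(self, data):
        pass

    def getvalue(self):
        return ""


class ToyClient:
    """Toy model: per-channel buffers plus an optional combined copy."""

    def __init__(self, capture_all=True):
        self._channels = {}
        self._all = io.StringIO() if capture_all else _DiscardIO()

    def feed(self, channel, data):
        # Every frame goes to its channel buffer and to the combined buffer.
        self._all.write(data)
        self._channels[channel] = self._channels.get(channel, "") + data

    def read_stdout(self):
        # Draining the channel does not touch the combined buffer.
        return self._channels.pop(STDOUT_CHANNEL, "")


def buffered_after_streaming(capture_all, frames=1000, frame_size=1024):
    """Stream frames, consume each one, and report what is still held in _all."""
    client = ToyClient(capture_all=capture_all)
    for _ in range(frames):
        client.feed(STDOUT_CHANNEL, "x" * frame_size)
        client.read_stdout()  # consume each frame as it arrives
    return len(client._all.getvalue())
```

In this toy, `buffered_after_streaming(True)` still holds every byte that was streamed even though each frame was read, while `buffered_after_streaming(False)` holds nothing.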

Describe the solution you'd like to see: The easiest approach looks to be: adapt the `_websocket_request()` function (source) to `pop()` `capture_all` from `kwargs`, as is already done with `binary`.
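A rough sketch of the shape of that change, written against stand-ins so it runs without a cluster. `fake_websocket_call`, `toy_api_method` and `toy_websocket_request` are hypothetical names, and the default of `True` for `capture_all` is an assumption; this is not the library's actual `_websocket_request()`.

```python
import functools


def fake_websocket_call(configuration, method, url, **kwargs):
    """Hypothetical stand-in for the request function; echoes its keyword arguments."""
    return dict(kwargs)


def toy_api_method(request, **kwargs):
    """Hypothetical stand-in for a generated API method."""
    # In this toy, the API method must never see the websocket-only options.
    assert "capture_all" not in kwargs and "binary" not in kwargs
    return request("GET", "wss://example.invalid/exec", **kwargs)


def toy_websocket_request(request_fn, api_method, **kwargs):
    # Pop both options so they are not passed on to the API method...
    binary = kwargs.pop("binary", False)
    capture_all = kwargs.pop("capture_all", True)  # assumed default
    # ...and bind them onto the request function instead.
    bound = functools.partial(
        request_fn, "configuration", binary=binary, capture_all=capture_all
    )
    return api_method(bound, **kwargs)


seen = toy_websocket_request(
    fake_websocket_call, toy_api_method,
    name="my-pod", binary=True, capture_all=False,
)
default_seen = toy_websocket_request(
    fake_websocket_call, toy_api_method, name="my-pod"
)
```

Here `seen` shows `capture_all=False` reaching the request function alongside `binary=True`, while callers that omit it keep the capturing behaviour.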

Apologies if I'm missing a trick here and thanks for the invaluable library!

ofrzeta commented 2 hours ago

Hi, does neither of the approaches here work for you? https://github.com/kubernetes-client/python/blob/master/examples/pod_exec.py You can either exec with `_preload_content=False` or continuously read from the stream response.

craigwalton-dsit commented 2 hours ago

Hi @ofrzeta, thanks so much for your quick response!

Those examples have indeed been very helpful. However, I have already set `_preload_content=False` (as above) and am continuously reading from stdout, which does keep the memory footprint of `WSClient._channels[STDOUT_CHANNEL]` small.

My only niggle is that all this stdout data builds up in `WSClient._all` and won't be drained unless I call `WSClient.read_all()` (and just discard the output) inside my loop, which I'll admit is a very simple workaround. I was hoping that an even more elegant solution would be to create the `WSClient` with `capture_all=False`, so that I wouldn't need to worry about draining `WSClient._all`.

Apologies if I've misunderstood what you were getting at!

craigwalton-dsit commented 1 hour ago

Correction: calling `WSClient.read_all()` inside my loop would also clear `WSClient._channels[ERROR_CHANNEL]` (which will contain the return code), meaning I couldn't use `.returncode`.
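The pitfall can be illustrated with another toy model. It assumes, as described in the correction above, that `read_all()` resets the per-channel buffers along with the combined one; `ToyClient`, `feed` and `status` are hypothetical names and the channel numbers are chosen for the sketch.

```python
import io

STDOUT_CHANNEL = 1  # toy values for this sketch
ERROR_CHANNEL = 3


class ToyClient:
    """Toy model where read_all() resets every buffer, as described above."""

    def __init__(self):
        self._channels = {}
        self._all = io.StringIO()

    def feed(self, channel, data):
        if channel == STDOUT_CHANNEL:
            self._all.write(data)
        self._channels[channel] = self._channels.get(channel, "") + data

    def read_all(self):
        out = self._all.getvalue()
        self._all = io.StringIO()
        self._channels = {}  # the error channel is lost here too
        return out

    def status(self):
        # Stand-in for reading the exit status from the error channel.
        return self._channels.get(ERROR_CHANNEL)


client = ToyClient()
client.feed(STDOUT_CHANNEL, "file contents")
client.feed(ERROR_CHANNEL, '{"status": "Success"}')
status_before = client.status()
drained = client.read_all()  # the "drain and discard" workaround
status_after = client.status()
```

After the drain, `status_after` is `None` in this toy: the exit status arrived but was discarded together with the captured output.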

kubernetes-client / python

Allow `capture_all` to be passed to `stream()` #2302