Open reporter4u opened 2 years ago
Hi, I have added this to my triage queue. I'll need to dive into the code and understand what is happening in that part of the code.
Thanks for providing the example and the stack trace. That is super helpful.
I've just run into this issue myself (with the context of using a file rather than BytesIO). The issue is that run expects out_stream (and err_stream) to be text streams rather than bytes streams. I think that the example above would work if you used StringIO rather than BytesIO.
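A minimal sketch of the text-versus-bytes point, using only the standard io module and no Invoke at all. The helper name try_write_text is made up for illustration; it just attempts the kind of str write that a text-oriented producer would perform.

```python
import io


def try_write_text(stream):
    """Write a str to `stream`; return None on success, else the error text.

    Hypothetical helper, only here to contrast the two buffer types.
    """
    try:
        stream.write("foo\n")
    except TypeError as exc:
        return str(exc)
    return None


# A text buffer accepts str, a bytes buffer rejects it with TypeError.
text_result = try_write_text(io.StringIO())
bytes_result = try_write_text(io.BytesIO())

print("StringIO:", text_result)
print("BytesIO: ", bytes_result)
```

If the producer only ever writes str, a BytesIO target fails on the first write, which matches the TypeError reported in this thread.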
I tried to use StringIO before BytesIO and it didn't work either, with asynchronous set to either True or False (I need asynchronous set to True).
You can try this by replacing the line out = io.BytesIO() with out = io.StringIO(); as you'll see, it won't print anything.
When opening this issue I posted the example with io.BytesIO because of the error TypeError: a bytes-like object is required, not 'str'.
AFAIK both {out,err}_stream and sys.stdout are file-like objects.
This is my environment: OS: SMP Debian 5.10.113-1; Distribution: Debian 11; Python: 3.9.2; Invoke: 1.6.0
Can you try it in your python environment?
PS. It likely doesn't mean anything, but I also tried changing the warn and pty parameters, without any positive effect.
I've just looked at this a bit more, and the issue is that calling readline on the StringIO won't work as expected because writing to the stream moves the stream position forward (so calling readline won't return anything). For example:
In [1]: import io
In [2]: buffer = io.StringIO()
In [3]: buffer.write("test")
Out[3]: 4
In [4]: buffer.readline()
Out[4]: ''
I'm not sure how to read from the same stream as you are writing it, but from a quick google this thread has a few options.
In your example you should use the method seek() before readline():
Python 3.9.2 (default, Feb 28 2021, 17:03:44)
[GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import io
>>> output = io.StringIO()
>>> output.write('First line.\n')
12
>>> print('Second line.', file=output)
>>> output.seek(0)
0
>>> output.readline()
'First line.\n'
>>> output.readline()
'Second line.\n'
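One caveat with seek(): a StringIO has a single position shared by reads and writes, so after seek(0) a later write lands wherever the reads left off instead of necessarily at the end. A small sketch of one way around that in single-threaded code is to remember the read offset separately and restore the position after each read. The helper readline_at is a made-up name, and this version is not safe if another thread writes between the seek calls.

```python
import io


def readline_at(stream, offset):
    """Read one line starting at `offset`, then restore the old position.

    Hypothetical helper; not thread-safe, for illustration only.
    """
    saved = stream.tell()        # where the next write would land
    stream.seek(offset)
    line = stream.readline()
    next_offset = stream.tell()  # where the next read should start
    stream.seek(saved)           # put the position back for the writer
    return line, next_offset


buffer = io.StringIO()
buffer.write("First line.\n")
first, pos = readline_at(buffer, 0)

buffer.write("Second line.\n")   # still appended at the end
second, pos = readline_at(buffer, pos)

print(repr(first), repr(second))
```

Because the position is restored after every read, the second write is appended rather than overwriting earlier content.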
"I'm not sure how to read from the same stream as you are writing it"
That is exactly what I need.
StringIO versus BytesIO aside, my main goal is to have a sort of in-memory buffer that I can read from line by line while run with asynchronous=True is writing into it, since my process runs for a long time, even hours.
With io.StringIO and asynchronous=True, it seems that run stops writing into out_stream when the script starts to read from it. This is another example, with my script test.py inside a folder with 4 other dummy files:
import io
import time
from invoke import run

out = io.StringIO()
# Start the command in the background; its output is written to `out`.
run('echo "foo" && ls -la && sleep 3 && echo "bar" && ls -la', warn=True, pty=True, out_stream=out, asynchronous=True)
time.sleep(1)
out.seek(0)
for i in range(1, 19):
    myline = out.readline()
    print(f'line {i}: {myline}')
print('\n\n\n#out.getvalue() content')
print(out.getvalue())
out.close()
As you can see, the command output is incomplete both after the for loop and in the print(out.getvalue()) call. This is what it prints:
line 1: foo
line 2: totale 12
line 3: drwxr-xr-x 2 test test 4096 11 mag 11.47 .
line 4: drwxr-xr-x 15 test test 4096 11 mag 07.54 ..
line 5: -rw-r--r-- 1 test test 0 11 mag 07.55 file1
line 6: -rw-r--r-- 1 test test 0 11 mag 07.55 file2
line 7: -rw-r--r-- 1 test test 0 11 mag 07.55 file3
line 8: -rw-r--r-- 1 test test 0 11 mag 07.55 file4
line 9: -rw-r--r-- 1 test test 385 11 mag 11.47 test.py
line 10:
line 11:
line 12:
line 13:
line 14:
line 15:
line 16:
line 17:
line 18:
#out.getvalue() content
foo
totale 12
drwxr-xr-x 2 test test 4096 11 mag 11.47 .
drwxr-xr-x 15 test test 4096 11 mag 07.54 ..
-rw-r--r-- 1 test test 0 11 mag 07.55 file1
-rw-r--r-- 1 test test 0 11 mag 07.55 file2
-rw-r--r-- 1 test test 0 11 mag 07.55 file3
-rw-r--r-- 1 test test 0 11 mag 07.55 file4
-rw-r--r-- 1 test test 385 11 mag 11.47 test.py
Now, if you increase the delay to time.sleep(15), it prints the complete command output, likely because the command terminates before the sleep expires, and therefore before the script starts reading from out_stream. This is the output:
line 1: foo
line 2: totale 12
line 3: drwxr-xr-x 2 test test 4096 11 mag 11.51 .
line 4: drwxr-xr-x 15 test test 4096 11 mag 07.54 ..
line 5: -rw-r--r-- 1 test test 0 11 mag 07.55 file1
line 6: -rw-r--r-- 1 test test 0 11 mag 07.55 file2
line 7: -rw-r--r-- 1 test test 0 11 mag 07.55 file3
line 8: -rw-r--r-- 1 test test 0 11 mag 07.55 file4
line 9: -rw-r--r-- 1 test test 385 11 mag 11.51 test.py
line 10: bar
line 11: totale 12
line 12: drwxr-xr-x 2 test test 4096 11 mag 11.51 .
line 13: drwxr-xr-x 15 test test 4096 11 mag 07.54 ..
line 14: -rw-r--r-- 1 test test 0 11 mag 07.55 file1
line 15: -rw-r--r-- 1 test test 0 11 mag 07.55 file2
line 16: -rw-r--r-- 1 test test 0 11 mag 07.55 file3
line 17: -rw-r--r-- 1 test test 0 11 mag 07.55 file4
line 18: -rw-r--r-- 1 test test 385 11 mag 11.51 test.py
#out.getvalue() content
foo
totale 12
drwxr-xr-x 2 test test 4096 11 mag 11.51 .
drwxr-xr-x 15 test test 4096 11 mag 07.54 ..
-rw-r--r-- 1 test test 0 11 mag 07.55 file1
-rw-r--r-- 1 test test 0 11 mag 07.55 file2
-rw-r--r-- 1 test test 0 11 mag 07.55 file3
-rw-r--r-- 1 test test 0 11 mag 07.55 file4
-rw-r--r-- 1 test test 385 11 mag 11.51 test.py
bar
totale 12
drwxr-xr-x 2 test test 4096 11 mag 11.51 .
drwxr-xr-x 15 test test 4096 11 mag 07.54 ..
-rw-r--r-- 1 test test 0 11 mag 07.55 file1
-rw-r--r-- 1 test test 0 11 mag 07.55 file2
-rw-r--r-- 1 test test 0 11 mag 07.55 file3
-rw-r--r-- 1 test test 0 11 mag 07.55 file4
-rw-r--r-- 1 test test 385 11 mag 11.51 test.py
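The two runs above look consistent with a timing explanation: the reading loop finishes almost immediately, so anything the command writes afterwards is never read. A hedged sketch of one way to keep reading while output is still arriving is to poll getvalue() and track how much has already been consumed, which leaves the stream position alone. Everything here is an assumption for illustration: LineTail and fake_command are made-up names, and a plain thread stands in for the background command, so Invoke is not involved.

```python
import io
import threading
import time


class LineTail:
    """Return complete new lines from a StringIO without moving its position.

    Hypothetical helper for illustration only.
    """

    def __init__(self, stream):
        self.stream = stream
        self.offset = 0  # number of characters already consumed

    def read_new_lines(self):
        # getvalue() returns the whole buffer and does not touch the stream
        # position, so the writer keeps appending at the end.
        data = self.stream.getvalue()
        end = data.rfind("\n") + 1  # consume only up to the last full line
        if end <= self.offset:
            return []
        chunk = data[self.offset:end]
        self.offset = end
        return chunk.splitlines()


def fake_command(stream):
    # Stand-in for a long-running command: write, pause, write again.
    stream.write("foo\n")
    time.sleep(0.3)
    stream.write("bar\n")


out = io.StringIO()
writer = threading.Thread(target=fake_command, args=(out,))
writer.start()

tail = LineTail(out)
collected = []
while writer.is_alive():
    collected.extend(tail.read_new_lines())
    time.sleep(0.05)
writer.join()
collected.extend(tail.read_new_lines())  # drain whatever arrived last

print(collected)
```

The loop ends only once the writer has finished, and the buffer is never closed while the writer may still be using it. With Invoke, the object returned by run(..., asynchronous=True) would presumably take the place of the thread here, but that part is untested in this sketch.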
Hi, I'm trying to capture and process in real time the output of a command started with run in asynchronous mode, with the out_stream option using an in-memory buffer. So I've used the built-in io library with the BytesIO class: is this class supported in your Invoke implementation? I post the following code example:
If you run this code, as you can see, it won't print any binary data even though the readline() method is supported by the BytesIO class. It seems the run method is not writing to the out binary (buffer) object. Furthermore, if you set asynchronous to False the script exits with this error:
AFAIK io.BytesIO() is not a 'str' object; it's a binary stream using an in-memory bytes buffer.
Is it possible to use a BytesIO type (for binary data) as a buffer container of {out,err}_stream? If not, is it possible to implement it in Invoke, in order to process the output stream in real time (not when the command, that is very long in my project, has finished) without using filesystem (file-like) objects?
Thank you in advance for your help!
Roberto