Closed ron003 closed 3 years ago
Hi @ron003, Thanks for looking into this. Indeed a python script seems the easiest way to achieve the behaviour we want and be independent by any system tool.
I quickly googled around how people re-implement tee
in python, if they have to. This stackoverflow answer looks interesting:
class RunOutput():
def __init__(self, returncode, stdout, stderr):
self.returncode = returncode
self.stdout = stdout
self.stderr = stderr
async def _read_stream(stream, callback):
while True:
line = await stream.readline()
if line:
callback(line)
else:
break
async def _stream_subprocess(cmd, stdin=None, quiet=False, echo=False) -> RunOutput:
if isWindows():
platform_settings = {'env': os.environ}
else:
platform_settings = {'executable': '/bin/bash'}
if echo:
print(cmd)
p = await asyncio.create_subprocess_shell(cmd,
stdin=stdin,
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE,
**platform_settings)
out = []
err = []
def tee(line, sink, pipe, label=""):
line = line.decode('utf-8').rstrip()
sink.append(line)
if not quiet:
print(label, line, file=pipe)
await asyncio.wait([
_read_stream(p.stdout, lambda l: tee(l, out, sys.stdout)),
_read_stream(p.stderr, lambda l: tee(l, err, sys.stderr, label="ERR:")),
])
return RunOutput(await p.wait(), out, err)
def run(cmd, stdin=None, quiet=False, echo=False) -> RunOutput:
loop = asyncio.get_event_loop()
result = loop.run_until_complete(
_stream_subprocess(cmd, stdin=stdin, quiet=quiet, echo=echo)
)
return result
I don't think we should worry about tee. It's in the coreutils
rpm (along with other utilities used: cat, date, dirname, mv, readlink, and wc). And tee does such a focused thing. For the specific cmake/ninja issue, I think we should break the issue into 3 pieces (pty-filter-tee):
I think the next focus should be on the filter. When cmake/ninja is doing color, it is also doing '\r' thing and we should decide exactly what filtering we want to do. For sure emitting lines as they are ready. To test how a filter behaves with buffering (i.e timing), '\r' and '\n' I do the following:
(sleep 3;echo `date +%s`;\
sleep 3;echo `date +%s`;\
sleep 3;echo -ne "`date +%s`\r";\
sleep 3;echo -ne "`date +%s`\r";\
sleep 3;echo -e "\n`date +%s`";\
sleep 3;echo `date +%s`;\
sleep 3) | sed -e 's/\r/\n/g'
Replace sed with other filters. We should be able to run without filtering and see color first -- teeing to a log file. Then run the filter on the log file to see color and more lines.
After a bit of tinkering, the easiest option appears to be the pexpect
python module, which provides both the terminal emulation and filtering capabilities in a very compact form. Will post an example soon.
Here is a truly minimal example using pexpect
#!/usr/bin/env python
import pexpect
cmd='ninja'
cmd='cmake --build . -j 56 --verbose '
fout = open('mylog.txt','w')
child = pexpect.spawn(cmd)
while True:
index = child.expect ([
'\r\n',
'\r',
pexpect.EOF,
pexpect.TIMEOUT
])
# print(index)
if index == 0:
text=child.before.decode('utf-8')
print(text)
print(text, file=fout)
elif index == 1:
if (child.before):
text=child.before.decode('utf-8')
print(text)
print(text, file=fout)
else:
break
It doesn't consume stdin
. Running
pty_pexpect.py
echo hi;
echo there
as a script (/test_py_expect.sh
) results in:
I've been busy with other things -- I've worked on this a little and am still looking at it... The script seems to do what we want in the least amount of lines and is straight foward and understandable. The issues I have are:
/home/ron/work/DUNEPrj
ron@ronlap77 :^) pytee.py echo hi | tail
Traceback (most recent call last):
File "./pytee.py", line 69, in <module>
run()
File "/usr/lib/python3.8/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/usr/lib/python3.8/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/usr/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/lib/python3.8/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "./pytee.py", line 30, in run
cols, rows = os.get_terminal_size()
OSError: [Errno 25] Inappropriate ioctl for device
--2021-03-09_09:58:15_CST--
Pengfei and I have verified that one will not be able to do something like dbt-build.sh --clean --install | tee myout
OR dbt-build.sh --clean -install | less
This is an example of the kind of thing that can happen as a result of # 1.
The "pty issue" and be either handled by a python script that fits into a shell variable or a pypty.py script and the same is true for the "filter issue."
I would rather just be able to push a branch instead of forking.
Hi @ron003,
Here some answers:
pytee.py
(suggestions for a better names are welcome) is not meant to be generic, but exists in support of the daq-buildtools where we control the python environment. click
is already provided, pexpect
will be added in the next release.Cheers,
Alessandro
Using
script
is fundamentally flawed because it sends output to stdout and to a file so the output going to the file can only be filtered inside of the command line given in its command option. When filtering is done inside the command option, the command ahead of the filter (the CMake command in our case) will no longer consider its output to be atty
(which would be the point of usingscript
).Of the other options explored (
socat
andunbuffer
), onlyscript
reads from stdin (apparently assuming the command given to it wants stdin). An undesirable workaround would be to have the command be executed after dbt-build.sh in a paste buffer to be part of a pipeline or escaped command line:In the above
echo there
will be consumed byscript
inside ofdbt-build.sh
.I understand that the other options are not optimal because the rpm (packages) that provides them may not be installed. To that end, I have created to small python scripts which could be added directly to
dbt-build.sh
. For example,script
can be replaced by:and used:
With the above, the output will contain color and a "single progress line" (which will need filtering to turn into multi-line progress). There is a fundamental problem when trying to use
sed
to accomplish this filtering.Sed
is line (that isline feed
and notcarriage return
) oriented. NOTE: the currentsed
including in... ${cmd} |& sed -e 's/\r/\n/g'
is actually not doing anything becausecmake
, being piped, stops sending\r
. In other words, the current script implementation is basically equivalent to${cmd} | tee $build_log
. Of the 3 major filter commands (sed, Perl, awk), only awk can do this because it supports "Record Separator" that can be multiple characters, i.e. bothcarriage return
andline feed
. However, there is more to discuss here, but because this description is getting long, I'll save this for later. Just know, I believe I have a small python script (similar to the above) which I believe will do what is required.Speaking of requirements, here's a list, as I currently see things (and after some discussion with John and Alessandro) (not prioritized):
less
with the -R option handles color)dbt-build.sh
should not "eat" successive lines pasted; don't require a big pipeline.