champtar opened 6 months ago
Hi @champtar. I get the same or similar times when running in different RHELs and Fedoras, with different journal sizes. What can you tell us about the machine where these commands were run?
Innovative idea, I like that approach of thinking. Comparing it with the call flow of a command execution (`sizelimit` parameter there): instead of a single `journalctl` call, we would have to "concatenate" individual commands (`journalctl --reverse` and `head` and `tac`) invoked via `Popen` (or use some different mechanism). And we would have to ensure this fragile concatenation always works (there have been use cases where various commands got stuck in various ways and `sos` had to cope with that without getting stuck). I am a bit afraid most implementations would suffer from complexity or insufficient robustness due to the above facts. If somebody does see a good implementation, I would definitely be interested in it.
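(For illustration only, a minimal sketch of what such a `Popen` concatenation could look like; the manual pipe wiring and the missing timeout handling are exactly where the fragility described above lives. Nothing here is sos API.)

```python
import subprocess

# Sketch of journalctl --reverse | head -c 100M | tac wired together via Popen.
# Each pipe is a point where a stuck process could hang the whole collection,
# which is the robustness concern raised above.
p1 = subprocess.Popen(["journalctl", "--reverse"], stdout=subprocess.PIPE)
p2 = subprocess.Popen(["head", "-c", "104857600"],
                      stdin=p1.stdout, stdout=subprocess.PIPE)
p3 = subprocess.Popen(["tac"], stdin=p2.stdout, stdout=subprocess.PIPE)
# Close our copies of the intermediate pipe ends so that journalctl and head
# see SIGPIPE/EOF when their downstream consumer exits.
p1.stdout.close()
p2.stdout.close()
output, _ = p3.communicate()  # a real implementation would add a timeout here
```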
Unless I'm mistaken, using `tail` or `head` after calling `journalctl` will still load the whole file, which may be a waste. We already have `--size`; wouldn't that work better than having to use `tail` or `head` on the full file? Assuming that the time to capture the log is proportional to its size, we could capture the last 24 or 12 hours of logs and then apply size limits on top of that. In other words (a rough sketch follows below):
- `--all-logs` gets everything, we don't need to filter.
- Normal execution without `--all-logs`:
  - If the journal log size is bigger than 1.5G (random number here, we can change it): capture the last 24 hours of logs and apply a size limit of 500M or whatever we decide.
  - Else: capture the whole file.
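To make the idea concrete, here is a rough sketch of that heuristic. The thresholds are the placeholder numbers above, and the `journalctl --disk-usage` parsing is an assumption about its output wording, not a tested implementation:

```python
import re
import subprocess

# Placeholder thresholds from the comment above.
JOURNAL_SIZE_THRESHOLD = 1.5 * 1024**3   # 1.5G
_UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}

def journal_disk_usage() -> float:
    """Parse `journalctl --disk-usage` output such as
    'Archived and active journals take up 2.3G in the file system.'
    into a byte count (the exact wording is an assumption)."""
    out = subprocess.run(["journalctl", "--disk-usage"],
                         capture_output=True, text=True).stdout
    m = re.search(r"([0-9.]+)([KMGT])", out)
    return float(m.group(1)) * _UNITS[m.group(2)] if m else 0.0

def journal_cmd(all_logs: bool) -> list[str]:
    # --all-logs or a small journal: capture everything.
    if all_logs or journal_disk_usage() <= JOURNAL_SIZE_THRESHOLD:
        return ["journalctl"]
    # Big journal: last 24 hours only; the 500M size limit would then be
    # applied on top of this output by the existing size-limit mechanism.
    return ["journalctl", "--since", "-24h"]
```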
@jcastill
> I get the same or similar times when running in different RHELs and Fedoras, with different journal sizes. What can you tell us about the machine where these commands were run?
I'm extremely surprised by your results; my tests always show differences. This run was on an Alma 9.3 server, bare metal, SSD, XFS, but with affinity limited to 1 core.
Some other numbers (running each command twice, showing the second run):
> Unless I'm mistaken, using `tail` or `head` after calling `journalctl` will still load the whole file
You are mistaken :)
I don't know how `--reverse` is implemented, but `head` will close its stdin as soon as it reaches the limit, so `journalctl` will exit; thus it'll load only the latest journal chunks, not all of them. Using `tail` you must read everything from disk and decompress it; the difference can be huge.
@pmoravec
For the implementation, the quick and dirty way is to pass `sh -c "journalctl --reverse | head -c 100m | tac"` as the command, along with a suggested name.
Right now `sizelimit` is doing a tail, keeping the whole output in memory, and if we hit the timeout we are left with the earliest logs, which might have been excluded anyway with a larger timeout.
We could do it in 2 steps:

1) run `journalctl --reverse` with a head limit (using `head` or a python implementation), writing to a temp file
2) reverse the temp file (using `tac` or a python implementation)

If we time out in step 1, we have the latest logs, which is often what we want, and we don't need to have the full logs in memory at any point.
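A minimal Python sketch of those two steps, assuming a 100M limit and leaving out timeout and error handling; `export_journal` is a made-up name, not sos code:

```python
import subprocess
import tempfile

LIMIT = 100 * 1024**2  # illustrative head-style byte limit

def export_journal(dest_path: str) -> None:
    with tempfile.TemporaryFile() as tmp:
        # Step 1: newest-first capture, stopping once LIMIT bytes are written.
        # On a timeout here, the temp file already holds the most recent logs.
        proc = subprocess.Popen(["journalctl", "--reverse"],
                                stdout=subprocess.PIPE)
        written = 0
        for line in proc.stdout:
            tmp.write(line)
            written += len(line)
            if written >= LIMIT:
                proc.kill()  # python stand-in for head closing the pipe
                break
        proc.wait()
        # Step 2: reverse the temp file with tac so the final output is in
        # chronological order, without ever holding it all in memory.
        tmp.seek(0)
        with open(dest_path, "wb") as out:
            subprocess.run(["tac"], stdin=tmp, stdout=out, check=True)
```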
I've played a bit with the journal cursor, but it's slower (we need `-n` when using `--cursor-file`):
```bash
#!/bin/bash
cursor=$(mktemp cursor.XXXXXXXXXX)
logsize=0
while [ "$logsize" -lt 104857600 ]; do
    prevcursor="$(<"$cursor")"
    # Export the next 1000 entries (newest first) and count the bytes.
    ((logsize+=$(journalctl --reverse --cursor-file="$cursor" -n 1000 | wc -c)))
    # Stop if the cursor did not move (we reached the start of the journal).
    [ "$prevcursor" == "$(<"$cursor")" ] && break
done
# Re-export from the saved cursor onward, in chronological order this time.
journalctl --cursor-file="$cursor"
rm -f "$cursor"
```
```
# time sh -c "journalctl --reverse | head -c 100m > sgol; tac sgol > logs; rm -f sgol"

real    0m8.526s
user    0m8.268s
sys     0m0.360s

# time ./export-journal.sh > logs

real    0m22.452s
user    0m18.495s
sys     0m4.524s
```
While generating sos reports, journal exports take pretty long. Using `journalctl --reverse` we can get pretty significant speedups.