sosreport / sos

A unified tool for collecting system logs and other debug information
http://sos.rtfd.org
GNU General Public License v2.0

[RFE] speedup journal export #3615

Open champtar opened 6 months ago

champtar commented 6 months ago

While generating sos reports, the journal export takes a pretty long time. Using journalctl --reverse we can get a pretty significant speedup:

# time sh -c "journalctl | tail -c 100m > /dev/null"
real    1m55.815s
user    1m48.612s
sys 0m7.896s

# time sh -c "journalctl --reverse | head -c 100m | tac > /dev/null"
real    0m8.674s
user    0m8.399s
sys 0m0.432s

# journalctl --disk-usage 
Archived and active journals take up 3.9G in the file system.

jcastill commented 6 months ago

Hi @champtar . I get the same or similar times when running in different RHELs and Fedoras, with different journal sizes. What can you tell us about the machine where these commands were run?

pmoravec commented 6 months ago

Innovative idea, I like that way of thinking. Comparing it with the call flow of a command execution:

I am a bit afraid most implementations would suffer from complexity or insufficient robustness due to the above facts. If somebody does see a good implementation, I would definitely be interested in it.

jcastill commented 6 months ago

Unless I'm mistaken, using tail or head after calling journalctl will still load the whole file, which may be a waste. We already have --size; wouldn't that work better than having to use 'tail' or 'head' on the full output? Assuming that the time to capture the log is proportional to its size, we could capture the last 24 or 12 hours of logs and then apply size limits on top of that. In other words:

- --all-logs gets everything, so we don't need to filter.
- Normal execution without --all-logs:
  - If the journal size is bigger than 1.5G (random number here, we can change it): capture the last 24 hours of logs and apply a size limit of 500M or whatever we decide.
  - Else: capture the whole file.
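
Roughly, the branching could look like this (purely illustrative: journal_cmd() and its journal_bytes argument are made-up names, not sos API, and the caller would need to obtain the journal size itself, e.g. from journalctl --disk-usage):

```python
# Hedged sketch of the branching above; the thresholds are the placeholder
# numbers from the comment, not sos defaults.
THRESHOLD = int(1.5 * 1024 ** 3)   # 1.5G

def journal_cmd(all_logs, journal_bytes):
    if all_logs:
        return ["journalctl"]            # --all-logs: no filtering at all
    if journal_bytes > THRESHOLD:
        # Large journal: last 24 hours only; the usual size limit
        # (e.g. 500M) would still be applied on top by the caller.
        return ["journalctl", "--since=-24h"]
    return ["journalctl"]                # small journal: capture the whole thing
```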

champtar commented 6 months ago

@jcastill

I get the same or similar times when running in different RHELs and Fedoras, with different journal sizes. What can you tell us about the machine where these commands were run?

I'm extremely surprised by your results; my tests always show differences. This run was on an Alma 9.3 server, bare metal, SSD, XFS, but with CPU affinity limited to 1 core.

Some other numbers (running each command twice, showing the second run):

On a small Fedora 39 VPS

```
# time sh -c "journalctl | tail -c 100m > /dev/null"
real 0m46,968s
user 0m36,875s
sys 0m9,878s

# time sh -c "journalctl --reverse | head -c 100m | tac > /dev/null"
real 0m18,551s
user 0m14,397s
sys 0m4,070s

# journalctl --disk-usage
Archived and active journals take up 886.7M in the file system.
```

On an 8 year old laptop, Fedora 39, SSD, encrypted disk, ext4

```
$ time sh -c "journalctl | tail -c 100m > /dev/null"
real 19m13,850s
user 10m36,594s
sys 8m33,288s

$ time sh -c "journalctl --reverse | head -c 100m | tac > /dev/null"
real 5m33,438s
user 3m4,797s
sys 2m28,186s

$ journalctl --disk-usage
Archived and active journals take up 1.3G in the file system.
```

On an old desktop acting as a NAS, SSD

```
# time sh -c "journalctl | tail -c 100m > /dev/null"
real 10m21,551s
user 5m0,823s
sys 5m20,703s

# time sh -c "journalctl --reverse | head -c 100m | tac > /dev/null"
real 2m19,282s
user 1m7,450s
sys 1m12,177s

# journalctl --disk-usage
Archived and active journals take up 3.3G in the file system.
```

Unless I'm mistaken, using tail or head after calling journalctl will still load the whole file

You are mistaken :) I don't know how --reverse is implemented, but head will close its stdin as soon as it reaches the limit, so journalctl will exit; thus it loads only the latest journal chunks, not all of them. With tail you must read everything from disk and decompress it, so the difference can be huge.
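
To illustrate the effect (just a sketch, not sos code): reading only a bounded amount in Python and then closing the pipe gives journalctl the same early exit as head does.

```python
import subprocess

# Sketch only: read a bounded amount from `journalctl --reverse`, then close
# the pipe. journalctl gets EPIPE/SIGPIPE on its next write and exits early,
# so the older (compressed) journal files are never read or decompressed.
LIMIT = 100 * 1024 * 1024          # 100 MiB, mirroring `head -c 100m`

proc = subprocess.Popen(["journalctl", "--reverse"], stdout=subprocess.PIPE)
chunks, total = [], 0
while total < LIMIT:
    chunk = proc.stdout.read(64 * 1024)
    if not chunk:
        break
    chunks.append(chunk)
    total += len(chunk)
proc.stdout.close()                # producer terminates on the broken pipe
proc.wait()
newest_first = b"".join(chunks)    # still newest-first; reverse to restore order
```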

@pmoravec

For the implementation, the quick and dirty way is to pass sh -c "journalctl --reverse | head -c 100m | tac" as the command, along with a suggested filename.
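
Something along these lines (a sketch only: the class name, plugin_name and suggested filename are made up, and it assumes the existing Plugin.add_cmd_output API with suggest_filename):

```python
from sos.report.plugins import Plugin, IndependentPlugin

# Hypothetical plugin sketch wiring in the "quick and dirty" pipeline above.
class JournalReversed(Plugin, IndependentPlugin):

    short_desc = 'Journal export via --reverse + head'
    plugin_name = 'journal_reversed'

    def setup(self):
        self.add_cmd_output(
            'sh -c "journalctl --reverse | head -c 100m | tac"',
            suggest_filename='journalctl_reversed'
        )
```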

Right now sizelimit is doing a tail, keeping the whole output in memory, and if we hit the timeout we are left with the earliest logs, which might have been excluded anyway had the timeout been larger.

We could do it in 2 steps:

1. run journalctl --reverse with a head limit (using head or a Python implementation), writing to a temp file
2. reverse the temp file (using tac or a Python implementation)

If we time out in step 1, we have the latest logs, which is often what we want, and we never need to hold the full logs in memory at any point.
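
As a rough illustration of those 2 steps (export_journal() and the 100 MiB limit are made up for the example; this is not sos's command machinery):

```python
import subprocess
import tempfile

SIZE_LIMIT = 100 * 1024 * 1024     # example limit, like `head -c 100m`

def export_journal(dest_path, limit=SIZE_LIMIT):
    with tempfile.TemporaryFile() as tmp:
        # Step 1: stream newest-first output into a temp file, stopping at
        # the size limit. On a timeout here we still hold the *latest* logs.
        proc = subprocess.Popen(["journalctl", "--reverse"],
                                stdout=subprocess.PIPE)
        written = 0
        for line in proc.stdout:
            tmp.write(line)
            written += len(line)
            if written >= limit:
                break
        proc.stdout.close()        # journalctl exits early on the broken pipe
        proc.wait()

        # Step 2: re-reverse the (already size-bounded) temp file so the
        # result is chronological, like plain `journalctl` output.
        tmp.seek(0)
        lines = tmp.readlines()    # bounded by `limit`, so memory use is capped
        with open(dest_path, "wb") as out:
            out.writelines(reversed(lines))
```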

champtar commented 6 months ago

I've played a bit with the journal cursor, but it's slower (we need -n when using --cursor-file):

#!/bin/bash

# Walk the journal backwards in 1000-entry batches, advancing a cursor file,
# until ~100 MiB (104857600 bytes) of log text has been counted.
cursor=$(mktemp cursor.XXXXXXXXXX)

logsize=0
while [ "$logsize" -lt 104857600 ]
do
  prevcursor="$(<"$cursor")"
  ((logsize+=$(journalctl --reverse --cursor-file="$cursor" -n 1000 | wc -c)))
  # Stop if the cursor did not move, i.e. we reached the oldest entry.
  [ "$prevcursor" == "$(<"$cursor")" ] && break
done

# Export everything from the cursor position onwards, in chronological order.
journalctl --cursor-file="$cursor"

rm -f "$cursor"

# time sh -c "journalctl --reverse | head -c 100m > sgol; tac sgol > logs; rm -f sgol"
real    0m8.526s
user    0m8.268s
sys 0m0.360s

# time ./export-journal.sh > logs
real    0m22.452s
user    0m18.495s
sys 0m4.524s