Open kbr-scylla opened 1 year ago
In fact, I think recall that in the past, SCT artifacts contained original journald Scylla logs, no? But nowadays we no longer have them. Maybe I recall wrong. But this is what I need -- original Scylla logs, with timestamps assigned by Scylla using Scylla machine clock at the moment of logging the event, not some arbitrary time later when the log message is sent to a remote machine.
It looks like at some point we switched from rsyslog to syslog-ng, maybe I remember how logging worked from rsyslog times?
Is it possible to configure the test I'm running to use rsyslog like in the past?
I found this in configure_syslogng_target_script
:
if grep -P "keep-timestamp\\([^)]+\\)" /etc/syslog-ng/syslog-ng.conf; then
sed -i -r "s/keep-timestamp([ ]*yes[ ]*)/keep-timestamp(no)/g" /etc/syslog-ng/syslog-ng.conf
else
sed -i -r "s/([ \t]*options[ \t]*\\\\{{)/\\\\1\\n keep-timestamp(no);\\n/g" /etc/syslog-ng/syslog-ng.conf
fi
Why are we setting keep-timestamp=no? https://www.syslog-ng.com/technical-documents/doc/syslog-ng-open-source-edition/3.16/administration-guide/keep-timestamp
If disabled, the time of reception will be used instead.
I need all log messages. I cannot afford losing log messages, because it makes debugging borderline impossible. Log messages need to be saved by Scylla process locally on the Scylla machine disk. Then this log file needs to be collected as an artifact
See system.log
inside each node logs bundle.
Sometimes it is not provided (in case node was evicted before test end).
2023-10-07T10:46:17.572+00:00 longevity-ParallelAddDropColumnMonk-db-node-9f0e827b-2 !ERR | syslog-ng[4612]: EOF occurred while idle; fd='37'
Is new, might be related to recent change where we added more precision to timestamps
I found this in
configure_syslogng_target_script
:if grep -P "keep-timestamp\\([^)]+\\)" /etc/syslog-ng/syslog-ng.conf; then sed -i -r "s/keep-timestamp([ ]*yes[ ]*)/keep-timestamp(no)/g" /etc/syslog-ng/syslog-ng.conf else sed -i -r "s/([ \t]*options[ \t]*\\\\{{)/\\\\1\\n keep-timestamp(no);\\n/g" /etc/syslog-ng/syslog-ng.conf fi
Why are we setting keep-timestamp=no? https://www.syslog-ng.com/technical-documents/doc/syslog-ng-open-source-edition/3.16/administration-guide/keep-timestamp
If disabled, the time of reception will be used instead.
@dkropachev do you remember why we set keep-timestamp to 'no'?
See system.log inside each node logs bundle.
Hm, indeed the run I'm looking at now has this, and it looks like what I need more or less, the timestamps are different than the ones in messages.log so I'm guessing they are Scylla timestamps. The only downside being that system.log doesn't have millisecond resolution, can we change it to provide milliseconds like messages.log did?
See system.log inside each node logs bundle.
Hm, indeed the run I'm looking at now has this, and it looks like what I need more or less, the timestamps are different than the ones in messages.log so I'm guessing they are Scylla timestamps. The only downside being that system.log doesn't have millisecond resolution, can we change it to provide milliseconds like messages.log did?
it was recently added #6657
Possibly https://github.com/scylladb/scylla-cluster-tests/pull/6657 introduced regression or this test is somehow specific that led to this error: syslog-ng[4612]: EOF occurred while idle; fd='37'
@kbr-scylla was there problem with too many open file descriptors, network interruptions?
Those usually mean we were sending out too much logs, I.e. Scylla was generating too much logs to be sent via network.
SCT needs the log at runtime where the test is running
We are not gonna change that, we are collecting at the end of the test, but nodes that destroyed we don't (i.e. decommission, removed nodes and such)
We could add log collection to those, and that should be enough
@soyacz
Possibly https://github.com/scylladb/scylla-cluster-tests/pull/6657 introduced regression or this test is somehow specific that led to this error: syslog-ng[4612]: EOF occurred while idle; fd='37'
I have a run from before #6657 which also has this problem: https://jenkins.scylladb.com/job/scylla-staging/job/KamilBraun/job/longevity-schema-changes-3h-test-artsiom_mishuta-clone/6/consoleText https://cloudius-jenkins-test.s3.amazonaws.com/13f60000-7013-4d3c-9973-e8ecf7fb94b8/20230929_194957/db-cluster-13f60000.tar.gz
so #6657 is not related.
The second message just after the EOF one is:
2023-09-29T16:48:54+00:00 longevity-ParallelAddDropColumnMonk-db-node-13f60000-1 !NOTICE | syslog-ng[4613]: Syslog connection closed; fd='37', server='AF_INET(10.12.3.77:32768)', time_reopen='60'
I think it's just transient network failures, which is expected (networks are unreliable), we should probably accept that they happen and deal with them reliably.
@fruch
SCT needs the log at runtime where the test is running
But shouldn't logs be buffered, and the buffer sent once syslog-ng reconnects?
@soyacz
Possibly https://github.com/scylladb/scylla-cluster-tests/pull/6657 introduced regression or this test is somehow specific that led to this error: syslog-ng[4612]: EOF occurred while idle; fd='37'
I have a run from before #6657 which also has this problem: https://jenkins.scylladb.com/job/scylla-staging/job/KamilBraun/job/longevity-schema-changes-3h-test-artsiom_mishuta-clone/6/consoleText https://cloudius-jenkins-test.s3.amazonaws.com/13f60000-7013-4d3c-9973-e8ecf7fb94b8/20230929_194957/db-cluster-13f60000.tar.gz
so #6657 is not related.
The second message just after the EOF one is:
2023-09-29T16:48:54+00:00 longevity-ParallelAddDropColumnMonk-db-node-13f60000-1 !NOTICE | syslog-ng[4613]: Syslog connection closed; fd='37', server='AF_INET(10.12.3.77:32768)', time_reopen='60'
I think it's just transient network failures, which is expected (networks are unreliable), we should probably accept that they happen and deal with them reliably.
@fruch
SCT needs the log at runtime where the test is running
But shouldn't logs be buffered, and the buffer sent once syslog-ng reconnects?
We don't set the buffer to a very big size, We know that in cases of scylla spamming with thousands or more per second those buffers aren't enough.
Maybe we should play a bit and enlarge them, but we should be addressing the scylla issues causing such bursts
TL;DR can I have logs gathered locally from Scylla process locally on the Scylla machine with timestamps assigned by Scylla using Scylla machine clock?
https://github.com/scylladb/scylla-cluster-tests/commit/6558f89b02598dcaeeb639f52ea0e234bd2321c6 is fixing this request.
As to setting original time in 'messages.log'
Why are we setting keep-timestamp=no?
I was thinking about it, and possible reason is to have time synchronized for multi-dc scenarios
Because of that and there's not much we can do about syslog-ng issues in very high traffic circumstances, which usually don't show up, can we consider this issue as closed?
I was thinking about it, and possible reason is to have time synchronized for multi-dc scenarios
Multi-DC makes things even worse for client-side timestamps, the timestamps will be even more desynchronized due to latencies of transmitting the log message.
It may still be better to use server-side (Scylla) timestamps, clocks should be synchronized by NTP anyway which should be much better than adding transmit latency to each message timestamp.
Because of that and there's not much we can do about syslog-ng issues in very high traffic circumstances, which usually don't show up
This issue happens on every run of the SCT job I'm using to debug one issue, and it affected my ability to debug it. (Until I found about system.log, which seems to solve my problem, although I'm worried about system.log being lost in some cases.)
How have you chosen the buffer size? Perhaps it's too small. The nodes are not spamming logs that much that we should start losing them after a 30 second connectivity pause.
I was thinking about it, and possible reason is to have time synchronized for multi-dc scenarios
Multi-DC makes things even worse for client-side timestamps, the timestamps will be even more desynchronized due to latencies of transmitting the log message.
It may still be better to use server-side (Scylla) timestamps, clocks should be synchronized by NTP anyway which should be much better than adding transmit latency to each message timestamp.
Because of that and there's not much we can do about syslog-ng issues in very high traffic circumstances, which usually don't show up
This issue happens on every run of the SCT job I'm using to debug one issue, and it affected my ability to debug it. (Until I found about system.log, which seems to solve my problem, although I'm worried about system.log being lost in some cases.)
I agree that if we need the system.log (for whatever reason, timestamp is only one of the reason we want, we also filter some logs out, i.e. logs that SCT doesn't need, non scylla logs for example)
we do need to make sure that we have it collected also on cases that the test is taking down a node.
How have you chosen the buffer size? Perhaps it's too small. The nodes are not spamming logs that much that we should start losing them after a 30 second connectivity pause.
I don't which calculation was done to select the size, so from my POV it wasn't, and we can re-open it and pick a different probably higher number
given that, I think we should open an issue of each of those, and discuss how to make them happen each on it's own
@soyacz can you raise the issues from above, and put them on our board ?
Another occurrence of the issue, which introduced false positive in the test result during longevity-schema-topology-changes-12h-test.
The test issues nodetool rebuild
command during AddRemoveDc Nemesis and then waits for rebuild.*started
kind of log entries to appear, which usually appear within seconds (in this particular test there were 2 other successful rebuilds, where the log records appeared within 5 seconds). Around the same time syslong losts connection and waits for reconnect for 60s:
│2024-08-30T04:45:19.474+00:00 longevity-parallel-topology-schema--db-node-a507ebff-8 !ERR | syslog-ng[4720]: EOF occurred while idle; fd='16' │
│2024-08-30T04:45:19.474+00:00 longevity-parallel-topology-schema--db-node-a507ebff-8 !NOTICE | syslog-ng[4720]: Syslog connection closed; fd='16', server='AF_INET(10.12.8.168:32768)', time_reopen='60'
The Nemesis waited for the logs record to appear and timed out.
Another occurrence of the issue, which introduced false positive in the test result during longevity-schema-topology-changes-12h-test. The test issues
nodetool rebuild
command during AddRemoveDc Nemesis and then waits forrebuild.*started
kind of log entries to appear, which usually appear within seconds (in this particular test there were 2 other successful rebuilds, where the log records appeared within 5 seconds). Around the same time syslong losts connection and waits for reconnect for 60s:│2024-08-30T04:45:19.474+00:00 longevity-parallel-topology-schema--db-node-a507ebff-8 !ERR | syslog-ng[4720]: EOF occurred while idle; fd='16' │ │2024-08-30T04:45:19.474+00:00 longevity-parallel-topology-schema--db-node-a507ebff-8 !NOTICE | syslog-ng[4720]: Syslog connection closed; fd='16', server='AF_INET(10.12.8.168:32768)', time_reopen='60'
The Nemesis waited for the logs record to appear and timed out.
After the 60s, logs did appear ? I.e. if the test waited 2 minutes more it would find it ? or it wasn't there at all ?
We nowadays have metric to see if we had logs dropped or not (I think it's should be in 2024.1)
With https://github.com/scylladb/scylla-cluster-tests/issues/6688 I'm not sure if it even makes sense to collect messages.log.
Debugging Scylla from SCT run logs is broken because:
TL;DR can I have logs gathered locally from Scylla process locally on the Scylla machine with timestamps assigned by Scylla using Scylla machine clock?
So it looks like messages are gathered by SCT runner machine remotely through
syslog-ng
. The runner connects to Scylla machine and then Scylla sends it logs directly to a socket opened bysyslog-ng
(?) and it looks like timestamps are assigned on the runner machine, by syslog-ng, not by Scylla.This causes logging to be broken and useless. (Ok, it's not completely useless, but much less useful than it could be.)
For example consider these logs: https://cloudius-jenkins-test.s3.amazonaws.com/9f0e827b-8a9a-4767-b4b4-0582d3f9415f/20231007_113814/db-cluster-9f0e827b.tar.gz
Here's one event:
syslog-ng temporarily lost connection to Scylla machine. Then it reconnected, but we lost all messages from the period when the connection was lost.
Also, the timestamps don't make sense. Last timestamp before connection drop:
10:46:17.572
, first timestamp after connection was restored:10:46:17.572
. Those are not real Scylla timestamps. These are timestamps from the SCT runner machine, not from Scylla. I determined by looking at other Scylla events that about 30 seconds have passed between the two events on Scylla machine.I need timestamps assigned by Scylla clock, on the Scylla machine from the moment the Scylla log message is logged, not by syslog-ng from the moment the log message is gathered. The latter doesn't make any sense at all.
another event causing log message loss:
I need all log messages. I cannot afford losing log messages, because it makes debugging borderline impossible. Log messages need to be saved by Scylla process locally on the Scylla machine disk. Then this log file needs to be collected as an artifact. Or it can be collected by syslog-ng, but syslog-ng has to start from the point when it last lost connection, not from current moment in time. And the timestamps need to be assigned by Scylla.
In fact, I think recall that in the past, SCT artifacts contained original journald Scylla logs, no? But nowadays we no longer have them. Maybe I recall wrong. But this is what I need -- original Scylla logs, with timestamps assigned by Scylla using Scylla machine clock at the moment of logging the event, not some arbitrary time later when the log message is sent to a remote machine.
cc @scylladb/qa-maintainers @temichus @soyacz @kostja @avikivity