jtmoon79 / super-speedy-syslog-searcher

Speedily search and merge log messages by datetime
MIT License
38 stars 1 forks source link

support for datetime format with only day time #102

Open jtmoon79 opened 1 year ago

jtmoon79 commented 1 year ago

Summary

Red Hat Enterprise Linux has several files that use only day time, HMS, for log datetime stamp. Mostly from program anaconda.

For example, file /var/log/anaconda/syslog from Red Hat Enterprise 9.1,

$ head "/var/log/anaconda/syslog"
20:47:01,527 NOTICE kernel:Linux version 5.14.0-162.6.1.el9_1.x86_64 (mockbuild@x86-vm-07.build.eng.bos.redhat.com) (gcc (GCC) 11.3.1 20220421 (Red Hat 11.3.1-2), GNU ld version 2.35.2-24.el9) #1 SMP PREEMPT_DYNAMIC Fri Sep 30 07:36:03 EDT 2022
20:47:01,527 NOTICE kernel:The list of certified hardware and cloud instances for Red Hat Enterprise Linux 9 can be viewed at the Red Hat Ecosystem Catalog, https://catalog.redhat.com.
20:47:01,527 INFO kernel:Command line: BOOT_IMAGE=vmlinuz initrd=initrd.img inst.stage2=hd:LABEL=RHEL-9-1-0-BaseOS-x86_64 quiet
20:47:01,527 WARNING kernel:[Firmware Bug]: TSC doesn't count with P0 frequency!
20:47:01,527 INFO kernel:x86/fpu: x87 FPU will use FXSAVE
20:47:01,527 INFO kernel:signal: max sigframe size: 1440
20:47:01,527 INFO kernel:BIOS-provided physical RAM map:
20:47:01,527 INFO kernel:BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
20:47:01,527 INFO kernel:BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
20:47:01,527 INFO kernel:BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
20:47:01,527 INFO kernel:BIOS-e820: [mem 0x0000000000100000-0x000000007ffeffff] usable
20:47:01,527 INFO kernel:BIOS-e820: [mem 0x000000007fff0000-0x000000007fffffff] ACPI data
20:47:01,527 INFO kernel:BIOS-e820: [mem 0x00000000fec00000-0x00000000fec00fff] reserved
20:47:01,527 INFO kernel:BIOS-e820: [mem 0x00000000fee00000-0x00000000fee00fff] reserved
20:47:01,527 INFO kernel:BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
20:47:01,527 INFO kernel:NX (Execute Disable) protection: active
20:47:01,527 INFO kernel:SMBIOS 2.5 present.
20:47:01,527 INFO kernel:DMI: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
20:47:01,527 INFO kernel:Hypervisor detected: KVM
20:47:01,527 INFO kernel:kvm-clock: Using msrs 4b564d01 and 4b564d00
20:47:01,527 INFO kernel:kvm-clock: using sched offset of 203173061208 cycles
20:47:01,527 INFO kernel:clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
20:47:01,527 INFO kernel:tsc: Detected 4491.488 MHz processor
20:47:01,527 DEBUG kernel:e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
20:47:01,527 DEBUG kernel:e820: remove [mem 0x000a0000-0x000fffff] usable
20:47:01,527 INFO kernel:last_pfn = 0x7fff0 max_arch_pfn = 0x400000000
20:47:01,527 INFO kernel:Disabled
20:47:01,527 INFO kernel:x86/PAT: MTRRs disabled, skipping PAT initialization too.
20:47:01,527 INFO kernel:CPU MTRRs all blank - virtualized system.
20:47:01,527 INFO kernel:x86/PAT: Configuration [0-7]: WB  WT  UC- UC  WB  WT  UC- UC
20:47:01,527 INFO kernel:found SMP MP-table at [mem 0x0009fff0-0x0009ffff]
20:47:01,527 INFO kernel:RAMDISK: [mem 0x7a775000-0x7ffeffff]
20:47:01,527 INFO kernel:ACPI: Early table checksum verification disabled
20:47:01,527 INFO kernel:ACPI: RSDP 0x00000000000E0000 000024 (v02 VBOX  )
20:47:01,527 INFO kernel:ACPI: XSDT 0x000000007FFF0030 00003C (v01 VBOX   VBOXXSDT 00000001 ASL  00000061)
20:47:01,527 INFO kernel:ACPI: FACP 0x000000007FFF00F0 0000F4 (v04 VBOX   VBOXFACP 00000001 ASL  00000061)
20:47:01,527 INFO kernel:ACPI: DSDT 0x000000007FFF0610 002353 (v02 VBOX   VBOXBIOS 00000002 INTL 20100528)
20:47:01,527 INFO kernel:ACPI: FACS 0x000000007FFF0200 000040
20:47:01,527 INFO kernel:ACPI: FACS 0x000000007FFF0200 000040
20:47:01,527 INFO kernel:ACPI: APIC 0x000000007FFF0240 00005C (v02 VBOX   VBOXAPIC 00000001 ASL  00000061)
20:47:01,527 INFO kernel:ACPI: SSDT 0x000000007FFF02A0 00036C (v01 VBOX   VBOXCPUT 00000002 INTL 20100528)
20:47:01,527 INFO kernel:ACPI: Reserving FACP table memory at [mem 0x7fff00f0-0x7fff01e3]
20:47:01,527 INFO kernel:ACPI: Reserving DSDT table memory at [mem 0x7fff0610-0x7fff2962]
20:47:01,527 INFO kernel:ACPI: Reserving FACS table memory at [mem 0x7fff0200-0x7fff023f]
20:47:01,527 INFO kernel:ACPI: Reserving FACS table memory at [mem 0x7fff0200-0x7fff023f]
20:47:01,527 INFO kernel:ACPI: Reserving APIC table memory at [mem 0x7fff0240-0x7fff029b]
20:47:01,527 INFO kernel:ACPI: Reserving SSDT table memory at [mem 0x7fff02a0-0x7fff060b]
20:47:01,527 INFO kernel:No NUMA configuration found
20:47:01,527 INFO kernel:Faking a node at [mem 0x0000000000000000-0x000000007ffeffff]
20:47:01,527 INFO kernel:NODE_DATA(0) allocated [mem 0x7a74a000-0x7a774fff]
jtmoon79 commented 1 year ago

I can't imagine any reliable way to derive the date from these messages. Except only if the log file has a modified date from the current day. But even then, jumps backward in time cannot be certain about which day they belong to.

For example, from Cent OS 7 file /var/log/anaconda/ifcfg.log

21:24:50,408 DEBUG ifcfg: IfcfFile.read /etc/sysconfig/network-scripts/ifcfg-enp0s3
17:36:29,766 DEBUG ifcfg: IfcfFile.read /etc/sysconfig/network-scripts/ifcfg-enp0s3

The files modified time is Apr 10 17:36:30 2023. Some guess can be made that the last message corresponds to the file modified time since they both are the same hour, minute, and second. But the earlier message at 21:24:50,408 could be of any date preceding it.

jtmoon79 commented 1 year ago

I review a bunch of files like this one Red Hat Enterprise Linux 9.1 and the timestamp jumps are either very small or very large. This is possible to do.

Re-opening.

This could be done using file Modified Time association (similar to #110 handling offset format).

  1. get file Modified DateTime
  2. pair Modified DateTime to last log message with HMS timestamp
  3. walk backwards the entire file (similar to what is done for datetimestamp without a year) a. if there is a large jump in the timestamp then abandon further processing, print one error, and note error in --summary output

Example files

Files reviewed on Red Hat Enterprise Linux 9.1 were:

/var/log/anaconda/anaconda.log which is challenging because it has messages with a datetimestamp (look for System time set to Mon Apr 10 20:48:33 2023 UTC) and the timestamps jump backwards at that same moment.

$ head /var/log/anaconda/anaconda.log
20:47:09,203 INF core.configuration.product: Loading information about products from /etc/anaconda/product.d.
20:47:09,206 INF core.configuration.product: Found Fedora Workstation Live at /etc/anaconda/product.d/fedora-workstation-live.conf.
20:47:09,206 INF core.configuration.product: Found Fedora Workstation at /etc/anaconda/product.d/fedora-workstation.conf.
20:47:09,206 INF core.configuration.product: Found CentOS Linux at /etc/anaconda/product.d/centos.conf.
20:47:09,206 INF core.configuration.product: Found Red Hat Enterprise Linux at /etc/anaconda/product.d/rhel.conf.
20:47:09,206 INF core.configuration.product: Found Fedora-IoT IoT at /etc/anaconda/product.d/fedora-iot.conf.
20:47:09,206 INF core.configuration.product: Found Fedora Silverblue at /etc/anaconda/product.d/fedora-silverblue.conf.
20:47:09,206 INF core.configuration.product: Found Fedora Server at /etc/anaconda/product.d/fedora-server.conf.
20:47:09,206 INF core.configuration.product: Found Scientific Linux at /etc/anaconda/product.d/scientific-linux.conf.
20:47:09,206 INF core.configuration.product: Found Virtuozzo Linux at /etc/anaconda/product.d/virtuozzo-linux.conf.
...
20:48:33,303 INF lifecycle: Module initialized: StorageSpoke
20:48:33,303 INF threading: Thread Done: AnaStorageWatcher (139679088047680)
20:48:33,317 INF lifecycle: Module initialized: FilterSpoke
20:48:33,323 WRN misc: /usr/share/anaconda/addons/com_redhat_kdump/gui/spokes/kdump.py:92: PyGTKDeprecationWarning: Using positional arguments with the GObject constructor has been deprecated. Please specify keyword(s) for "value, lower, upper, step_increment, page_increment, page_size" or use a class specific constructor. See: https://wiki.gnome.org/PyGObject/InitializerDeprecations
  adjustment = Gtk.Adjustment(lower, lower, upper, step, step, 0)

20:48:33,344 DBG ui.gui.spokes.network: adding device configuration: NetworkDeviceConfiguration(connection_uuid='52c15301-8e87-3036-bc6b-8f5fb7a87dc1', device_name='enp0s3', device_type=1)
20:48:33,344 INF lifecycle: Module initialized: NetworkSpoke
20:48:33,361 INF misc: OSCAP Addon: Defaulting to local content
20:48:33,366 INF threading: Running Thread: OSCAPguiWaitForDataFetchThread (139679088047680)
20:48:33,366 INF misc: OSCAP Addon: started to look at the content
20:48:33,366 WRN misc: /usr/lib64/python3.9/subprocess.py:941: RuntimeWarning: line buffering (buffering=1) isn't supported in binary mode, the default buffer size will be used
  self.stdout = io.open(c2pread, 'rb', bufsize)

20:48:33,376 INF lifecycle: Module initialized: PasswordSpoke
20:48:33,383 DBG ui.gui.hubs: incomplete spokes: [SoftwareSelectionSpoke, StorageSpoke, PasswordSpoke]
20:48:33,405 INF lifecycle: Module initialized: UserSpoke
20:48:33,418 DBG ui.gui.hubs: incomplete spokes: [SoftwareSelectionSpoke, StorageSpoke, PasswordSpoke, UserSpoke]
20:48:33,418 INF lifecycle: Initialization of all modules (12) has been started.
20:48:33,418 DBG ui.gui.hubs: Starting event loop for hub SummaryHub
20:48:33,427 DBG ui.common: Entered hub: SummaryHub
20:48:33,427 DBG ui.gui.utils: Finished applying <bound method AddLayoutDialog._addLayout of <pyanaconda.ui.gui.spokes.keyboard.AddLayoutDialog object at 0x7f09ac224a90>> on <list object at 0x7f09a72b9580>
20:48:33,427 INF threading: Thread Done: AnaAddLayoutsInitThread (139679104833088)
20:48:33,467 INF lifecycle: Module initialized: KeyboardSpoke
20:48:33,467 INF threading: Thread Done: AnaKeyboardThread (139679450003008)
16:48:33,001 INF isys: System time set to Mon Apr 10 20:48:33 2023 UTC
16:48:33,000 INF isys: System time set to Mon Apr 10 20:48:33 2023 UTC
16:48:33,000 INF isys: System time set to Mon Apr 10 20:48:33 2023 UTC
16:48:33,000 INF lifecycle: Module initialized: DatetimeSpoke
16:48:33,000 INF threading: Thread Done: AnaDateTimeThread (139679531255360)
16:48:33,007 DBG ui.gui.hubs: spoke is not ready: KeyboardSpoke
16:48:33,007 DBG ui.gui.hubs: setting KeyboardSpoke status to: Getting list of layouts...
16:48:33,017 DBG ui.gui.hubs: incomplete spokes: [StorageSpoke, PasswordSpoke, UserSpoke]
16:48:33,017 DBG ui.gui.hubs: spoke is not ready: SoftwareSelectionSpoke
16:48:33,017 DBG ui.gui.hubs: spoke is not ready: SourceSpoke
16:48:33,017 DBG ui.gui.hubs: setting SourceSpoke status to: Setting up installation source...
16:48:33,017 DBG ui.gui.hubs: setting SourceSpoke status to: Probing storage...
16:48:33,017 DBG ui.gui.hubs: setting SourceSpoke status to: Downloading package metadata...
16:48:33,017 DBG ui.gui.hubs: setting SourceSpoke status to: Downloading group metadata...
16:48:33,017 DBG ui.gui.hubs: spoke is ready: SoftwareSelectionSpoke
16:48:33,042 DBG ui.gui.hubs: spoke is ready: SourceSpoke
16:48:33,042 DBG ui.gui.hubs: setting SoftwareSelectionSpoke status to: Downloading package metadata...
16:48:33,042 DBG ui.gui.hubs: setting SoftwareSelectionSpoke status to: Downloading group metadata...
16:48:33,042 DBG ui.gui.hubs: spoke is not ready: SoftwareSelectionSpoke
16:48:33,062 DBG ui.gui.hubs: spoke is not ready: SourceSpoke
16:48:33,062 DBG ui.gui.hubs: setting SoftwareSelectionSpoke status to: Checking software dependencies...
16:48:33,062 DBG ui.gui.hubs: setting StorageSpoke status to: Probing storage...
16:48:33,062 DBG ui.gui.hubs: setting StorageSpoke status to: Probing storage...
16:48:33,075 DBG ui.gui.hubs: spoke is ready: SoftwareSelectionSpoke
16:48:33,083 DBG ui.gui.hubs: spoke is ready: SourceSpoke
16:48:33,083 DBG ui.gui.hubs: spoke is ready: SoftwareSelectionSpoke
16:48:33,100 DBG ui.gui.hubs: spoke is ready: StorageSpoke
16:48:33,100 DBG ui.gui.hubs: setting OSCAPSpoke status to: Fetching content data

Timestamp jumps

The timestamps seem to strongly cluster. Many messages within a few minutes then none for hours. This means timestamps that are "near" each other are very likely to be from the same date (or preceding date if it cross 00:00:00 HMS boundary). This tendency could be used to determine 3a. above.

For example, in /var/log/anaconda/syslog the first log message is at 20:47:01,527 and the last 20:57:03,141.

In /var/log/anaconda/anaconda.log the first log message is at 20:47:09,203 then many message within the next 60 seconds. Then a transition from timestamp 20:48:33,467 to 16:48:33,001. The last log message is at 16:57:03,146. There is a smaller gap from 16:52:08,659 to 16:56:17,417.

Helpful command

$ cat /var/log/anaconda/syslog | grep -oEe '^[[:digit:]]{2}:[[:digit:]]{2}:[[:digit:]]{2},[[:digit:]]{3}' | uniq