mlopatkin / andlogview

Tool for viewing application logs from Android devices.
https://andlogview.mlopatkin.name/
Apache License 2.0
43 stars 6 forks source link

Gracefully handle time travels #248

Open mlopatkin opened 2 years ago

mlopatkin commented 2 years ago

There are multiple reasons for the logs to appear "out-of-order":

  1. The timestamp was only "MM-DD" until Android 7.0, so every New Year was causing logs to jump back from December to January
  2. Similar to the previous one, some devices were starting with 01.01.1970 system time and only set up a proper time a bit later, so the entries of 01-01 were visible in the logs.
  3. There are time adjustments throughout the year: daylight saving time, user changing timezones, syncing clock, etc.

The logcat itself before 5.0 was prone to printing log entries out of order, because there were separate log sources for each buffer, and logcat was doing timestamp comparison to determine which one to output first. It was solved for the logcat with the migration to a new "syslog" subsystem that stores log entries sequentially.

Andlogview has to collect logs separately in several concurrent logcat invocations, to properly assign buffer names to log entries (system, main, radio, etc). Moreover, ancient Android versions (Cupcake 1.5, Donut 1.6, and maybe Eclair) cannot print more than one buffer at a time anyway, so it is the only available approach there. This requires the tool to solve the issue similar to the logcat's one described above - combine multiple log streams based on the timestamps.

Our current approach is purely timestamp-based and doesn't take time travels into account at all, so the new log entries may be appended to the very beginning of the log. Even the ordering within a single buffer is not respected when merging. This is problematic.

This issue proposes several improvements:

  1. Never violate the order of the lines within a single buffer (let's call it the buffer order).
  2. If the timestamps contradict the buffer order, then a warning should be present to the user.
  3. (Optional) Add a heuristic to detect the New Year and skip the warning in that case.

NB: #215 tracks switching to use logcat dividers (-D), that allow to retrieve all log buffers in a single command with proper ordering coming from the system's logd. However, the switch is only available since 6.0 (Marshmallow).

This is a more elaborate version of #105 which was closed as non-actionable.

mlopatkin commented 2 years ago

It looks like the kernel logs are still problematic in this regard. I haven't checked the implementation but I can imagine these logs being copied into the logger daemon's storage by some worker with the actual timestamp. This timestamp would be out of order with the other logs:

--------- switch to kernel
2022-01-09 17:27:01.658 I/logd    (    0): logdr: UID=2000 GID=2000 PID=3496 b tail=0 logMask=99 pid=0 start=0ns deadline=0ns
--------- switch to main
2022-01-09 17:27:01.637 W/dmesg   ( 3499): type=1400 audit(0.0:14): avc: denied { syslog_read } for scontext=u:r:shell:s0 tcontext=u:r:kernel:s0 tclass=system permissive=0
--------- switch to kernel
2022-01-09 17:27:07.455 W/healthd (    0): battery l=100 v=5000 t=25.0 h=2 st=4 c=900000 fc=300000 cc=10 chg=
2022-01-09 17:27:10.981 I/logd    (    0): logdr: UID=2000 GID=2000 PID=3500 n tail=0 logMask=99 pid=0 start=0ns deadline=0ns
--------- switch to main
2022-01-09 17:27:10.953 W/dmesg   ( 3503): type=1400 audit(0.0:15): avc: denied { syslog_read } for scontext=u:r:shell:s0 tcontext=u:r:kernel:s0 tclass=system permissive=0
2022-01-09 17:27:12.782 E/android.hardware.power.stats@1.0-service-mock(  393): Failed to getEnergyData
mlopatkin commented 2 years ago

283 addressed the case when the logs were coming from a bugreport file (dumpstate). The "live" merging is still to come.

mlopatkin commented 2 years ago

Looks like time adjustments are more common than I thought. As expected, the heuristic added in #283 doesn't work well if not all buffers observe the time travel. Maybe we could do a better job by actually looking at timestamps and finding out affected timespans.