falcosecurity / libs

libsinsp, libscap, the kernel module driver, and the eBPF driver sources
https://falcosecurity.github.io/libs/
Apache License 2.0

[BUG] Timestamp incorrect in event logs, dated in the future by 7 days/1 week #932

Closed happy-dude closed 1 year ago

happy-dude commented 1 year ago

Describe the bug

While analyzing event and debug logs generated by Falco, we discovered that the events were dated in the future.

For example, in a debug log captured around epoch 1677527444 (roughly Mon, 27 Feb 2023 19:50:44 GMT), we identified the following entry bearing the timestamp 2023-03-06T11:56:34.693685797Z:

{
  "hostname": "redacted",
  "output": "Falco internal: syscall event drop. 116192 system calls dropped in last second.",
  "output_fields": {
    "ebpf_enabled": "0",
    "n_drops": "116192",
    "n_drops_buffer_clone_fork_enter": "0",
    "n_drops_buffer_clone_fork_exit": "0",
    "n_drops_buffer_connect_enter": "0",
    "n_drops_buffer_connect_exit": "0",
    "n_drops_buffer_dir_file_enter": "0",
    "n_drops_buffer_dir_file_exit": "0",
    "n_drops_buffer_execve_enter": "0",
    "n_drops_buffer_execve_exit": "0",
    "n_drops_buffer_open_enter": "0",
    "n_drops_buffer_open_exit": "0",
    "n_drops_buffer_other_interest_enter": "0",
    "n_drops_buffer_other_interest_exit": "0",
    "n_drops_buffer_total": "116192",
    "n_drops_bug": "0",
    "n_drops_page_faults": "0",
    "n_drops_scratch_map": "0",
    "n_evts": "430758"
  },
  "priority": "Debug",
  "rule": "Falco internal: syscall event drop",
  "time": "2023-03-06T11:56:34.693685797Z"
}

Similarly, around the following wall-clock time:

date +%s ; date -u
1677527950
Mon 27 Feb 2023 07:59:10 PM UTC

the following event was observed with timestamp 1678109229443642000 (2023-03-06T13:27:09.443642063Z):

{
  "hostname": "redacted",
  "priority": "Warning",
  "rule": "Read sensitive file untrusted",
  "source": "syscall",
  "tags": [
    "T1020",
    "T1083",
    "T1212",
    "T1552",
    "T1555",
    "container",
    "filesystem",
    "host",
    "mitre_credential_access",
    "mitre_discovery"
  ],
  "time": "2023-03-06T13:27:09.443642063Z",
  "output_fields": {
    "container.id": "host",
    "container.image.repository": null,
    "evt.time": 1678109229443642000,
    "fd.name": "/etc/pam.d/common-session",
    "proc.aname[2]": "systemd",
    "proc.aname[3]": null,
    "proc.aname[4]": null,
    "proc.cmdline": "sshd -D -R",
    "proc.name": "sshd",
    "proc.pid": 3225785,
    "proc.pname": "sshd",
    "user.loginuid": -1,
    "user.name": "root"
  }
}

How to reproduce it

Unsure; the installation was from HEAD @ 0de9af9 via systemd on bare metal.

Generate events and look at log messages?

Expected behaviour

Timestamp should be accurate / not in the future.

Since the epoch times are fairly close (1678109229 from the event vs. 1677527950 from the host, a difference of 581279 s, i.e. ~6.7 days), perhaps the string generated in the time field is incorrect (~7 days in the future, or about 604800000000000 in epoch nanoseconds)?

Environment

Additional context

Possibly related to https://github.com/falcosecurity/libs/pull/595 ? (Did a search from prior PRs related to time.)

Discussion on Slack community: https://kubernetes.slack.com/archives/CMWH3EH32/p1677517061599169

incertum commented 1 year ago

Hi Stanley, thanks for reporting this. During development of https://github.com/falcosecurity/libs/pull/595, which you linked, I noticed some time-travel issues as well; that is why I forcibly changed my entire dev Linux box to UTC (not just the configs), and then the times made sense again.

What changed in the above PR was the derivation of the host boot timestamp: a new approach was introduced, and the old one was kept as a fallback. The rest of the changes were around the newly introduced fields, so they are unrelated. The new approach to deriving the boot time is the same one we use to derive the process start time during the initial proc scan. The boot time is involved in deriving epoch timestamps when the kernel only offers monotonic times.
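
For reference, here is a minimal standalone sketch of the two derivations (illustrative only, not the exact libs code): the ctime of /proc/1 as a boot-time anchor, and the clock_gettime-based fallback.

/* boot_ts.c -- illustrative sketch, not the actual libscap implementation.
 * Two ways to derive the host boot time (ns since epoch):
 *   1. the ctime of /proc/1 (the new approach)
 *   2. CLOCK_REALTIME minus CLOCK_BOOTTIME (the old fallback)
 * Compile with: cc -o boot_ts boot_ts.c */
#define _GNU_SOURCE
#include <stdint.h>
#include <stdio.h>
#include <sys/stat.h>
#include <time.h>

#define NS_PER_S 1000000000LL

static int64_t boot_time_from_proc1(void)
{
	struct stat st = {0};
	if(stat("/proc/1", &st) != 0)
		return 0;
	/* /proc/1 appears at boot, so its change time ~= boot time */
	return (int64_t)st.st_ctim.tv_sec * NS_PER_S + st.st_ctim.tv_nsec;
}

static int64_t boot_time_fallback(void)
{
	struct timespec now = {0}, up = {0};
	clock_gettime(CLOCK_REALTIME, &now); /* wall clock */
	clock_gettime(CLOCK_BOOTTIME, &up);  /* elapsed since boot, incl. suspend */
	return (int64_t)(now.tv_sec - up.tv_sec) * NS_PER_S +
	       (now.tv_nsec - up.tv_nsec);
}

int main(void)
{
	printf("boot ts via /proc/1 ctime: %lld\n", (long long)boot_time_from_proc1());
	printf("boot ts via clock_gettime: %lld\n", (long long)boot_time_fallback());
	return 0;
}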

Wondering if someone has more context around this?

terylt commented 1 year ago

I've seen this when using the Falco libs separately from Falco. I tried to reproduce it in sinsp-example/scap-example from the head of libs and haven't been able to replicate it so far. Will keep digging.

Andreagit97 commented 1 year ago

I've seen this when using the Falco libs separately from Falco. I tried to reproduce it in sinsp-example/scap-example from the head of libs and haven't been able to replicate it so far. Will keep digging.

Same here, I've never seen this issue with scap-open or sinsp-example, but we can dig more into it! Thank you for reporting!!

happy-dude commented 1 year ago

Hey team,

Did some preliminary investigation:

Will do some more digging as time permits. My next "excavation" will be where/how the timestamp is recorded when an event is traced/processed.

Andreagit97 commented 1 year ago

Yeah, I would probably try to reproduce it with the scap-open example; if we cannot see anything there, we can go with sinsp-example... I've tried both on my machine for some hours, but nothing happened :( I'm pretty sure the issue is related to what we print and not to what we collect from the drivers, but I'm still not sure how to spot it :/

Andreagit97 commented 1 year ago

Maybe we could also write some tests in sinsp to try to spot it :thinking: Something like https://github.com/falcosecurity/libs/blob/3ade7e25a05b3ed08fbf7cae4e2da83a1b72fb0f/userspace/libsinsp/test/events_file.ut.cpp#LL164C7-L164C7, but with precise timestamp values.

happy-dude commented 1 year ago

Hey team,

Thanks for the pointers in this ticket, the Slack discussion, and the community call. I was able to reproduce the erroneous timestamp on one of my Falco-evaluation systems by building the libs and running scap-open.

Additionally, I can validate that all the hosts I am evaluating Falco on report an incorrect timestamp; the timestamp "offset" is inconsistent and does not seem to stem from incrementing the correct timestamps by the same amount.

Here is a sample output (note: I'm not sure I'm using the scap-open arguments correctly, so please correct me if I'm wrong!):

schan@host:~/libs/build$ date -u && date +%s && date +%s%N; sudo ./libscap/examples/01-open/scap-open --modern_bpf --evt_type execve --num_events 10
Wed 22 Mar 2023 08:20:11 PM UTC
1679516411
1679516411159811048

[SCAP-OPEN]: Hello!

---------------------- SCAP SOURCE ----------------------
* Modern BPF probe, 1 ring buffer every 1 CPUs
-----------------------------------------------------------

--------------------- CONFIGURATIONS ----------------------
* Print single event type: 0 (`-1` means no event to print).
* Run until '10' events are catched.
-----------------------------------------------------------

---------------------- INTERESTING SYSCALLS ----------------------
* All sc codes are enabled!
------------------------------------------------------------------

* OK! modern BPF probe correctly loaded: NO VERIFIER ISSUES :)
* Live capture in progress...
* Press CTRL+C to stop the capture

------------------ EVENT: 0 TID:612821
------ HEADER
timestamp: 1882109607583734
tid: 612821
len: 34
type: 0
num params: 2
------
------ PARAMS
PARAM 0: 16
PARAM 1: 172
------
------------------

------------------ EVENT: 0 TID:612821
------ HEADER
timestamp: 1882109607646014
tid: 612821
len: 34
type: 0
num params: 2
------
------ PARAMS
PARAM 0: 108
PARAM 1: 135
------
------------------

------------------ EVENT: 0 TID:769419
------ HEADER
timestamp: 1882109607763854
tid: 769419
len: 34
type: 0
num params: 2
------
------ PARAMS
PARAM 0: 178
PARAM 1: 115
------
------------------

---------------------- STATS -----------------------
Events correctly captured (SCAP_SUCCESS): 10
Seen by driver: 259
Time elapsed: 0 s
Number of dropped events: 0
Number of timeouts: 1
Number of 'next' calls: 11
Number of dropped events caused by full buffer (total / all buffer drops - includes all categories below, likely higher than sum of syscall categories): 0
Number of dropped events caused by full buffer (n_drops_buffer_clone_fork_enter syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_clone_fork_exit syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_execve_enter syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_execve_exit syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_connect_enter syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_connect_exit syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_open_enter syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_open_exit syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_dir_file_enter syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_dir_file_exit syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_other_interest_enter syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_other_interest_exit syscall category): 0
Number of dropped events caused by full scratch map: 0
Number of dropped events caused by invalid memory access (page faults): 0
Number of dropped events caused by an invalid condition in the kernel instrumentation (bug): 0
Number of preemptions: 0
Number of events skipped due to the tid being in a set of suppressed tids: 0
Number of threads currently being suppressed: 0
-----------------------------------------------------

[SCAP-OPEN]: Bye!

Any guidance on next steps to debug further?

EDIT: On the host above, the scap-open timestamp (e.g. 1882109607763854) appears off by a pretty significant amount, ~6 years. However, the most recent event alerted by Falco has a more "sane", but still wrong, timestamp: 1681305483611993160, ~20 days off.

terylt commented 1 year ago

I think a good next step is to trace through where the ts value gets set in the scap_evt object. We could add a print statement there to see if the timestamp changes between assignment and where it's printed in scap-example.

Andreagit97 commented 1 year ago

Hey @Happy-Dude, this is not a bug, it's a feature! This way you can read events from the future! :rofl: Apart from jokes, some minor notes:

scap-open usage

scap-open is a really low-level tool, so unfortunately it cannot understand something like --evt_type execve; after --evt_type it wants an event number. You can find them here :point_down: https://github.com/falcosecurity/libs/blob/f5412a4ced0e0b6b46b735e48d4e12900e0b0c85/driver/ppm_events_public.h#L1098 So in this case, if you want the execve exit event, you have to use --evt_type 293. The full command will be:

sudo ./libscap/examples/01-open/scap-open --modern_bpf --evt_type 293 --num_events 10

Probably --num_events 10 is not enough to catch something interesting, so feel free to bump it, or remove it entirely and stop the capture with CTRL+C whenever you want.

timestamp issue

Coming back to the issue: I've tried the same scap-open setup, but again I saw almost nothing strange :( BTW, looking at your data, it seems that the boot time in your case is completely wrong. The event timestamp is composed of 2 parts: the boot_time and the number of nanoseconds since boot_time. The boot_time is computed by us at initialization time, while the nanoseconds since boot_time are provided by the bpf helper bpf_ktime_get_boot_ns:

hdr->ts = maps__get_boot_time() + bpf_ktime_get_boot_ns();

Now, supposing there is no bug in the bpf helper (this doesn't seem to be the case, because if I look at the increments between your events in the scap-open logs, the nanosecond deltas seem about right), there is probably something wrong in the boot time computation... This is a minimal branch https://github.com/Andreagit97/libs/tree/timestamp_issue that prints some info about the boot_time when scap-open starts; you could compile it from this branch and maybe try again, WDYT? Of course, the patches are really simple (https://github.com/Andreagit97/libs/commit/da7f52cb17503022cbbc8a2646d7fc6b38e38c9d), so you can apply them by hand to your libs version if you prefer.

Another possible test would be to use scap-open with the kmod (--kmod); in that case the event timestamp is computed entirely by the kernel, so I don't expect any issue there.

FedeDP commented 1 year ago

@Happy-Dude I'd also check the output of stat /proc/1; can you share it? Thank you!

happy-dude commented 1 year ago

So in this case if you want the execve exit event you have to use --evt_type 293. The full command will be

sudo ./libscap/examples/01-open/scap-open --modern_bpf --evt_type 293

Awesome, appreciate the context + explanation and will use this command.


This is a minimal branch https://github.com/Andreagit97/libs/tree/timestamp_issue that prints some info about the boot_time when scap-open starts; you could compile it from this branch and maybe try again, WDYT?

Just built a new scap-open and got the following output:

schan@host:~/libs/build$ sudo ./libscap/examples/01-open/scap-open --modern_bpf --evt_type 293 --num_events 75000

[SCAP-OPEN]: Hello!

---------------------- SCAP SOURCE ----------------------
* Modern BPF probe, 1 ring buffer every 1 CPUs
-----------------------------------------------------------

--------------------- CONFIGURATIONS ----------------------
* Print single event type: 293 (`-1` means no event to print).
* Run until '75000' events are catched.
-----------------------------------------------------------

---------------------- INTERESTING SYSCALLS ----------------------
* All sc codes are enabled!
------------------------------------------------------------------

************ proc_dir: /proc/1/
************ proc_dir: /proc/1/
************ boot_time: 12721000000

* OK! modern BPF probe correctly loaded: NO VERIFIER ISSUES :)
* Live capture in progress...
* Press CTRL+C to stop the capture

------------------ EVENT: 293 TID:4034839
------ HEADER
timestamp: 1946299696411786
tid: 4034839
len: 858
type: 293
num params: 27
------
------ PARAMS
PARAM 0: 0
PARAM 1: /usr/bin/dig
PARAM 2: <omitted>
PARAM 3: 4034839
PARAM 4: 4034839
PARAM 5: 3437214
PARAM 6: is empty
PARAM 7: 65536
PARAM 8: 0
PARAM 9: 4
PARAM 10: 420
PARAM 11: 4
PARAM 12: 0
PARAM 13: dig
PARAM 14: <omitted>
PARAM 15: <omitted>
PARAM 16: 0
PARAM 17: 4034839
PARAM 18: -1
PARAM 19: 0
PARAM 20: 8192
PARAM 21: 8192
PARAM 22: 8192
PARAM 23: 1375
PARAM 24: 11941000000
PARAM 25: 1674660142000000000
PARAM 26: 2138
------
------------------

<omitted some events>

---------------------- STATS -----------------------
Events correctly captured (SCAP_SUCCESS): 75000
Seen by driver: 75760
Time elapsed: 0 s
Number of dropped events: 0
Number of timeouts: 239
Number of 'next' calls: 75239
Number of dropped events caused by full buffer (total / all buffer drops - includes all categories below, likely higher than sum of syscall categories): 0
Number of dropped events caused by full buffer (n_drops_buffer_clone_fork_enter syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_clone_fork_exit syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_execve_enter syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_execve_exit syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_connect_enter syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_connect_exit syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_open_enter syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_open_exit syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_dir_file_enter syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_dir_file_exit syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_other_interest_enter syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_other_interest_exit syscall category): 0
Number of dropped events caused by full scratch map: 0
Number of dropped events caused by invalid memory access (page faults): 0
Number of dropped events caused by an invalid condition in the kernel instrumentation (bug): 0
Number of preemptions: 0
Number of events skipped due to the tid being in a set of suppressed tids: 0
Number of threads currently being suppressed: 0
-----------------------------------------------------

[SCAP-OPEN]: Bye!

I'd also check the output of stat /proc/1

No problem! See the following:

schan@host:~/libs/build$ stat /proc/1
  File: /proc/1
  Size: 0               Blocks: 0          IO Block: 1024   directory
Device: 15h/21d Inode: 20203       Links: 9
Access: (0555/dr-xr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 1970-01-01 00:00:12.721000000 +0000
Modify: 1970-01-01 00:00:12.721000000 +0000
Change: 1970-01-01 00:00:12.721000000 +0000
 Birth: -
FedeDP commented 1 year ago

So, the issue here is that /proc/1 has a super old change time (basically the epoch). But then I'd expect event timestamps to be relative to 1 Jan 1970, not in the future. We will investigate further.

happy-dude commented 1 year ago

Awesome awesome; any further actions on my side atm?

If this might be a specific config in our environment, I can discuss with internal teams on why this might be the case.

Also, I forgot to answer Andrea re: the kmod option; unfortunately, I won't be able to do that easily on my side, as loading kernel modules would require communication, approvals, etc. with my org and stakeholders.

FedeDP commented 1 year ago

Can you share uptime output? :)

happy-dude commented 1 year ago

uptime

schan@host:~/libs/build$ uptime
 14:44:46 up 22 days, 13:12,  2 users,  load average: 2.39, 2.11, 1.86
FedeDP commented 1 year ago

You know what, ahahah, there are no issues here! Sites like epochconverter automatically switch the assumed unit (seconds, milliseconds, microseconds, nanoseconds) depending on the number of digits, which is what made these values look so wrong.

Using another tool, i.e. https://www.epoch101.com/, and reading the value explicitly as nanoseconds, you'll discover the following:

Basically, the event is happening ~22 days after you booted up your PC (and that is validated by your uptime output).

The issue here is not about event timestamps; it is all about why your /proc/1 has a change time so close to the epoch...
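
To make the unit switch concrete (rough arithmetic, with the unit assumed as noted):

1882109607583734 read as ns -> ~1882110 s since boot    -> ~21.8 days (matches the uptime output above)
1882109607583734 read as µs -> ~1882109608 s since epoch -> ~year 2029 (the misleading "~6 years in the future" reading)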

FedeDP commented 1 year ago

Can you also share the stat /proc/ output?

happy-dude commented 1 year ago

stat /proc/

schan@host:~/libs/build$ stat /proc/
  File: /proc/
  Size: 0               Blocks: 0          IO Block: 1024   directory
Device: 15h/21d Inode: 1           Links: 1657
Access: (0555/dr-xr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 1970-01-01 00:00:12.721000000 +0000
Modify: 1970-01-01 00:00:12.721000000 +0000
Change: 1970-01-01 00:00:12.721000000 +0000
 Birth: -
FedeDP commented 1 year ago

Perhaps you are mounting your proc filesystem with the noatime mount option, right? Can you share mount | grep proc?

happy-dude commented 1 year ago

mount | grep proc

schan@host:~/libs/build$ mount | grep proc
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
Andreagit97 commented 1 year ago

I've updated our test branch, removing the proc stat step and using the old fallback logic: https://github.com/Andreagit97/libs/tree/timestamp_issue. Would you like to give it a try, @Happy-Dude?

happy-dude commented 1 year ago

Just built and ran:

schan@host:~/libs/build$ sudo ./libscap/examples/01-open/scap-open --modern_bpf --evt_type 293 --num_events 75000

[SCAP-OPEN]: Hello!

---------------------- SCAP SOURCE ----------------------
* Modern BPF probe, 1 ring buffer every 1 CPUs
-----------------------------------------------------------

--------------------- CONFIGURATIONS ----------------------
* Print single event type: 293 (`-1` means no event to print).
* Run until '75000' events are catched.
-----------------------------------------------------------

---------------------- INTERESTING SYSCALLS ----------------------
* All sc codes are enabled!
------------------------------------------------------------------

************ boot_time: 1677634320260643054

* OK! modern BPF probe correctly loaded: NO VERIFIER ISSUES :)
* Live capture in progress...
* Press CTRL+C to stop the capture

------------------ EVENT: 293 TID:60964
------ HEADER
timestamp: 1679584774031082745
tid: 60964
len: 878
type: 293
num params: 27
------
------ PARAMS
PARAM 0: 0
PARAM 1: /usr/bin/dig
PARAM 2: <omitted>
PARAM 3: 60964
PARAM 4: 60964
PARAM 5: 3437895
PARAM 6: is empty
PARAM 7: 65536
PARAM 8: 0
PARAM 9: 3
PARAM 10: 420
PARAM 11: 4
PARAM 12: 0
PARAM 13: dig
PARAM 14: <omitted>
PARAM 15: <omitted>
PARAM 16: 0
PARAM 17: 60964
PARAM 18: -1
PARAM 19: 0
PARAM 20: 8192
PARAM 21: 8192
PARAM 22: 8192
PARAM 23: 1375
PARAM 24: 11941000000
PARAM 25: 1674660142000000000
PARAM 26: 2138
------
------------------

<omitted events>

---------------------- STATS -----------------------
Events correctly captured (SCAP_SUCCESS): 75000
Seen by driver: 75617
Time elapsed: 0 s
Number of dropped events: 0
Number of timeouts: 303
Number of 'next' calls: 75303
Number of dropped events caused by full buffer (total / all buffer drops - includes all categories below, likely higher than sum of syscall categories): 0
Number of dropped events caused by full buffer (n_drops_buffer_clone_fork_enter syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_clone_fork_exit syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_execve_enter syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_execve_exit syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_connect_enter syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_connect_exit syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_open_enter syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_open_exit syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_dir_file_enter syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_dir_file_exit syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_other_interest_enter syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_other_interest_exit syscall category): 0
Number of dropped events caused by full scratch map: 0
Number of dropped events caused by invalid memory access (page faults): 0
Number of dropped events caused by an invalid condition in the kernel instrumentation (bug): 0
Number of preemptions: 0
Number of events skipped due to the tid being in a set of suppressed tids: 0
Number of threads currently being suppressed: 0
-----------------------------------------------------

[SCAP-OPEN]: Bye!
Andreagit97 commented 1 year ago

ok, with this method the timestamps make sense again:

  • boot_time 1677634320260643054 is 2023-03-01 02:32:00.260643072 +0100 CET
  • event_time 1679584774031082745 is 2023-03-23 16:19:34.031082752 +0100 CET

So just to have a summary:

  • The extraction method with stat /proc/1 doesn't work on your machine because, for some reason, your proc change time is the epoch :/
  • The fallback method with clock_gettime works correctly on your machine, as expected.

Not sure what we want to do here; this proc anomaly, even if strange, could happen to other users :/ Do we want to always use the fallback method instead of the proc one, or maybe check that the value returned from proc is big enough and otherwise use the fallback :thinking: WDYT? @incertum @terylt @FedeDP

Andreagit97 commented 1 year ago

On your side, @Happy-Dude, it would be amazing to understand why, on all your machines, /proc has this strange change time (~epoch).

FedeDP commented 1 year ago

Yep :( A solution could be to fall back to the old method if stat /proc/1 leads to weird results (i.e., a result before the year 2000). I think the new approach, which was introduced by Melissa if I recall correctly, is more stable, so I'd avoid killing it.
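
For instance, a rough sketch of that check (just the idea, not an actual patch; names follow libscap's scap_get_boot_time):

/* Sketch: trust the /proc/1 change time only when plausible
 * (after 2000-01-01); otherwise fall back to the old
 * clock_gettime()-based derivation. Illustrative only. */
#define MIN_PLAUSIBLE_BOOT_NS ((uint64_t)946684800 * SECOND_TO_NS) /* 2000-01-01 */

uint64_t proc_ns = targetstat.st_ctim.tv_sec * (uint64_t)SECOND_TO_NS +
                   targetstat.st_ctim.tv_nsec;
if(proc_ns >= MIN_PLAUSIBLE_BOOT_NS)
{
	/* new approach: /proc/1 change time looks sane, use it */
	*boot_time = proc_ns;
}
else
{
	/* old fallback: wall-clock now minus time elapsed since boot */
	struct timespec tv_now = {0};
	clock_gettime(CLOCK_REALTIME, &tv_now);
	uint64_t now = tv_now.tv_sec * (uint64_t)SECOND_TO_NS + tv_now.tv_nsec;
	clock_gettime(CLOCK_BOOTTIME, &tv_now);
	uint64_t up = tv_now.tv_sec * (uint64_t)SECOND_TO_NS + tv_now.tv_nsec;
	*boot_time = now - up;
}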

happy-dude commented 1 year ago

On your side, @Happy-Dude, it would be amazing to understand why, on all your machines, /proc has this strange change time (~epoch).

Yup yup, will ask around!

For additional info, what was the context/rationale for using this method to calculate the boot time? Was the fallback method used until the /proc/1 change-time method was discovered?

happy-dude commented 1 year ago

The /proc/1 timestamp may be something specific to our ARM kernels. We're investigating further 👍

BUT, small plot twist: I'm building scap-open on an x86-64 system to see if I can reproduce the problem there, as that kernel does have the right stat /proc/1 output but still exhibits this problem.

EDIT: I was unable to reproduce with scap-open on our x86-64 kernel (on https://github.com/Andreagit97/libs/commit/da7f52cb17503022cbbc8a2646d7fc6b38e38c9d), but I can confirm that Falco does indeed have the wrong timestamp on that system.

Will work to track down the problem on ARM first and provide context there.

incertum commented 1 year ago

Hey @Happy-Dude, it was changed because the old approach was imprecise across agent reboots; the boot time is also used for some other new time-related fields, see https://github.com/falcosecurity/libs/issues/932#issuecomment-1447102203

Now, the bigger question: we use this type of approach for our entire initial proc scan in scap. So, if this approach is not robust for some environments or settings, we need to check whether it only concerns the boot time derivation or whether the proc scan timestamps are off as well; not just the clone ts, but also the newer ctime and mtime or pidns_init_start_ts fields.

In summary, we can revert to the old fallback method or possibly consider a completely new, robust and stable solution; but maybe that won't fix possible proc scan issues, and we need a broader adjustment?

I would also be curious to understand what can cause the wrong stat info; please keep us posted. Would you mind also poking around the entire proc folder and checking the ctime of the pid dirs?

incertum commented 1 year ago

ok, with this method the timestamps make sense again:

  • boot_time 1677634320260643054 is 2023-03-01 02:32:00.260643072 +0100 CET
  • event_time 1679584774031082745 is 2023-03-23 16:19:34.031082752 +0100 CET

So just to have a summary:

  • The extraction method with stat /proc/1 doesn't work on your machine because, for some reason, your proc change time is the epoch :/
  • The fallback method with clock_gettime works correctly on your machine, as expected.

Not sure what we want to do here; this proc anomaly, even if strange, could happen to other users :/ Do we want to always use the fallback method instead of the proc one, or maybe check that the value returned from proc is big enough and otherwise use the fallback 🤔 WDYT? @incertum @terylt @FedeDP

@Andreagit97 wrt the boot time, there are many ways to get it, so we could probably contrast a few options. Preferable would be to retrieve a constant value, which the old fallback method wouldn't give us. If we don't find a good constant approach, I think we just have to revert ...

@terylt @FedeDP @Andreagit97 what are your thoughts on the broader implications for the scap proc scan? See the comment above: https://github.com/falcosecurity/libs/issues/932#issuecomment-1481516184

Andreagit97 commented 1 year ago

@terylt @FedeDP @Andreagit97 what are your thoughts on the broader implications for the scap proc scan? See the comment above: https://github.com/falcosecurity/libs/issues/932#issuecomment-1481516184

Uhm, let's say this proc scan code is quite old and nobody has ever noticed anything; that doesn't mean it is right, but at least it should be quite robust. BTW, your concerns are valid; let's ask our proc-scan expert @gnosek.

happy-dude commented 1 year ago

Hey y'all,

Our internal kernel folks are still investigating the /proc/1 epoch time problem, but they have a suggestion for Falco:

Instead of /proc/1, can /proc/1/cmdline be used? This is apparently similar to what Docker does (something about the procfs / container creation time?).

user@host:~$ strace -v -etrace=statx -y stat /proc/1 /proc/1/cmdline 
statx(AT_FDCWD, "/proc/1", AT_STATX_SYNC_AS_STAT|AT_SYMLINK_NOFOLLOW, STATX_ALL, {stx_mask=STATX_BASIC_STATS|STATX_MNT_ID, stx_blksize=1024, stx_attributes=0, stx_nlink=9, stx_uid=0, stx_gid=0, stx_mode=S_IFDIR|0555, stx_ino=21510, stx_size=0, stx_blocks=0, stx_attributes_mask=STATX_ATTR_AUTOMOUNT|STATX_ATTR_MOUNT_ROOT|STATX_ATTR_DAX, stx_atime={tv_sec=12, tv_nsec=17000000} /* 1970-01-01T00:00:12.017000000+0000 */, stx_btime={tv_sec=0, tv_nsec=0}, stx_ctime={tv_sec=12, tv_nsec=17000000} /* 1970-01-01T00:00:12.017000000+0000 */, stx_mtime={tv_sec=12, tv_nsec=17000000} /* 1970-01-01T00:00:12.017000000+0000 */, stx_rdev_major=0, stx_rdev_minor=0, stx_dev_major=0, stx_dev_minor=21}) = 0
  File: /proc/1
  Size: 0             Blocks: 0          IO Block: 1024   directory
Device: 15h/21d    Inode: 21510       Links: 9
Access: (0555/dr-xr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 1970-01-01 00:00:12.017000000 +0000
Modify: 1970-01-01 00:00:12.017000000 +0000
Change: 1970-01-01 00:00:12.017000000 +0000
 Birth: -

statx(AT_FDCWD, "/proc/1/cmdline", AT_STATX_SYNC_AS_STAT|AT_SYMLINK_NOFOLLOW, STATX_ALL, {stx_mask=STATX_BASIC_STATS|STATX_MNT_ID, stx_blksize=1024, stx_attributes=0, stx_nlink=1, stx_uid=0, stx_gid=0, stx_mode=S_IFREG|0444, stx_ino=70683560, stx_size=0, stx_blocks=0, stx_attributes_mask=STATX_ATTR_AUTOMOUNT|STATX_ATTR_MOUNT_ROOT|STATX_ATTR_DAX, stx_atime={tv_sec=1679590867, tv_nsec=687421205} /* 2023-03-23T17:01:07.687421205+0000 */, stx_btime={tv_sec=0, tv_nsec=0}, stx_ctime={tv_sec=1679590867, tv_nsec=687421205} /* 2023-03-23T17:01:07.687421205+0000 */, stx_mtime={tv_sec=1679590867, tv_nsec=687421205} /* 2023-03-23T17:01:07.687421205+0000 */, stx_rdev_major=0, stx_rdev_minor=0, stx_dev_major=0, stx_dev_minor=21}) = 0
  File: /proc/1/cmdline
  Size: 0             Blocks: 0          IO Block: 1024   regular empty file
Device: 15h/21d    Inode: 70683560    Links: 1
Access: (0444/-r--r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2023-03-23 17:01:07.687421205 +0000
Modify: 2023-03-23 17:01:07.687421205 +0000
Change: 2023-03-23 17:01:07.687421205 +0000
 Birth: -
+++ exited with 0 +++
incertum commented 1 year ago

ACK, this would certainly be a super simple refactor ...

FedeDP commented 1 year ago

Agree! It is a minor change that seems to fix weird cases like this one! +1 from me for this tiny patch

happy-dude commented 1 year ago

Just created a patch; happy to incorporate any feedback!

I tested it briefly on the ARM system and it looked alright; it could use more testing and validation though 🙏

schan@host:~/libs/build$ sudo ./libscap/examples/01-open/scap-open --modern_bpf --evt_type 293 --num_events 75000

[SCAP-OPEN]: Hello!

---------------------- SCAP SOURCE ----------------------
* Modern BPF probe, 1 ring buffer every 1 CPUs
-----------------------------------------------------------

--------------------- CONFIGURATIONS ----------------------
* Print single event type: 293 (`-1` means no event to print).
* Run until '75000' events are catched.
-----------------------------------------------------------

---------------------- INTERESTING SYSCALLS ----------------------
* All sc codes are enabled!
------------------------------------------------------------------

************ proc_cmdline: /proc/1/cmdline
************ proc_cmdline: /proc/1/cmdline
************ boot_time: 1677850636623759387

* OK! modern BPF probe correctly loaded: NO VERIFIER ISSUES :)
* Live capture in progress...
* Press CTRL+C to stop the capture

------------------ EVENT: 293 TID:727523
------ HEADER
timestamp: 1679814140579184804
tid: 727523
len: 816
type: 293
num params: 27
------
------ PARAMS
PARAM 0: 0
PARAM 1: /usr/bin/dig
PARAM 2: <omitted>
PARAM 3: 727523
PARAM 4: 727523
PARAM 5: 3437315
PARAM 6: is empty
PARAM 7: 65536
PARAM 8: 0
PARAM 9: 4
PARAM 10: 420
PARAM 11: 4
PARAM 12: 0
PARAM 13: dig
PARAM 14: <omitted>
PARAM 15: <omitted>
PARAM 16: 0
PARAM 17: 727523
PARAM 18: -1
PARAM 19: 0
PARAM 20: 8192
PARAM 21: 8192
PARAM 22: 8192
PARAM 23: 1375
PARAM 24: 11941000000
PARAM 25: 1674660142000000000
PARAM 26: 2138
------
------------------

---------------------- STATS -----------------------
Events correctly captured (SCAP_SUCCESS): 75000
Seen by driver: 76799
Time elapsed: 0 s
Number of dropped events: 0
Number of timeouts: 265
Number of 'next' calls: 75265
Number of dropped events caused by full buffer (total / all buffer drops - includes all categories below, likely higher than sum of syscall categories): 0
Number of dropped events caused by full buffer (n_drops_buffer_clone_fork_enter syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_clone_fork_exit syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_execve_enter syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_execve_exit syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_connect_enter syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_connect_exit syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_open_enter syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_open_exit syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_dir_file_enter syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_dir_file_exit syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_other_interest_enter syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_other_interest_exit syscall category): 0
Number of dropped events caused by full scratch map: 0
Number of dropped events caused by invalid memory access (page faults): 0
Number of dropped events caused by an invalid condition in the kernel instrumentation (bug): 0
Number of preemptions: 0
Number of events skipped due to the tid being in a set of suppressed tids: 0
Number of threads currently being suppressed: 0
-----------------------------------------------------

[SCAP-OPEN]: Bye!

schan@host:~/libs/build$ git diff
diff --git a/userspace/libscap/scap.c b/userspace/libscap/scap.c
index 324c0639..58073079 100644
--- a/userspace/libscap/scap.c
+++ b/userspace/libscap/scap.c
@@ -1230,13 +1230,13 @@ int32_t scap_get_boot_time(char* last_err, uint64_t *boot_time)
        struct timespec tv_now = {0};
        uint64_t now = 0;
        uint64_t uptime = 0;
-       char proc_dir[PPM_MAX_PATH_SIZE];
+       char proc_cmdline[PPM_MAX_PATH_SIZE];
        struct stat targetstat = {0};

        /* More reliable way to get boot time */
-       snprintf(proc_dir, sizeof(proc_dir), "%s/proc/1/", scap_get_host_root());
-       printf("************ proc_dir: %s\n", proc_dir);
-       if (stat(proc_dir, &targetstat) == 0)
+       snprintf(proc_cmdline, sizeof(proc_cmdline), "%s/proc/1/cmdline", scap_get_host_root());
+       printf("************ proc_cmdline: %s\n", proc_cmdline);
+       if (stat(proc_cmdline, &targetstat) == 0)
        {
                /* This approach is constant between agent re-boots */
                *boot_time = targetstat.st_ctim.tv_sec * (uint64_t) SECOND_TO_NS + targetstat.st_ctim.tv_nsec;
happy-dude commented 1 year ago

Unfortunate news:

The change didn't fix the timestamp on my nodes; it's possible the /proc/1/cmdline change is insufficient to produce the right timestamp. Confirmed on both an x86-64 host and an aarch64 host.

Changed the PR to (WIP)

Internally at my org, there may be separate issues affecting /proc, distinct between x86-64 and aarch64; I'll investigate further on that front with internal engineers.

EDIT: I'm also discussing with my team whether we can patch our builds to use the fallback method with clock_gettime() instead 🤔

Here is some output from scap-open:

schan@host:~/libs/build$ date -u && date +%s && date +%s%N && uptime && stat /proc/1/cmdline; sudo ./libscap/examples/01-open/scap-open --modern_bpf --evt_type execve --num_events 10

Thu 23 Mar 2023 11:00:42 PM UTC

1679612442

1679612442422449367

 23:00:42 up 56 min,  1 user,  load average: 1.58, 1.63, 1.70

  File: /proc/1/cmdline
  Size: 0               Blocks: 0          IO Block: 1024   regular empty file
Device: 15h/21d Inode: 29705       Links: 1
Access: (0444/-r--r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2023-02-15 20:36:33.542000000 +0000
Modify: 2023-02-15 20:36:33.542000000 +0000
Change: 2023-02-15 20:36:33.542000000 +0000
 Birth: -

[SCAP-OPEN]: Hello!

---------------------- SCAP SOURCE ----------------------
* Modern BPF probe, 1 ring buffer every 1 CPUs
-----------------------------------------------------------

--------------------- CONFIGURATIONS ----------------------
* Print single event type: 0 (`-1` means no event to print).
* Run until '10' events are catched.
-----------------------------------------------------------

---------------------- INTERESTING SYSCALLS ----------------------
* All sc codes are enabled!
------------------------------------------------------------------

************ proc_cmdline: /proc/1/cmdline
************ proc_cmdline: /proc/1/cmdline
************ boot_time: 1676493393542000000

* OK! modern BPF probe correctly loaded: NO VERIFIER ISSUES :)
* Live capture in progress...
* Press CTRL+C to stop the capture

------------------ EVENT: 0 TID:73151
------ HEADER
timestamp: 1676496781579307218
tid: 73151
len: 34
type: 0
num params: 2
------
------ PARAMS
PARAM 0: 176
PARAM 1: 113
------
------------------

---------------------- STATS -----------------------
Events correctly captured (SCAP_SUCCESS): 10
Seen by driver: 247
Time elapsed: 0 s
Number of dropped events: 0
Number of timeouts: 1
Number of 'next' calls: 11
Number of dropped events caused by full buffer (total / all buffer drops - includes all categories below, likely higher than sum of syscall categories): 0
Number of dropped events caused by full buffer (n_drops_buffer_clone_fork_enter syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_clone_fork_exit syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_execve_enter syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_execve_exit syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_connect_enter syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_connect_exit syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_open_enter syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_open_exit syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_dir_file_enter syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_dir_file_exit syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_other_interest_enter syscall category): 0
Number of dropped events caused by full buffer (n_drops_buffer_other_interest_exit syscall category): 0
Number of dropped events caused by full scratch map: 0
Number of dropped events caused by invalid memory access (page faults): 0
Number of dropped events caused by an invalid condition in the kernel instrumentation (bug): 0
Number of preemptions: 0
Number of events skipped due to the tid being in a set of suppressed tids: 0
Number of threads currently being suppressed: 0
-----------------------------------------------------

[SCAP-OPEN]: Bye!
incertum commented 1 year ago

Yes, I think that would be a great data point for us, to see if the old approach works.

In addition, surfing the internet turns up a few available options (e.g., see this); maybe it would be helpful to check all of them, just to give all of us more information to work with.

Anyone else have more ideas?

Also would it perhaps be possible to reboot one server and check again on all options?

happy-dude commented 1 year ago

Also would it perhaps be possible to reboot one server and check again on all options?

Possible, depending on the option; I'd essentially have to make new packages iterating through each one, because config management installs the latest package from our repo on my evaluation hosts 😅

Yes, I think that would be a great data point for us, to see if the old approach works.

I'll look into creating a package that comments out the boot time code in scap.c, like in https://github.com/falcosecurity/libs/compare/master...Andreagit97:libs:timestamp_issue

Does that mean I also need to undo the /proc/1/cmdline changes made elsewhere?

incertum commented 1 year ago

Maybe just run the different commands from a shell first to check?

Does that mean I also need to undo the /proc/1/cmdline changes made elsewhere?

Let's just go with Andrea's suggestion to test the old approach; plus, I would be curious about the outputs of the bash commands in that post, to know whether any of the other options gives the correct boot ts.

Andreagit97 commented 1 year ago

The change didn't fix the timestamp on my nodes; it's possible the /proc/1/cmdline change is insufficient to produce the right timestamp. Confirmed on both an x86-64 host and an aarch64 host.

What do you mean by "The change didn't fix the timestamp on my nodes"? If I understand correctly, using /proc/1/cmdline you should now see the right boot_time, is that true? If yes, what is the issue now :thinking: ?

happy-dude commented 1 year ago

What do you mean by "The change didn't fix the timestamp on my nodes"? If I understand correctly, using /proc/1/cmdline you should now see the right boot_time, is that true? If yes, what is the issue now 🤔 ?

Apologies for the lack of context; the output in https://github.com/falcosecurity/libs/issues/932#issuecomment-1482026428 is from a system I had rebooted ~1 hour earlier. Unfortunately, /proc/1/cmdline reported that it was created and changed on Feb 15, which is definitely not the case.

This invalidates our assumption that /proc/1/cmdline carries the right boot_time 😓

We're discussing our other options a bit further in Slack (I added you to the group!), and perhaps something else in procfs may prove useful!

happy-dude commented 1 year ago

Just updated the PR and added a comment, https://github.com/falcosecurity/libs/pull/1003#issuecomment-1483538662

Removed the (WIP) tag and submitted a new commit (rebased, with fixes per the suggestions).

I tested a build with these two commits applied as a patch and first impressions are very positive; Falco is logging events with the right timestamps!

Summary:

After chatting in #932 and a Slack thread, we discovered that /proc/1/cmdline did not return the right boot time as we had thought. This was verified on a host in my org.

This led us to a round of debugging, running the following commands on problematic systems:

* `uptime -s`, `who -b`, `stat /proc /proc/1 /proc/1/cmdline`, `cat /proc/stat | grep btime`: revealed that the output from `who -b` was incorrect, but the output from `uptime` was correct

* `ls -lahtr`: revealed that `/proc/uptime` had timestamps closer to when the system was started

Most importantly, `cat /proc/stat | grep btime` had a btime value closest to the boot time. Speaking to a few engineers in my org also revealed that the same parts of the kernel feed values into `/proc/uptime` and `/proc/stat`'s btime, so they are "essentially" equivalent.

The Slack thread also discussed ways of getting the system boot time:

* `btime` in `/proc/stat`

* calculation via `clock_gettime(CLOCK_REALTIME) - clock_gettime(CLOCK_BOOTTIME)`

* calculation via `time(NULL) - sysinfo().uptime`

Overall, the maintainers preferred btime from /proc/stat for the following reasons:

* the value does not depend on a calculation involving the current timestamp

* btime is "static" and doesn't change once set

* btime is available in kernels from 2008

* CLOCK_BOOTTIME is available in kernels from 2011 (2.6.38)

@gnosek mentioned:

Scraping btime from /proc/stat is both the heaviest and the most likely to succeed

And this PR (as it is right now) is a result of that discussion:

* use `btime` from `/proc/stat` for boot time

* use `/proc/<pid>/root/proc/1/cmdline` for container start time
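
For illustration, here is a minimal sketch of the btime scrape (a sketch only, not necessarily the exact code that landed in the PR):

/* Read the boot time from the "btime" line of /proc/stat.
 * btime is the boot time in seconds since the epoch; it is set once
 * by the kernel and does not drift. Illustrative sketch only. */
#include <inttypes.h>
#include <stdio.h>

static uint64_t boot_time_from_btime_ns(void)
{
	FILE* f = fopen("/proc/stat", "r");
	char line[512];
	uint64_t btime_s = 0;
	if(f == NULL)
	{
		return 0;
	}
	while(fgets(line, sizeof(line), f) != NULL)
	{
		/* lines look like: "btime 1679612442" */
		if(sscanf(line, "btime %" SCNu64, &btime_s) == 1)
		{
			break;
		}
	}
	fclose(f);
	return btime_s * 1000000000ULL; /* seconds -> nanoseconds */
}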