falcosecurity / falco

Cloud Native Runtime Security
https://falco.org
Apache License 2.0

Falco 0.37.1 crashes with "could not parse param 13 (comm) for event of type 223 (clone): expected length 10, found 14" #3275

Open kanishk7bantu opened 4 months ago

kanishk7bantu commented 4 months ago

Describe the bug

Error: could not parse param 13 (comm) for event 10224646 of type 223 (clone): expected length 10, found 14

How to reproduce it

It happens periodically in the pod, without any manual trigger.

Expected behaviour

Screenshots

Fri Jul  5 10:22:14 2024: /etc/falco/rules.d/rules-muon-privileged-containers.yaml: Ok, with warnings
1 Warnings:
In rules content: (/etc/falco/rules.d/rules-muon-privileged-containers.yaml:0:0)
    macro 'user_privileged_containers': (/etc/falco/rules.d/rules-muon-privileged-containers.yaml:0:2)
------
- macro: user_privileged_containers
  ^
------
LOAD_UNUSED_MACRO (Unused macro): Macro not referred to by any other rule/macro
Fri Jul  5 10:22:14 2024: Loading rules from file /etc/falco/rules.d/rules-muon-sensitive-mounts.yaml
Fri Jul  5 10:22:15 2024: /etc/falco/rules.d/rules-muon-sensitive-mounts.yaml: Ok, with warnings
2 Warnings:
In rules content: (/etc/falco/rules.d/rules-muon-privileged-containers.yaml:0:0)
    macro 'user_privileged_containers': (/etc/falco/rules.d/rules-muon-privileged-containers.yaml:0:2)
------
- macro: user_privileged_containers
  ^
------
LOAD_UNUSED_MACRO (Unused macro): Macro not referred to by any other rule/macro
In rules content: (/etc/falco/rules.d/rules-muon-sensitive-mounts.yaml:0:0)
    macro 'user_sensitive_mount_containers': (/etc/falco/rules.d/rules-muon-sensitive-mounts.yaml:0:2)
------
- macro: user_sensitive_mount_containers
  ^
------
LOAD_UNUSED_MACRO (Unused macro): Macro not referred to by any other rule/macro
Fri Jul  5 10:22:15 2024: Hostname value has been overridden via environment variable to: den1-blue-md-0-5e0e300a-8s495
Fri Jul  5 10:22:15 2024: The chosen syscall buffer dimension is: 8388608 bytes (8 MBs)
Fri Jul  5 10:22:15 2024: Loaded event sources: syscall
Fri Jul  5 10:22:15 2024: Enabled event sources: syscall
Fri Jul  5 10:22:15 2024: Opening 'syscall' source with Kernel module

Events detected: 1
Rule counts by severity:
   NOTICE: 1
Triggered rules by rule name:
   Contact K8S API Server From Container: 1
Error: could not parse param 13 (comm) for event 10224646 of type 223 (clone): expected length 10, found 14

Environment

Additional context

NA

Andreagit97 commented 4 months ago

I've tried to reproduce it without success. Looking at the code, it seems we receive a 14-byte charbuf from the kernel, but the length we compute from strnlen on it is 10, so there appears to be a \0 terminator in the middle of the buffer. I simulated this with prctl:

TEST(SyscallExit, clone3X_father)
{
    auto evt_test = get_syscall_event_test(__NR_clone3, EXIT_EVENT);

    evt_test->enable_capture();

    /*=============================== TRIGGER SYSCALL  ===========================*/

    int option = PR_SET_NAME;
    const char arg2[] = "truncated\0comm";
    unsigned long arg3 = 0;
    unsigned long arg4 = 0;
    unsigned long arg5 = 0;

    if(syscall(__NR_prctl, option, arg2, arg3, arg4, arg5) != 0)
    {
        FAIL();
    }

    /* We need to use `SIGCHLD` otherwise the parent won't receive any signal
     * when the child terminates. We use `CLONE_FILES` just to test the flags.
     */
    clone_args cl_args = {};
    cl_args.flags = CLONE_FILES;
    cl_args.exit_signal = SIGCHLD;
    pid_t ret_pid = syscall(__NR_clone3, &cl_args, sizeof(cl_args));

    if(ret_pid == 0)
    {
        /* Child terminates immediately. */
        exit(EXIT_SUCCESS);
    }

    assert_syscall_state(SYSCALL_SUCCESS, "clone3", ret_pid, NOT_EQUAL, -1);
    /* Catch the child before doing anything else. */
    int status = 0;
    int options = 0;
    assert_syscall_state(SYSCALL_SUCCESS, "wait4", syscall(__NR_wait4, ret_pid, &status, options, NULL), NOT_EQUAL,
                 -1);

    /*=============================== TRIGGER SYSCALL  ===========================*/

    evt_test->disable_capture();

    evt_test->assert_event_presence();

    if(HasFatalFailure())
    {
        return;
    }

    evt_test->parse_event();

    evt_test->assert_header();

    /*=============================== ASSERT PARAMETERS  ===========================*/

    /* Parameter 14: comm (type: PT_CHARBUF) */
    evt_test->assert_charbuf_param(14, "truncated");

    /*=============================== ASSERT PARAMETERS  ===========================*/

    evt_test->assert_num_params_pushed(21);
}

but the kernel module correctly returns the "truncated" string, so this does not reproduce the error.

One thing we could do is add a log before the exception that prints the raw bytes, so we can see what is going on. I don't love it, so before going for it I would like to hear whether anyone has other ideas.

LucaGuerra commented 4 months ago

@Andreagit97 I don't love it either, but I also could not reproduce this behavior, so I think some debug logging is in order; see my libs PR above. We also need to make sure that those debug logs are printed in Falco when enabled. The severity level is currently set to DEBUG because the event data may be sensitive, and I don't think we want to print it by default.

LucaGuerra commented 2 months ago

Falco 0.38.2, released today, includes more logs for this kind of issue. To display them, you have to enable libs_logger in your falco.yaml by setting enabled: true (see https://github.com/falcosecurity/falco/blob/master/falco.yaml#L806 ). When that error occurs, Falco will log additional information about which process and content triggered it. Hopefully this lets us better identify the root cause!
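
As a rough sketch, the relevant falco.yaml fragment might look like the following. The `enabled: true` setting comes from the comment above; the `severity` key and its `debug` value are an assumption based on the default config layout and the DEBUG level mentioned earlier in this thread, so check them against the falco.yaml shipped with your version.

```yaml
# Sketch only: enable the libs logger so the extra diagnostics are printed.
libs_logger:
  enabled: true
  # Assumed key; the new logs were described as DEBUG severity.
  severity: debug
```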