cilium / tetragon

eBPF-based Security Observability and Runtime Enforcement
https://tetragon.io
Apache License 2.0

Tetragon based file integrity monitoring (FIM) #2409

Open anfedotoff opened 2 months ago

anfedotoff commented 2 months ago

Describe the feature you would like

We could use Tetragon for file integrity monitoring: collect hashes of executed binaries and opened files and include them in events. The hashes would come from the Linux IMA (Integrity Measurement Architecture) measurement subsystem.

Describe your proposed solution

We have already talked about FIM. I found some technical issues during my research, so I decided to write a CFP before opening a PR.

anfedotoff commented 2 months ago

Hi :wave:, @kkourt! If you have time, please have a look. I'd be happy to discuss the implementation details.

xmulligan commented 2 months ago

Once it is ready, please also add the CfP to the repo https://github.com/cilium/design-cfps

kkourt commented 1 month ago

Thanks @anfedotoff!

Here are some first thoughts:

Considering your proposal:

spec:
  lsm:
  - call: "bprm_check_security"
    args:
    - index: 0
      type: "linux_binprm" # file type also is allowed
    selectors:
      - matchArgs:
          - index: 0
            operator: "Prefix"
            values:
              - "/usr/bin"
      - matchActions:
          - action: FileHash
            argHash: 0

In the BPF code, what we do is:

For the linux_binprm type, we first copy the path: https://github.com/cilium/tetragon/blob/82c4b1379481e6c0a1b27f9835133c9ce2ed32f9/bpf/process/types/basic.h#L2591-L2598

We then filter: https://github.com/cilium/tetragon/blob/82c4b1379481e6c0a1b27f9835133c9ce2ed32f9/bpf/process/types/basic.h#L1815-L1820

And finally do the action: https://github.com/cilium/tetragon/blob/82c4b1379481e6c0a1b27f9835133c9ce2ed32f9/bpf/process/types/basic.h#L2358

So by the time we reach the action, we only have the path string, not the file or inode object that the IMA helpers take as input, so we cannot get the hash. Hence, I believe we need to get the hash at the first step.

So I was thinking something like:

spec:
  lsm:
  - call: "bprm_check_security"
    args:
    - index: 0
      type: "linux_binprm" # file type also is allowed
    - index: 1 # argument 1 will be the result of applying operation ima_file_hash() to argument index 0
      type: "hash"
      sourceIndex: 0
      operator: "ima_file_hash"
    selectors:
      - matchArgs:
          - index: 0
            operator: "Prefix"
            values:
              - "/usr/bin"

I'm still not sure about the syntax, but the basic idea would be to push the computation of the hash early, when we extract the arguments.
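To make the ordering concrete, here is a minimal userspace sketch in Go of "hash at extraction time". It is not Tetragon code: the `file` and `extracted` types and the `extract` and `process` functions are hypothetical, and SHA-256 over an in-memory buffer merely stands in for whatever the IMA helper would return.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// file is a hypothetical stand-in for the kernel object behind argument 0.
type file struct {
	path    string
	content []byte
}

// extracted holds the plain values that the filter and action stages see.
type extracted struct {
	path string
	hash string
}

// extract copies the path and computes the hash while the object is still at hand.
func extract(f file) extracted {
	sum := sha256.Sum256(f.content) // stand-in for the IMA helper call
	return extracted{path: f.path, hash: hex.EncodeToString(sum[:])}
}

// process runs extraction first, then the Prefix filter.
// ok is false when the event is dropped by the filter.
func process(f file, prefix string) (e extracted, ok bool) {
	e = extract(f)
	if !strings.HasPrefix(e.path, prefix) {
		return extracted{}, false
	}
	return e, true
}

func main() {
	e, ok := process(file{path: "/usr/bin/ls", content: []byte("demo")}, "/usr/bin")
	fmt.Println(ok, e.path, len(e.hash))
}
```

Note that in this ordering the hash is computed even for events that the filter later drops.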

anfedotoff commented 1 month ago

LGTM! We're still able to filter by file path before collecting a hash in your approach, right? In other words, I mean not calling the IMA BPF helpers if the filter does not match.

As far as I know, the IMA BPF helpers just retrieve the hash from the IMA measurement list. The difference between bpf_ima_inode_hash (5.15) and bpf_ima_file_hash (5.18) is that, if there is no hash in the IMA measurement list, bpf_ima_file_hash will calculate the hash, update the IMA measurement list, and return it to the caller.

> operator: "ima_file_hash"

Do you mean calling the appropriate BPF helper according to the kernel version, or does the user specify the helper they prefer? I think the first option is better.

kkourt commented 1 month ago

> LGTM! We're still able to filter by file path before collecting a hash in your approach, right? In other words, I mean not calling the IMA BPF helpers if the filter does not match.

I think it should be possible to collect the hash after the filtering, but it's trickier. In that case, collecting the hash in the action makes more sense to me, but we will need to keep the arguments needed to call the helpers available until then.

> As far as I know, the IMA BPF helpers just retrieve the hash from the IMA measurement list. The difference between bpf_ima_inode_hash (5.15) and bpf_ima_file_hash (5.18) is that, if there is no hash in the IMA measurement list, bpf_ima_file_hash will calculate the hash, update the IMA measurement list, and return it to the caller.

> operator: "ima_file_hash"

> Do you mean calling the appropriate BPF helper according to the kernel version, or does the user specify the helper they prefer? I think the first option is better.

I would do the simple thing first, allowing users to specify exactly what they want. We can add a detection function to reject the policy if the helper does not exist.

anfedotoff commented 1 month ago

> I think it should be possible to collect the hash after the filtering, but it's trickier. In that case, collecting the hash in the action makes more sense to me, but we will need to keep the arguments needed to call the helpers available until then.

Ah, I understand. Before filtering on the arguments, we need to retrieve all of them. I think we can try to implement your approach. To get the hash in an action, we would need to store the arguments for the BPF helpers somewhere (presumably in a separate BPF map). So, for now, using actions looks more complicated to me :)

> I would do the simple thing first, allowing users to specify exactly what they want. We can add a detection function to reject the policy if the helper does not exist.

That makes sense. I'll take some time to learn more about how to validate a tracing policy for correctness.
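For comparison, the action-based alternative discussed above (stash the helper arguments, filter, then hash only on a match) could be modelled in Go roughly as follows. This is a userspace toy, not BPF: the `pipeline` type, its `stash` map (standing in for a separate BPF map), and the `handle` method are all hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// file is a hypothetical stand-in for the kernel object the helper needs.
type file struct {
	path    string
	content []byte
}

// pipeline keeps helper arguments around until the action stage.
type pipeline struct {
	stash     map[uint64]*file // stand-in for a separate BPF map
	hashCalls int              // how often the (expensive) hash ran
}

// handle stashes the argument, filters by prefix, and hashes only on a match.
func (p *pipeline) handle(id uint64, f *file, prefix string) (string, bool) {
	p.stash[id] = f           // extraction: remember the helper argument
	defer delete(p.stash, id) // always clean up the stash entry

	if !strings.HasPrefix(f.path, prefix) {
		return "", false // filtered out: no hash computed
	}

	saved := p.stash[id] // action: fetch the stashed argument
	p.hashCalls++
	sum := sha256.Sum256(saved.content) // stand-in for the IMA helper call
	return hex.EncodeToString(sum[:]), true
}

func main() {
	p := &pipeline{stash: map[uint64]*file{}}
	p.handle(1, &file{path: "/usr/bin/ls", content: []byte("demo")}, "/usr/bin")
	p.handle(2, &file{path: "/tmp/x", content: []byte("demo")}, "/usr/bin")
	fmt.Println("hash computations:", p.hashCalls)
}
```

The extra bookkeeping (inserting into and cleaning up the stash) is the complexity that makes the action-based route less attractive as a first step.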

kkourt commented 1 month ago

> That makes sense. I'll take some time to learn more about how to validate a tracing policy for correctness.

Here's an example of checking whether the "multi kprobe" feature is supported: https://github.com/cilium/tetragon/blob/e7c9ec3533cd8faf2ce1b2c38aaabd70402ad930/pkg/bpf/detect.go#L45

What we can do then is check whether a specific feature is supported iff it's used by a tracing policy. See for example: https://github.com/cilium/tetragon/blob/e7c9ec3533cd8faf2ce1b2c38aaabd70402ad930/pkg/sensors/tracing/enforcer.go#L283