Closed LucaGuerra closed 1 year ago
I thought about how to improve our documentation in this regard. I have a general idea of what to document to be useful for adopters and users, based on what I would like to know as a power user or contributor.
In libscap we have a concept of "scap engines" (or scap_vtable), which is a common interface that all syscall sources use (kernel module, eBPF, modern eBPF, gVisor ...). For a contributor, I think it would be very useful to understand how this mechanism works, because they may want to implement more ways to collect data from a running kernel or to understand the existing ones. As an example, a microVM expert might know how to efficiently get syscall data out of those lightweight systems, and if they wish to contribute to Falco, they would have documentation that explains how to do so. For a power user, this would serve as a guide to exactly what happens when the driver is initialized and when it starts collecting events, because all of those steps go through the same interface.
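To make the "common interface" idea concrete, here is a minimal C sketch of the general vtable pattern. It is not the actual scap_vtable definition: every name in it (`demo_engine_vtable`, `demo_event`, `counter_engine`, `drain_engine`) is hypothetical and made up for illustration, and the real interface in libscap has different fields and is still evolving. The only point is that the consumer calls init, start, next and close through function pointers and never needs to know which source sits behind them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event record; NOT the real libscap event layout. */
struct demo_event {
    uint64_t ts_ns; /* synthetic timestamp */
    uint16_t type;  /* synthetic event type */
};

/* Hypothetical engine interface (assumption: illustrative only,
 * not the real scap_vtable). Each data source fills in these pointers. */
struct demo_engine_vtable {
    const char *name;
    int (*init)(void *state);                         /* 0 on success */
    int (*start_capture)(void *state);                /* 0 on success */
    int (*next)(void *state, struct demo_event *out); /* 1 = event, 0 = none */
    void (*close)(void *state);
};

/* A toy "engine" that emits three synthetic events and then runs dry. */
struct counter_state {
    int remaining;
    int capturing;
};

static int counter_init(void *s) {
    struct counter_state *st = s;
    st->remaining = 3;
    st->capturing = 0;
    return 0;
}

static int counter_start(void *s) {
    struct counter_state *st = s;
    st->capturing = 1;
    return 0;
}

static int counter_next(void *s, struct demo_event *out) {
    struct counter_state *st = s;
    if (!st->capturing || st->remaining == 0) {
        return 0; /* nothing to deliver */
    }
    out->ts_ns = (uint64_t)(3 - st->remaining);
    out->type = 1;
    st->remaining--;
    return 1;
}

static void counter_close(void *s) {
    struct counter_state *st = s;
    st->capturing = 0;
}

static const struct demo_engine_vtable counter_engine = {
    "counter", counter_init, counter_start, counter_next, counter_close
};

/* Generic consumer: works with any engine that fills in the vtable.
 * Returns the number of events read, or -1 if setup fails. */
static int drain_engine(const struct demo_engine_vtable *engine, void *state) {
    struct demo_event ev;
    int count = 0;

    assert(engine != NULL);
    if (engine->init(state) != 0) {
        return -1;
    }
    if (engine->start_capture(state) != 0) {
        engine->close(state);
        return -1;
    }
    while (engine->next(state, &ev) == 1) {
        count++;
    }
    engine->close(state);
    return count;
}
```

In this sketch, a new data source would only need to provide its own state struct and its own set of functions; `drain_engine` would stay unchanged.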
On the other hand, I thought about documenting the actual low-level communication between the kernel and userspace (ioctls and maps), and I couldn't find a real use for it. It would be very hard to keep updated and maintained for all the syscall sources. In addition, the drivers and engines aren't really designed to work standalone: they only work properly as part of libscap, even if in two cases (classic eBPF and kmod) they are distributed as separate files for convenience. However, users should be able to find documentation about what the version numbers mean, because they are output by Falco and most likely by any other tool that uses the library. Users should also be able to identify where the actual boundary between user space and kernel is located in Falco.
Also, our little tool scap_open could be very interesting for contributors and adopters because:
I was able to discuss the matter with @leogr.
The concept of scap engine (scap_vtable) is not a public API but rather something that we're still evolving. While it is documented in the code, there is a high risk of that documentation becoming stale quickly, and we don't want to mislead our adopters and contributors.
On the other hand, we still want to make it clearer what kind of data is exchanged between the kernel and userspace. To respond to users' and adopters' needs, we identified the following gaps in the documentation, given that the list of events is already documented (https://falco.org/docs/reference/rules/supported-events/).
I am working on this, and the right place for this info is the website, so I transferred the issue there.
What to document
The Falco review from the CNCF TOC highlighted the need to better document how the data flows between user space and kernel space. See Emily's comment:
Essentially, some parts of the protocol are implementation details, while other parts can help adopters and contributors understand what kind of information is exchanged. I feel like we already have some of this information in our public reference, but our adopters are very technical and need to learn more in order to make informed decisions. Better documentation would also give contributors willing to extend the system a less steep learning curve.