iovisor / bcc

BCC - Tools for BPF-based Linux IO analysis, networking, monitoring, and more
Apache License 2.0

"Productionizing" BPF tools #421

Open goldshtn opened 8 years ago

goldshtn commented 8 years ago

To make it easier to use the tools in production, would it be possible to precompile the BPF program and produce something that doesn't have a dependency on clang and ideally on Python either?

4ast commented 8 years ago

Compile the .c into a .o and use an ELF loader. That's how samples/bpf/ are written and it's the approach taken by perf+bpf. But it's more difficult in production when servers run different kernel versions, and precompiling programs that walk kernel data structures is not possible.

drzaeus77 commented 8 years ago

Definitely agree. The use cases for bcc and perf are different, and I think the different approaches are both fine. If compiling statically/off-box, it needs to happen in the kernel tree, and be versioned as a binary along with the kernel as it moves. Effectively it is a .ko. There is some work to be done to make bcc more portable, and @4ast has been bugging me to make some changes for his use case, but to work with kernel headers as the moving target a C compiler needs to be involved somehow.

Let me ask, what is the particular problem with the clang toolchain that we should focus on addressing or eliminating?

  1. Size of libbcc.so
  2. Existence of libbcc.so and not libbcc.a
  3. Dependency on /lib/modules/$(uname -r)/build
  4. "Security concerns" (quotes because everyone defines security differently, please indicate your definition)
  5. ...

I think all of the above have been raised before, but I don't have a sense for which ones are perceived negatives only and which are real showstoppers. It would be good to get some feedback on those.

goldshtn commented 8 years ago

The reason I'm bringing this up is not because of some size or security issue with the build tool chain. It's purely a matter of convenience. I wouldn't want clang and the BPF Python module as a dependency I need to put in my production environment. It is very likely that all servers are running the same kernel version, so I would expect to be able to compile a tool once on my dev box that has the same kernel version, and then use it in binary form on all production servers that have the same kernel version.
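
The "compile once, run on servers with the same kernel version" idea depends on that assumption actually holding on each target. A minimal sketch of a guard that a packaged tool could run before loading anything precompiled; `kernel_matches` and the recorded `BUILT_FOR` value are hypothetical names, not part of bcc:

```python
import platform


def kernel_matches(built_for: str, running: str = "") -> bool:
    """Return True when the running kernel release equals the build-time one."""
    running = running or platform.release()
    return running == built_for


if __name__ == "__main__":
    # BUILT_FOR is a hypothetical value that would be recorded on the dev box
    # at build time; here it is taken from the current host for demonstration.
    BUILT_FOR = platform.release()
    print("safe to load precompiled tool:", kernel_matches(BUILT_FOR))
```

Comparing the full release string is deliberately strict: a distribution patch release can change kernel structure layouts even when the upstream version number is the same.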

I believe SystemTap has a similar feature where they compile a .ko that you can put on production servers that don't have kernel headers, sources, or debug info.

drzaeus77 commented 8 years ago

This is achievable. Keep in mind that there is already no dependency on clang being on the target machine! It is statically linked inside libbcc.so (see ldd libbcc.so). It should be possible to create a libbcc.a that can also be statically linked itself.

Similarly, for an individual tool, there should be ways to package up the python dependency in a portable way. For instance, see https://docs.python.org/2/whatsnew/2.6.html#other-language-changes, notably the executable 'Directories and zip archives'. Alternatively, any tool could be built directly against the C/C++ api and built as its own binary as opposed to python.
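
On current Python versions, the "executable zip archive" idea is exposed through the standard-library `zipapp` module. A minimal sketch, assuming a tool directory with a `__main__.py`; the directory name `mytool` and the helper `build_tool_archive` are made up for illustration, and the stand-in script does not import bcc:

```python
import pathlib
import subprocess
import sys
import tempfile
import zipapp


def build_tool_archive(source_dir: str, target: str) -> str:
    """Pack a directory containing __main__.py into one runnable .pyz file."""
    zipapp.create_archive(source_dir, target)
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = pathlib.Path(tmp) / "mytool"  # hypothetical tool directory
        src.mkdir()
        # Stand-in for a real bcc script, which would import BPF from bcc here.
        (src / "__main__.py").write_text('print("tool started")\n')
        archive = build_tool_archive(str(src), str(pathlib.Path(tmp) / "mytool.pyz"))
        # The archive runs like a script: python mytool.pyz
        subprocess.run([sys.executable, archive], check=True)
```

Only pure-Python code can be imported from inside such an archive. A shared library such as libbcc.so has to exist as a regular file on disk to be loaded, so it would still ship alongside the archive.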

For kernel headers, I'm thinking at runtime that there could be 3 tries:

  1. look for (or provide as an argument) statically packaged precompiled headers (.pch) for the running kernel
  2. look for (or provide as an argument) .pch in a well-known location
  3. look for regular headers in the standard location
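
The three tries above can be sketched as a simple fallback chain. This is only an illustration of the proposed lookup order, not existing bcc behavior; the function name, the `/usr/lib/bcc/pch` directory and the `<release>.pch` naming are all assumptions:

```python
import os
import platform
from typing import Optional, Sequence, Tuple


def find_kernel_headers(
    release: str,
    bundled_pch: Optional[str] = None,
    pch_dirs: Sequence[str] = ("/usr/lib/bcc/pch",),  # hypothetical well-known location
    modules_root: str = "/lib/modules",
) -> Optional[Tuple[str, str]]:
    """Return (kind, path) for the first header source that exists, else None."""
    # 1. A .pch packaged with the tool or passed as an argument.
    if bundled_pch and os.path.isfile(bundled_pch):
        return ("pch", bundled_pch)
    # 2. A .pch for this kernel release in a well-known directory.
    for directory in pch_dirs:
        candidate = os.path.join(directory, release + ".pch")
        if os.path.isfile(candidate):
            return ("pch", candidate)
    # 3. Regular headers in the standard location.
    build_dir = os.path.join(modules_root, release, "build")
    if os.path.isdir(build_dir):
        return ("headers", build_dir)
    return None


if __name__ == "__main__":
    print(find_kernel_headers(platform.release()))
```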

I know this is going in a direction different than you suggested, but does it address some of your concerns? Does it still feel wrong?

goldshtn commented 8 years ago

Oh, I hadn't realized clang wasn't required at runtime. That's already a great deal better than I thought.

Re libbcc.so and the Python dependencies, I suppose everything can be placed in a single directory, which is reasonable.

But still the reliance on kernel headers is something I would rather avoid. Does the BPF Python module offer a way to precompile and then load from an .o the BPF program?

drzaeus77 commented 8 years ago

There is no .o output format; again, that gets back to @4ast's original comment. The approach is somewhat incompatible with bcc.

The technical reason is that the file descriptors returned by the bpf() syscall when opening maps need to be kept in process and converted to literal values in the JITed code, which happens before finalizing the LLVM IR.
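
To make the fd problem concrete: in the BPF instruction encoding, a map reference is a two-slot 64-bit immediate load whose `src_reg` is `BPF_PSEUDO_MAP_FD` and whose immediate is the map's file descriptor. The sketch below only packs such an instruction by hand to show where the fd ends up. It is not how bcc emits code, `ld_map_fd` is a made-up helper, and it assumes a little-endian host:

```python
import struct

BPF_LD_IMM64 = 0x18    # BPF_LD | BPF_DW | BPF_IMM opcode
BPF_PSEUDO_MAP_FD = 1  # src_reg marker meaning "imm holds a map file descriptor"


def ld_map_fd(dst_reg: int, map_fd: int) -> bytes:
    """Encode the two-slot 'load map fd' instruction (little-endian layout)."""
    # src_reg sits in the high nibble and dst_reg in the low nibble.
    regs = (BPF_PSEUDO_MAP_FD << 4) | dst_reg
    first = struct.pack("<BBhi", BPF_LD_IMM64, regs, 0, map_fd)
    # The second slot carries the upper 32 bits of the immediate (zero here).
    second = struct.pack("<BBhi", 0, 0, 0, 0)
    return first + second


if __name__ == "__main__":
    # 5 is an arbitrary example descriptor; a real one is returned by the
    # bpf() syscall when the map is created in the loading process.
    print(ld_map_fd(dst_reg=1, map_fd=5).hex())
```

Because the descriptor is only meaningful inside the process that created the map, bytes like these cannot simply be written to a file and reused elsewhere. ELF-based loaders handle this by recording a relocation and patching in the fd at load time.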

In addition, some of the python features depend on knowledge of the BPF_TABLE key/value struct layouts (when they're not simple POD types), and that is currently parsed by the clang rewriter in b_frontend_action.cc. Extracting the same info from a .o would probably need work from someone who knows a bit about dwarf formats, which I haven't touched yet.
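
As an illustration of why the layout matters, the Python side can only decode a non-trivial key or value if it knows the field names, types and offsets. A sketch using `ctypes`, with a hypothetical key of `struct { u32 pid; char comm[16]; }`; this is not the code bcc generates:

```python
import ctypes


class Key(ctypes.Structure):
    """Python mirror of a hypothetical C key: struct { u32 pid; char comm[16]; }."""
    _fields_ = [
        ("pid", ctypes.c_uint32),
        ("comm", ctypes.c_char * 16),
    ]


def decode_key(raw: bytes) -> Key:
    """Interpret raw map-key bytes using the known field layout."""
    return Key.from_buffer_copy(raw)


if __name__ == "__main__":
    raw = bytes(Key(1234, b"bash"))  # stand-in for bytes read back from a map
    key = decode_key(raw)
    print(key.pid, key.comm.decode())
```

Without the layout, the same bytes are just an opaque blob of known size, which is why the field information would have to be recovered from debug info if the program were loaded from a .o.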

Of course, in software anything is possible, so I can't say that we'd never support such features on .o files, but the effort needs to be justified.

I think the PCH approach would achieve what you're looking for, but feel free to remain skeptical until we have a prototype and you can test it.

goldshtn commented 8 years ago

Sure, I understand. Just to make sure - all I would need to run a BPF script on a production system are the following:

  1. The BPF Python module
  2. libbcc.so
  3. Kernel headers (but no sources or debug info)

Am I missing something?

drzaeus77 commented 8 years ago

What you say is exactly the case today.
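
A quick way to check a production host against the three requirements confirmed above, as a sketch using only the standard library. The function name and the dictionary keys are invented for this example; the headers path is the /lib/modules/$(uname -r)/build location mentioned earlier in the thread:

```python
import ctypes.util
import importlib.util
import os
import platform


def runtime_prerequisites(release: str = "") -> dict:
    """Report which of the three runtime requirements are present on this host."""
    release = release or platform.release()
    return {
        # 1. The BPF Python module (importable as "bcc").
        "python_module": importlib.util.find_spec("bcc") is not None,
        # 2. libbcc.so somewhere on the default library search path.
        "libbcc": ctypes.util.find_library("bcc") is not None,
        # 3. Kernel headers for the given release.
        "kernel_headers": os.path.isdir(os.path.join("/lib/modules", release, "build")),
    }


if __name__ == "__main__":
    for name, present in runtime_prerequisites().items():
        print(name, "ok" if present else "MISSING")
```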

goldshtn commented 8 years ago

Thanks! Do you have any timeline for the pch support you mentioned? :-)

alban commented 7 years ago

I am interested in this as well.

Size of libbcc.so

I tried to include bcc in a Docker image to be used by a Go application. The image size grew from 82 MB to 382 MB (uncompressed, as reported by the "docker images" command). Part of that is due to the size of libbcc.so (37 MB), but also to switching to an Ubuntu-based Docker image (FROM zlim/bcc in the Dockerfile) instead of the previously used smaller image alpine:3.3. It would be nice to have builds of bcc in Alpine Linux; that would help with the size. I reported alpine#6426 for that, but it is not trivial because elfutils and maybe others are not packaged in Alpine (alpine#3610). Meanwhile, @iaguis tried to flatten the Docker image and remove some files (dpkg cache...) and got it down to 202 MB, which is a bit better.

Dependency on /lib/modules/$(uname -r)/build

I'm also interested in that, either through pre-compiled bpf bytecode (#534) or with the PCH approach.

/cc @2opremio @iaguis

justincormack commented 7 years ago

@alban I have now built bcc for Alpine, and will work on upstreaming it, and also building a slimmer Docker container with just what is needed at runtime to deploy a bcc program. Ping me if you want more details (the Weave people have my contact details).

gmile commented 6 years ago

@justincormack have you ever proceeded with upstreaming bcc for Alpine? If so, can you please point to the work you've done?

gmile commented 6 years ago

I'm trying to compile bcc under Alpine in docker, this is my Dockerfile:

FROM alpine:3.7

RUN apk add --update git clang llvm cmake flex bison luajit build-base

RUN git clone https://github.com/iovisor/bcc.git

WORKDIR /bcc/build

I have the following versions of packages:

package versions

```
(1/46) Installing m4 (1.4.18-r0)
(2/46) Installing bison (3.0.4-r0)
(3/46) Installing binutils-libs (2.28-r3)
(4/46) Installing binutils (2.28-r3)
(5/46) Installing gmp (6.1.2-r1)
(6/46) Installing isl (0.18-r0)
(7/46) Installing libgomp (6.4.0-r5)
(8/46) Installing libatomic (6.4.0-r5)
(9/46) Installing pkgconf (1.3.10-r0)
(10/46) Installing libgcc (6.4.0-r5)
(11/46) Installing mpfr3 (3.1.5-r1)
(12/46) Installing mpc1 (1.0.3-r1)
(13/46) Installing libstdc++ (6.4.0-r5)
(14/46) Installing gcc (6.4.0-r5)
(15/46) Installing musl-dev (1.1.18-r2)
(16/46) Installing libc-dev (0.7.1-r0)
(17/46) Installing g++ (6.4.0-r5)
(18/46) Installing make (4.2.1-r0)
(19/46) Installing fortify-headers (0.9-r0)
(20/46) Installing build-base (0.5-r0)
(21/46) Installing clang-libs (5.0.0-r0)
(22/46) Installing libxml2 (2.9.7-r0)
(23/46) Installing clang (5.0.0-r0)
(24/46) Installing libattr (2.4.47-r6)
(25/46) Installing libacl (2.2.52-r3)
(26/46) Installing libbz2 (1.0.6-r6)
(27/46) Installing lz4-libs (1.8.0-r1)
(28/46) Installing xz-libs (5.2.3-r1)
(29/46) Installing libarchive (3.3.2-r2)
(30/46) Installing ca-certificates (20171114-r0)
(31/46) Installing libssh2 (1.8.0-r2)
(32/46) Installing libcurl (7.57.0-r0)
(33/46) Installing expat (2.2.5-r0)
(34/46) Installing ncurses-terminfo-base (6.0_p20171125-r0)
(35/46) Installing ncurses-terminfo (6.0_p20171125-r0)
(36/46) Installing ncurses-libs (6.0_p20171125-r0)
(37/46) Installing rhash-libs (1.3.5-r1)
(38/46) Installing libuv (1.17.0-r0)
(39/46) Installing cmake (3.9.5-r0)
(40/46) Installing flex (2.6.4-r1)
(41/46) Installing pcre2 (10.30-r0)
(42/46) Installing git (2.15.0-r1)
(43/46) Installing libffi (3.2.1-r4)
(44/46) Installing llvm5-libs (5.0.0-r0)
(45/46) Installing llvm5 (5.0.0-r0)
(46/46) Installing luajit (2.1.0_beta3-r0)
```

But the compilation process fails:

# cmake .. -DCMAKE_INSTALL_PREFIX=/usr
-- The C compiler identification is GNU 6.4.0
-- The CXX compiler identification is GNU 6.4.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Latest recognized Git tag is v0.5.0
-- Git HEAD is 3f39bc14457261c87e821983fe154ec2fe6f116b
-- Revision is 0.5.0-3f39bc14
-- Performing Test HAVE_NO_PIE_FLAG
-- Performing Test HAVE_NO_PIE_FLAG - Success
-- Found BISON: /usr/bin/bison (found version "3.0.4")
-- Found FLEX: /usr/bin/flex (found version "2.6.4")
CMake Error at CMakeLists.txt:28 (find_package):
  Could not find a package configuration file provided by "LLVM" with any of
  the following names:

    LLVMConfig.cmake
    llvm-config.cmake

  Add the installation prefix of "LLVM" to CMAKE_PREFIX_PATH or set
  "LLVM_DIR" to a directory containing one of the above files.  If "LLVM"
  provides a separate development package or SDK, be sure it has been
  installed.

-- Configuring incomplete, errors occurred!
See also "/bcc/build/CMakeFiles/CMakeOutput.log".

Any idea how to get past the 'Could not find a package configuration file provided by "LLVM"' problem?
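
The error text itself names one way out: point `LLVM_DIR` at a directory containing one of the two config files. A sketch for locating such a directory, assuming the files may live somewhere under the listed roots; the roots and the helper `find_llvm_dir` are guesses for illustration. If nothing is found, the development package that provides these files is probably not installed:

```python
import os
from typing import Iterable, Optional

# File names taken from the CMake error message above.
CONFIG_NAMES = ("LLVMConfig.cmake", "llvm-config.cmake")


def find_llvm_dir(
    roots: Iterable[str] = ("/usr/lib", "/usr/lib64", "/usr/share"),
) -> Optional[str]:
    """Return the first directory under roots that holds an LLVM CMake config file."""
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            if any(name in filenames for name in CONFIG_NAMES):
                return dirpath
    return None


if __name__ == "__main__":
    llvm_dir = find_llvm_dir()
    if llvm_dir:
        print("cmake .. -DCMAKE_INSTALL_PREFIX=/usr -DLLVM_DIR=" + llvm_dir)
    else:
        print("no LLVM CMake config found; a development package may be missing")
```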

yonghong-song commented 6 years ago

See INSTALL.md; you need the llvm/clang development packages (the command below is for Fedora systems):

sudo dnf install -y clang clang-devel llvm llvm-devel llvm-static ncurses-devel

gmile commented 6 years ago

@yonghong-song thank you, I was able to proceed and fix the error I mentioned. Now I'm facing a different one, this time from actually compiling:

[ 24%] Building CXX object src/cc/api/CMakeFiles/api-static.dir/BPF.cc.o
In file included from /bcc/src/cc/usdt.h:23:0,
                 from /bcc/src/cc/api/BPF.cc:35:
/bcc/src/cc/ns_guard.h:35:3: error: 'ino_t' does not name a type
   ino_t target_ino() const { return target_ino_; }
   ^~~~~
/bcc/src/cc/ns_guard.h:40:3: error: 'ino_t' does not name a type
   ino_t target_ino_;
   ^~~~~
In file included from /bcc/src/cc/api/BPF.cc:35:0:
/bcc/src/cc/usdt.h: In member function 'ino_t USDT::Context::inode() const':
/bcc/src/cc/usdt.h:259:52: error: 'class ProcMountNS' has no member named 'target_ino'; did you mean 'target_fd_'?
   ino_t inode() const { return mount_ns_instance_->target_ino(); }
                                                    ^~~~~~~~~~
make[2]: *** [src/cc/api/CMakeFiles/api-static.dir/build.make:63: src/cc/api/CMakeFiles/api-static.dir/BPF.cc.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:334: src/cc/api/CMakeFiles/api-static.dir/all] Error 2
make: *** [Makefile:141: all] Error 2

Any idea how to overcome this one?

yonghong-song commented 6 years ago

ino_t should be defined in /usr/include/sys/types.h. Maybe your system is older and does not have a recent /usr/include?

gmile commented 6 years ago

@yonghong-song running cat /usr/include/sys/types.h | grep ino_t reveals the following:

#define __NEED_ino_t
#define ino64_t ino_t

The grep output is the same on kernel versions 4.4 and 4.13.

gmile commented 6 years ago

@yonghong-song I was able to fix this by adding #include <sys/types.h> to src/cc/ns_guard.h.

gmile commented 6 years ago

So I was able to build & install bcc on Alpine; however, I'm facing some difficulties running the tools, which seem to be Alpine-specific.

I'll put my findings in https://bugs.alpinelinux.org/issues/6426, mentioned by @alban.

P.S. The only other issue I had with building bcc was RTLD_DI_ORIGIN in tests/cc/test_c_api.cc. For some reason RTLD_DI_ORIGIN is not defined in /usr/include/dlfcn.h on Alpine. I commented out that section to work around the problem:

diff --git a/tests/cc/test_c_api.cc b/tests/cc/test_c_api.cc
index 7e5859f..0511448 100644
--- a/tests/cc/test_c_api.cc
+++ b/tests/cc/test_c_api.cc
@@ -149,10 +149,10 @@ static int mntns_func(void *arg) {
     return -1;
   }

-  if (dlinfo(dlhdl, RTLD_DI_ORIGIN, &libpath) < 0) {
-    fprintf(stderr, "Unable to find origin of libz.so.1: %s\n", dlerror());
-    return -1;
-  }
+  // if (dlinfo(dlhdl, RTLD_DI_ORIGIN, &libpath) < 0) {
+  //  fprintf(stderr, "Unable to find origin of libz.so.1: %s\n", dlerror());
+  //  return -1;
+  // }

   dlclose(dlhdl);
   dlhdl = NULL;

ajor commented 6 years ago

I'm also hitting the ino_t issue in alpine - would it be possible to get your patches merged into BCC @gmile ?

gmile commented 6 years ago

@ajor opened a PR here – https://github.com/iovisor/bcc/pull/1598