iovisor / bcc

BCC - Tools for BPF-based Linux IO analysis, networking, monitoring, and more
Apache License 2.0

support load objects #2564

Open yonghong-song opened 5 years ago

yonghong-song commented 5 years ago

This issue is created to start a discussion about adding support for loading objects, instead of source code, in bcc. The motivation is mostly embedded systems, where runtime resources are constrained. It may also be useful on servers where you do not want bcc JIT compilation to impact your normal workloads.

The current bcc compilation and program loading steps:

  1. given C code, the rewriter transforms kernel data structure accesses into bpf_probe_read calls, and converts bcc-specific constructs like map.lookup(&key) into the proper BPF helper call bpf_map_lookup_elem(pseudo_fd, &key), where pseudo_fd will later be replaced with the real fd.
  2. actual compilation
  3. do actual map creation and replace pseudo_fd with the real map fd.
  4. program load

Since map creation does not happen before program compilation, it is possible to use native clang compilation to generate a .o file, feed it into bcc, and continue with steps 3 and 4. The main point of care is establishing a proper relation between each pseudo_fd and its map variable, so that the pseudo_fd can be replaced correctly later on.
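
To make steps 3 and 4 concrete, here is a minimal sketch of the pseudo_fd replacement. It is not bcc's actual code: the function names, the `create_map` callback, and the pseudo fd value -1 are all hypothetical, and it assumes the little-endian `struct bpf_insn` layout, where a map reference is an ld_imm64 instruction (opcode 0x18) whose src_reg is BPF_PSEUDO_MAP_FD (1) and whose imm holds the fd.

```python
import struct

BPF_LD_IMM64 = 0x18      # BPF_LD | BPF_IMM | BPF_DW
BPF_PSEUDO_MAP_FD = 1    # src_reg value meaning "imm holds a map fd"

# One 8-byte instruction slot on a little-endian host:
# code (u8), regs (u8: dst_reg in the low nibble, src_reg in the high nibble),
# off (s16), imm (s32).
INSN = struct.Struct("<BBhi")


def patch_map_fds(prog, pseudo_to_real):
    """Return a copy of prog with every pseudo map fd replaced by its real fd."""
    out = bytearray(prog)
    for pos in range(0, len(prog), INSN.size):
        code, regs, off, imm = INSN.unpack_from(prog, pos)
        # The second slot of an ld_imm64 has code 0, so it is skipped here.
        if code == BPF_LD_IMM64 and (regs >> 4) == BPF_PSEUDO_MAP_FD:
            INSN.pack_into(out, pos, code, regs, off, pseudo_to_real[imm])
    return bytes(out)


def create_maps_and_patch(prog, pseudo_to_name, create_map):
    """Step 3: create each map, then swap pseudo fds for the real ones.

    pseudo_to_name is the pseudo_fd -> map variable relation that has to be
    carried alongside the object file; create_map is a hypothetical callback
    that creates the named map and returns its fd.
    """
    real = {pseudo: create_map(name) for pseudo, name in pseudo_to_name.items()}
    return patch_map_fds(prog, real)
```

For example, `create_maps_and_patch(prog, {-1: "counts"}, my_create_map)` would rewrite every reference to pseudo fd -1 with whatever fd `my_create_map("counts")` returns, after which the program bytes are ready for step 4.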

I did a proper prototype some time back but I lost the prototype. Feel free to explore this.

Native clang compilation will inevitably raise the question of how to standardize the C code. One option is to conform to current bcc code, with bcc helpers.h macros, bcc-style section names, etc. Another option is to conform to what libbpf (a submodule of bcc) supports, which will require some bcc internal changes.

yonghong-song commented 5 years ago

cc @yzhao1012

yonghong-song commented 5 years ago

When we do object loading, we can leverage CO-RE (compile once, run everywhere), which is supported in the compiler (LLVM 10, not released yet) and in libbpf, so the binary can be portable across different kernel versions. Below is a link to @anakryiko 's presentation at LPC2019: https://linuxplumbersconf.org/event/4/contributions/448/attachments/345/575/bpf-usability.pdf

yonghong-song commented 4 years ago

The following code is a hack to dump an object file in bcc:

-bash-4.4$ git show
commit 77dd4bfdc615a555d805a612ec56f8eeb1f8516e (HEAD -> emit-to-file)
Author: Yonghong Song <yhs@fb.com>
Date:   Sat Jan 25 22:47:55 2020 -0800

    hack to compile into a file

    An example to use addPassesToEmitFile() is:
      https://llvm.org/docs/tutorial/MyFirstLanguageFrontend/LangImpl08.html

    Signed-off-by: Yonghong Song <yhs@fb.com>

diff --git a/src/cc/bpf_module.cc b/src/cc/bpf_module.cc
index 3df7ef18..64d7b64d 100644
--- a/src/cc/bpf_module.cc
+++ b/src/cc/bpf_module.cc
@@ -35,6 +35,14 @@
 #include <llvm/Transforms/IPO/PassManagerBuilder.h>
 #include <llvm-c/Transforms/IPO.h>

+#include "llvm/Support/FileSystem.h"
+#include "llvm/Support/Host.h"
+#include "llvm/Support/raw_ostream.h"
+#include "llvm/Support/TargetRegistry.h"
+#include "llvm/Support/TargetSelect.h"
+#include "llvm/Target/TargetMachine.h"
+#include "llvm/Target/TargetOptions.h"
+
 #include "common.h"
 #include "bcc_debug.h"
 #include "bcc_elf.h"
@@ -201,7 +209,7 @@ void BPFModule::dump_ir(Module &mod) {
   PM.run(mod);
 }

-int BPFModule::run_pass_manager(Module &mod) {
+int BPFModule::run_pass_manager(Module &mod, bool is_bpf_target) {
   if (verifyModule(mod, &errs())) {
     if (flags_ & DEBUG_LLVM_IR)
       dump_ir(mod);
@@ -220,6 +228,56 @@ int BPFModule::run_pass_manager(Module &mod) {
    * use below 'stable' workaround
    */
   LLVMAddAlwaysInlinerPass(reinterpret_cast<LLVMPassManagerRef>(&PM));
+
+  if (is_bpf_target) {
+    /*
+     * HACK: emit the object file and then abort the process
+     */
+
+    auto TargetTriple = "bpf-pc-linux";
+    mod.setTargetTriple(TargetTriple);
+
+    std::string Error;
+    auto Target = TargetRegistry::lookupTarget(TargetTriple, Error);
+    if (!Target) {
+      errs() << Error;
+      errs() << "Target error";
+      fprintf(stderr, "error: %s %s %d\n", __FILE__, __func__, __LINE__);
+      return -1;
+    }
+
+    auto CPU = "generic";
+    auto Features = "";
+    TargetOptions opt;
+    auto RM = Optional<Reloc::Model>();
+    auto TheTargetMachine =
+        Target->createTargetMachine(TargetTriple, CPU, Features, opt, RM);
+    mod.setDataLayout(TheTargetMachine->createDataLayout());
+
+    auto Filename = "output.o";
+    std::error_code EC;
+    raw_fd_ostream dest(Filename, EC, sys::fs::OF_None);
+    if (EC) {
+      errs() << "dest error";
+      fprintf(stderr, "error: %s %s %d\n", __FILE__, __func__, __LINE__);
+      return -1;
+    }
+
+    auto FileType = CGFT_ObjectFile;
+    if (TheTargetMachine->addPassesToEmitFile(PM, dest, nullptr, FileType)) {
+      errs() << "addPassesToEmitFile error";
+      fprintf(stderr, "error: %s %s %d\n", __FILE__, __func__, __LINE__);
+      return -1;
+    }
+
+    PMB.populateModulePassManager(PM);
+    if (flags_ & DEBUG_LLVM_IR)
+      PM.add(createPrintModulePass(outs()));
+    PM.run(mod);
+    fprintf(stderr, "%s %s %d: success \n", __FILE__, __func__, __LINE__);
+    return -1;
+  }
+
   PMB.populateModulePassManager(PM);
   if (flags_ & DEBUG_LLVM_IR)
     PM.add(createPrintModulePass(outs()));
@@ -472,7 +530,7 @@ int BPFModule::finalize() {
     engine_->setProcessAllSections(true);
 #endif

-  if (int rc = run_pass_manager(*mod))
+  if (int rc = run_pass_manager(*mod, true))
     return rc;
   engine_->finalizeObject();
diff --git a/src/cc/bpf_module.h b/src/cc/bpf_module.h
index 343ce28d..93279186 100644
--- a/src/cc/bpf_module.h
+++ b/src/cc/bpf_module.h
@@ -82,7 +82,7 @@ class BPFModule {
   int load_includes(const std::string &text);
   int load_cfile(const std::string &file, bool in_memory, const char *cflags[], int ncflags);
   int kbuild_flags(const char *uname_release, std::vector<std::string> *cflags);
-  int run_pass_manager(llvm::Module &mod);
+  int run_pass_manager(llvm::Module &mod, bool is_bpf_target);
   StatusTuple sscanf(std::string fn_name, const char *str, void *val);
   StatusTuple snprintf(std::string fn_name, char *str, size_t sz,
                        const void *val);
diff --git a/src/cc/bpf_module_rw_engine.cc b/src/cc/bpf_module_rw_engine.cc
index 418355d3..29d59500 100644
--- a/src/cc/bpf_module_rw_engine.cc
+++ b/src/cc/bpf_module_rw_engine.cc
@@ -351,7 +351,7 @@ string BPFModule::make_writer(Module *mod, Type *type) {
 unique_ptr<ExecutionEngine> BPFModule::finalize_rw(unique_ptr<Module> m) {
   Module *mod = &*m;

-  run_pass_manager(*mod);
+  run_pass_manager(*mod, false);

   string err;
   EngineBuilder builder(move(m));

Then run it and check the object file:

-bash-4.4$ sudo ./profile.py 1
Sampling at 49 Hertz of all threads by user + kernel stack for 1 secs.
/home/yhs/work/bcc/src/cc/bpf_module.cc run_pass_manager 277: success 
Traceback (most recent call last):
  File "./profile.py", line 265, in <module>
    b = BPF(text=bpf_text)
  File "/usr/lib/python2.7/site-packages/bcc/__init__.py", line 349, in __init__
    raise Exception("Failed to compile BPF module %s" % (src_file or "<text>"))
Exception: Failed to compile BPF module <text>
-bash-4.4$
-bash-4.4$ readelf -S output.o
There are 28 section headers, starting at offset 0x9610:

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .strtab           STRTAB           0000000000000000  00009330
       00000000000002df  0000000000000000           0     0     1
...
  [ 4] .bpf.fn.do_perf_e PROGBITS         0000000000000000  00000810
       0000000000000248  0000000000000000  AX       0     0     8
...
  [19] .BTF              PROGBITS         0000000000000000  00003f58
       0000000000000c6d  0000000000000000           0     0     1
  [20] .rel.BTF          REL              0000000000000000  00008720
       0000000000000060  0000000000000010          27    19     8
  [21] .BTF.ext          PROGBITS         0000000000000000  00004bc5
       0000000000000a98  0000000000000000           0     0     1
  [22] .rel.BTF.ext      REL              0000000000000000  00008780
       0000000000000a90  0000000000000010          27    21     8

With a special option, we could dump the object file and then load it into bcc on the production host.
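
As a rough illustration of the first thing such a loader would need to do, the sketch below reads the section table of a dumped object and picks out the sections using the `.bpf.fn.` prefix visible in the readelf output above. It is only a sketch under assumptions: the function names are hypothetical, it handles little-endian ELF64 only, and it does not touch relocations, maps, or BTF.

```python
import struct

BCC_FN_PREFIX = ".bpf.fn."   # section-name prefix seen in the readelf output
SHT_NOBITS = 8               # section type that occupies no file space
SHDR = struct.Struct("<IIQQQQIIQQ")  # Elf64_Shdr, 64 bytes


def read_sections(elf):
    """Return {section name: section bytes} for a little-endian ELF64 image."""
    if elf[:4] != b"\x7fELF" or elf[4] != 2 or elf[5] != 1:
        raise ValueError("expected a little-endian ELF64 file")
    (e_shoff,) = struct.unpack_from("<Q", elf, 0x28)
    e_shentsize, e_shnum, e_shstrndx = struct.unpack_from("<HHH", elf, 0x3A)

    headers = []
    for i in range(e_shnum):
        (name_off, sh_type, _flags, _addr, offset, size,
         _link, _info, _align, _entsize) = SHDR.unpack_from(
            elf, e_shoff + i * e_shentsize)
        headers.append((name_off, sh_type, offset, size))

    # Section names live in the string table section named by e_shstrndx.
    _, _, str_off, str_size = headers[e_shstrndx]
    strtab = elf[str_off:str_off + str_size]

    sections = {}
    for name_off, sh_type, offset, size in headers:
        name = strtab[name_off:strtab.index(b"\0", name_off)].decode()
        sections[name] = b"" if sh_type == SHT_NOBITS else elf[offset:offset + size]
    return sections


def bpf_functions(sections):
    """Map function name -> raw BPF bytecode for each .bpf.fn.* section."""
    return {name[len(BCC_FN_PREFIX):]: data
            for name, data in sections.items()
            if name.startswith(BCC_FN_PREFIX)}
```

Calling `bpf_functions(read_sections(open("output.o", "rb").read()))` on the file dumped above would be expected to return one entry per `.bpf.fn.*` section, keyed by the full function name rather than the truncated name readelf prints.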

In the long term, we should try to use the libbpf repo for manipulating object files, since it has much richer functionality.