iovisor / bcc

BCC - Tools for BPF-based Linux IO analysis, networking, monitoring, and more
Apache License 2.0
20.49k stars 3.87k forks source link

Reading from unaligned structs (class) is causing invalid values. #2017

Open tkc17 opened 6 years ago

tkc17 commented 6 years ago

When the below structure is used in the BPF Code, packet_num works fine, but if the same is retrieved using python it doesn't work. struct reorder_info { u16 seq_num;
u32 packet_num; }__packed;

In the above structs, even though its packed, python inserts "holes" to try and align the struct because BCC doesn't understand the "__packed" macro as its from Linux Kernel. While accessing "packet_num" the python code starts at the hole offset causing invalid values.

BPF should understand packed from and refrain from aligning the struct.

yonghong-song commented 6 years ago

Do you have a concrete reproducible test case? You mean in python with a ctype structure definition like above, right? Python C type can specify packed (attribute _pack_ = 1), maybe you want to try?

tkc17 commented 6 years ago

Yes, I do have a reproducible case. The above struct is defined in the BPF C code with __packed attribute and stored as a BPF_TABLE("hash", tp). In python i am using bpf_handle["tp"][0] (Type is 'class bcc.reorder_info') to fetch the contents and thats where the problem comes, I don't have control over BCC which is converting the BPF_TABLE as a class.

tkc17 commented 6 years ago

packed_structs_bpf.py.zip This is an example to show the issue. To get it working just swap the a, b definitions and it works.

yonghong-song commented 6 years ago

Thanks for the test case. I can reproduce the issue. The following change can fix the issue.

diff --git a/src/cc/json_map_decl_visitor.cc b/src/cc/json_map_decl_visitor.cc
index d54a71c..c7fe9b8 100644
--- a/src/cc/json_map_decl_visitor.cc
+++ b/src/cc/json_map_decl_visitor.cc
@@ -159,8 +159,12 @@ bool BMapDeclVisitor::VisitRecordDecl(RecordDecl *D) {
   result_ += "]";
   if (D->isUnion())
     result_ += ", \"union\"";
-  else if (D->isStruct())
-    result_ += ", \"struct\"";
+  else if (D->isStruct()) {
+    if (SkipPadding)
+      result_ += ", \"struct\"";
+    else
+      result_ += ", \"struct_packed\"";
+  }
   result_ += "]";
   return true;
 }
diff --git a/src/python/bcc/__init__.py b/src/python/bcc/__init__.py
index 38fa6d8..1dfd830 100644
--- a/src/python/bcc/__init__.py
+++ b/src/python/bcc/__init__.py
@@ -427,7 +427,8 @@ class BPF(object):
                 elif isinstance(t[2], int):
                     fields.append((t[0], BPF._decode_table_type(t[1]), t[2]))
                 elif isinstance(t[2], basestring) and (
-                        t[2] == u"union" or t[2] == u"struct"):
+                        t[2] == u"union" or t[2] == u"struct" or
+                        t[2] == u"struct_packed"):
                     name = t[0]
                     if name == "":
                         name = "__anon%d" % len(anon)
@@ -438,13 +439,21 @@ class BPF(object):
             else:
                 raise Exception("Failed to decode type %s" % str(t))
         base = ct.Structure
+        is_packed = False
         if len(desc) > 2:
             if desc[2] == u"union":
                 base = ct.Union
             elif desc[2] == u"struct":
                 base = ct.Structure
-        cls = type(str(desc[0]), (base,), dict(_anonymous_=anon,
-            _fields_=fields))
+            elif desc[2] == u"struct_packed":
+                base = ct.Structure
+                is_packed = True
+        if is_packed:
+            cls = type(str(desc[0]), (base,), dict(_anonymous_=anon, _pack_=1,
+                _fields_=fields))
+        else:
+            cls = type(str(desc[0]), (base,), dict(_anonymous_=anon,
+                _fields_=fields))
         return cls

     def get_table(self, name, keytype=None, leaftype=None, reducer=None):

I will propose a formal patch later with your test case. Thanks!

tkc17 commented 6 years ago

Thanks. A couple of comments

  1. Should we also update the definition to cover packed structs

  2. Shouldn't the check for padding be reverse: if (SkipPadding) struct_packed else struct

yonghong-song commented 6 years ago

The pack attribute is lost in clang so we do not know whether a structure has packed attribute or not, although we may guess based on offset/size etc.

What we do here is try to explicitly pad the structure. If we did pack the structure (just fill the holes), we can safely add _pack_=1 for python. Otherwise, we did not do padding, we cannot add _pack_=1 as it may change semantics. The SkipPadding tells whether padding happens or not in C++.

lesovsky commented 5 years ago

I had the similar issue in conjunction with Go https://stackoverflow.com/questions/53324158/golang-ebpf-and-functions-duration