dbus-fuzzer / dfuzzer

D-Bus fuzzer
GNU General Public License v3.0

Machine-readable logs #75

Open evverx opened 2 years ago

evverx commented 2 years ago

As far as I understand, logs currently have to look like https://github.com/matusmarhefka/dfuzzer/pull/4 for reprogen.py to work, but it would probably make sense to revisit the format to make logs easier to parse in general. Those logs could help to look for example for timeouts that are ignored by dfuzzer by default.

mrc0mmand commented 2 years ago

One of the major flaws of the current (CSV) format is that the separator (;) can appear in the randomly generated strings, making machine-parsing of the log file harder or sometimes almost impossible.
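To make the ambiguity concrete, here is a minimal sketch. The four-column layout and the names `format_line` and `parse_line` are made up for illustration and are not dfuzzer's actual log format; the point is only that a `;` inside a fuzzed argument changes the number of fields a naive parser sees.

```python
# Hypothetical ';'-separated log line. The column layout is an assumption
# made for illustration, not dfuzzer's actual log format.
def format_line(interface, method, arg, result):
    return ";".join([interface, method, arg, result])


def parse_line(line):
    # Naive parsing: split on every separator, with no quoting or escaping.
    return line.split(";")


clean = format_line("org.example.Iface", "Frob", "hello", "Success")
fuzzed = format_line("org.example.Iface", "Frob", "he;llo", "Success")

print(len(parse_line(clean)))   # number of fields for a harmless argument
print(len(parse_line(fuzzed)))  # number of fields once the argument contains ';'
```

With the second line the parser can no longer tell which `;` ends the argument, so the result column ends up in the wrong position.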

evverx commented 2 years ago

> Those logs could help to look for example for timeouts

Looks like timeouts have never been logged by dfuzzer :-(

mrc0mmand commented 2 years ago

As for the random strings, I guess one possible fix would be to process the strings via https://docs.gtk.org/glib/func.strescape.html before printing them out. This might also help with #80, since strings could be wrapped in " and identified by that. As the documentation suggests, this operation could be easily reversed by https://docs.gtk.org/glib/func.strcompress.html, and the escape sequences should be compatible with bash as well:

Escapes the special characters '\b', '\f', '\n', '\r', '\t', '\v', '\' and '"' in the string source by inserting a '\' before them. Additionally all characters in the range 0x01-0x1F (everything below SPACE) and in the range 0x7F-0xFF (all non-ASCII chars) are replaced with a '\' followed by their octal representation. Characters supplied in exceptions are not escaped.
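For anyone who wants to post-process such logs outside of GLib, the rules quoted above can be approximated in Python. This is a sketch of the documented behaviour only, not GLib's implementation; the helper names `strescape` and `strcompress` are borrowed from the GLib functions for readability, and the `exceptions` parameter is left out.

```python
# Python approximation of the escaping rules quoted above. It mirrors the
# documented behaviour of g_strescape()/g_strcompress(); it is not GLib code.
_NAMED = {
    0x08: b"\\b", 0x0C: b"\\f", 0x0A: b"\\n", 0x0D: b"\\r",
    0x09: b"\\t", 0x0B: b"\\v", 0x5C: b"\\\\", 0x22: b'\\"',
}
_UNNAMED = {
    ord("b"): 0x08, ord("f"): 0x0C, ord("n"): 0x0A, ord("r"): 0x0D,
    ord("t"): 0x09, ord("v"): 0x0B, ord("\\"): 0x5C, ord('"'): 0x22,
}


def strescape(data: bytes) -> bytes:
    out = bytearray()
    for byte in data:
        if byte in _NAMED:
            out += _NAMED[byte]
        elif byte < 0x20 or byte >= 0x7F:
            # Everything below SPACE and everything non-ASCII becomes
            # a backslash followed by three octal digits.
            out += ("\\%03o" % byte).encode("ascii")
        else:
            out.append(byte)
    return bytes(out)


def strcompress(data: bytes) -> bytes:
    out = bytearray()
    i = 0
    while i < len(data):
        byte = data[i]
        if byte != 0x5C or i + 1 == len(data):
            # Ordinary byte, or a lone trailing backslash: copy as-is.
            out.append(byte)
            i += 1
            continue
        nxt = data[i + 1]
        if nxt in _UNNAMED:
            out.append(_UNNAMED[nxt])
            i += 2
        elif 0x30 <= nxt <= 0x37:
            # Up to three octal digits after the backslash.
            j, value = i + 1, 0
            while j < len(data) and j < i + 4 and 0x30 <= data[j] <= 0x37:
                value = value * 8 + (data[j] - 0x30)
                j += 1
            out.append(value & 0xFF)
            i = j
        else:
            out.append(nxt)
            i += 2
    return bytes(out)


raw = b'a;b\n"q"\\\x01\xff'
escaped = strescape(raw)
print(escaped)
print(strcompress(escaped) == raw)
```

Note that `;` is not among the escaped characters, so escaping alone does not fix the separator problem; it is the surrounding `"` (which can no longer appear unescaped inside the string) that would make the field boundaries unambiguous.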

evverx commented 2 years ago

I'd pick json (or any other format where escaping is no longer an issue) because for example busctl dumps stuff like

{
        "type" : "method_call",
        "endian" : "l",
        "flags" : 0,
        "version" : 1,
        "cookie" : 2,
        "timestamp-realtime" : 1652039190518701,
        "sender" : ":1.147",
        "destination" : "org.freedesktop.resolve1",
        "path" : "/org/freedesktop/resolve1",
        "interface" : "org.freedesktop.resolve1.Manager",
        "member" : "ResolveHostname",
        "payload" : {
                "type" : "isit",
                "data" : [
                        0,
                        "google.com",
                        0,
                        0
                ]
        }
}

and it can be put into "advanced" dictionaries: https://github.com/matusmarhefka/dfuzzer/issues/81. The idea is to monitor the system bus, pick "valid" messages and stuff them into those dictionaries (semi-automatically hopefully)
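A rough sketch of that idea, assuming the input is one `busctl` JSON record like the one above. The shape of the resulting dictionary entry and the name `to_dictionary_entry` are hypothetical; nothing in this thread defines the "advanced" dictionary format yet.

```python
import json

# A trimmed copy of the busctl record shown above.
captured = """
{
    "type" : "method_call",
    "destination" : "org.freedesktop.resolve1",
    "path" : "/org/freedesktop/resolve1",
    "interface" : "org.freedesktop.resolve1.Manager",
    "member" : "ResolveHostname",
    "payload" : {
        "type" : "isit",
        "data" : [0, "google.com", 0, 0]
    }
}
"""


def to_dictionary_entry(message):
    """Reduce a captured message to the call target plus its arguments.

    The output layout is an assumption made for illustration only.
    """
    if message.get("type") != "method_call":
        return None  # only method calls are useful as fuzzing seeds
    payload = message.get("payload", {})
    return {
        "destination": message["destination"],
        "path": message["path"],
        "interface": message["interface"],
        "member": message["member"],
        "signature": payload.get("type", ""),
        "args": payload.get("data", []),
    }


entry = to_dictionary_entry(json.loads(captured))
print(json.dumps(entry, indent=2))
```

Because the payload already carries both the signature (`isit`) and the argument values, no extra escaping or string parsing is needed to turn a captured message into a seed.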

mrc0mmand commented 2 years ago

That definitely sounds better, and should be relatively easy to do via https://gnome.pages.gitlab.gnome.org/json-glib/ and maybe even with https://gnome.pages.gitlab.gnome.org/json-glib/json-gvariant.html.

mrc0mmand commented 2 years ago

Giving json_gvariant_serialize_data() a quick spin, it seems to work like a charm:

   -- Signature: (isaaai(y(b(n(q(iua{ov})v)o))x(dh))a{t(bov)})
   -- Value: (-2147483648, 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA', [[@ai []]], (byte 0x00, (false, (int16 -32768, (uint16 0, (-2147483648, uint32 0, {objectpath '/': <'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'>, '/': <'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'>, '/': <'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'>}), <'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'>), objectpath '/')), int64 -9223372036854775808, (1.7976931348623157e+308, handle 0)), {uint64 0: (false, objectpath '/', <'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'>), 0: (false, '/', <'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'>), 0: (false, '/', <'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'>), 0: (false, '/', <'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'>)})
Serialized GVariant: [-2147483648,"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA",[[[]]],[0,[false,[-32768,[0,[-2147483648,0,{"/":"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"}],"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"],"/"]],-9223372036854775808,[1.7976931348623157e+308,0]],{"0":[false,"/","AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"]}]
$ echo '[-2147483648,"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA",[[[]]],[0,[false,[-32768,[0,[-2147483648,0,{"/":"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"}],"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"],"/"]],-9223372036854775808,[1.7976931348623157e+308,0]],{"0":[false,"/","AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"]}]' | jq .
[
  -2147483648,
  "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA",
  [
    [
      []
    ]
  ],
  [
    0,
    [
      false,
      [
        -32768,
        [
          0,
          [
            -2147483648,
            0,
            {
              "/": "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"
            }
          ],
          "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"
        ],
        "/"
      ]
    ],
    -9223372036854775808,
    [
      1.7976931348623157E+308,
      0
    ]
  ],
  {
    "0": [
      false,
      "/",
      "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"
    ]
  }
]

That should, hopefully, be compatible with the format produced by busctl as well.
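One thing visible in the output above is that the JSON form drops the type information: `byte 0x00`, `uint16 0`, `uint32 0` and `handle 0` all become a plain `0`, and the `uint64` dictionary key becomes the string `"0"`. A log record would therefore probably need to carry the signature next to the serialized arguments. The sketch below shows one possible layout; the field names are hypothetical, and the fuzz string is shortened to `AAAA` for readability.

```python
import json

# Signature taken from the output above; the fuzz string is shortened.
signature = "(isaaai(y(b(n(q(iua{ov})v)o))x(dh))a{t(bov)})"
serialized = (
    '[-2147483648,"AAAA",[[[]]],'
    '[0,[false,[-32768,[0,[-2147483648,0,{"/":"AAAA"}],"AAAA"],"/"]],'
    '-9223372036854775808,[1.7976931348623157e+308,0]],'
    '{"0":[false,"/","AAAA"]}]'
)

args = json.loads(serialized)

# Hypothetical record layout: keep the signature alongside the values so a
# consumer can restore the original D-Bus types later.
record = {"signature": signature, "args": args}
line = json.dumps(record)

print(line)
print(json.loads(line) == record)
```

Python's `json` module reads the 64-bit minimum and the largest double without loss, so at least for this sample the record survives a dump/load round trip unchanged.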

mrc0mmand commented 2 years ago

Also, would it make sense to log only unsuccessful cases? Something like what libFuzzer/AFL do, i.e. log only crashes/timeouts, one such case per file, so they can then be used as 'reproducers' later. Or do we want to log everything into one file, marked by the type of failure (timeout, crash, ...)?
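The two options are not mutually exclusive. As a sketch, assuming each test case is a JSON object with a `result` field (the field names, result labels and file naming below are all hypothetical), the same records could feed both a single line-per-case log and one reproducer file per failure:

```python
import json
import tempfile
from pathlib import Path

# Hypothetical records; field names and result labels are assumptions.
records = [
    {"method": "Frob", "signature": "s", "args": ["ok"], "result": "success"},
    {"method": "Frob", "signature": "s", "args": ["A;B"], "result": "timeout"},
    {"method": "Quux", "signature": "i", "args": [-1], "result": "crash"},
]


def write_single_log(path, records):
    """Option 1: everything in one file, one JSON object per line."""
    with open(path, "w", encoding="utf-8") as log:
        for record in records:
            log.write(json.dumps(record) + "\n")


def write_reproducers(directory, records):
    """Option 2: one file per failing case, named after the failure type."""
    written = []
    failures = (r for r in records if r["result"] != "success")
    for index, record in enumerate(failures):
        path = Path(directory) / f"{record['result']}-{index}.json"
        path.write_text(json.dumps(record), encoding="utf-8")
        written.append(path.name)
    return written


with tempfile.TemporaryDirectory() as tmp:
    write_single_log(Path(tmp) / "dfuzzer.jsonl", records)
    names = write_reproducers(tmp, records)
    log_lines = (Path(tmp) / "dfuzzer.jsonl").read_text().splitlines()

print(names)
print(len(log_lines))
```

The single log keeps the successful calls around for statistics, while the per-failure files stay small enough to hand to a reproducer script directly.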