zeek / zeekscript

A toolchain to parse, analyze, and format Zeek scripts

Formatting of multi-line tables and records #67

Open anthonykasza opened 1 year ago

anthonykasza commented 1 year ago

Currently, Zeek has a script in base with the following:

    const exception_codes = {
        [0x01] = "ILLEGAL_FUNCTION",
        [0x02] = "ILLEGAL_DATA_ADDRESS",
        [0x03] = "ILLEGAL_DATA_VALUE",
        [0x04] = "SLAVE_DEVICE_FAILURE",
        [0x05] = "ACKNOWLEDGE",
        [0x06] = "SLAVE_DEVICE_BUSY",
        [0x08] = "MEMORY_PARITY_ERROR",
        [0x0A] = "GATEWAY_PATH_UNAVAILABLE",
        [0x0B] = "GATEWAY_TARGET_DEVICE_FAILED_TO_RESPOND",
    } &default=function(i: count):string { return fmt("unknown-%d", i); } &redef;

The formatter tries to re-format it to:

    const exception_codes = { [ 0x01 ] = "ILLEGAL_FUNCTION", [ 0x02 ] =
        "ILLEGAL_DATA_ADDRESS", [ 0x03 ] = "ILLEGAL_DATA_VALUE", [ 0x04 ] =
        "SLAVE_DEVICE_FAILURE", [ 0x05 ] = "ACKNOWLEDGE", [ 0x06 ] =
        "SLAVE_DEVICE_BUSY", [ 0x08 ] = "MEMORY_PARITY_ERROR", [ 0x0A ] =
        "GATEWAY_PATH_UNAVAILABLE", [ 0x0B ] =
        "GATEWAY_TARGET_DEVICE_FAILED_TO_RESPOND",  } &default=function(i: count): string
        {
        return fmt("unknown-%d", i);
        } &redef;

Note the function oneliner was formatted to a multi-line function definition.

Additionally, the following is an example of a record spanning multiple lines and how it is formatted by zeek-format.

    local dummy: connection = [
      $id=[
        $orig_h=1.1.1.1,
        $orig_p=1/tcp,
        $resp_h=2.2.2.2,
        $resp_p=2/tcp
      ],
      $orig=[$size=0, $state=0, $flow_label=0],
      $resp=[$size=0, $state=0, $flow_label=0],
      $start_time=network_time(),
      $duration=0msec,
      $service=set("SSL"),
      $history="",
      $uid="UUIIDD"
    ];

zeek-format turns this into:

    local dummy: connection = [ $id=[ $orig_h=1.1.1.1, $orig_p=1/tcp,
                $resp_h=2.2.2.2, $resp_p=2/tcp ], $orig=[ $size=0, $state=0,
                $flow_label=0 ], $resp=[ $size=0, $state=0, $flow_label=0 ],
                $start_time=network_time(), $duration=0msec, $service=set("SSL"),
                $history="", $uid="UUIIDD" ];

In both cases I find the first version much more readable. This issue is similar to #10.

bbannier commented 1 year ago

I would guess much of this is due to the formatter trying to be smarter than needed (likely: trying to keep long lists on a single line, and then breaking lines further up in the formatter stack when controlling the line width). I'd guess more aggressively inserting line breaks could resolve this, e.g., if an enum decl, record initializer, or function decl or call has more than a single element, always break the line between elements. This might produce slightly more lines than the current impl but would look much more uniform.
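As a rough illustration of the "always break between elements" idea, here is a minimal Python sketch. It is not taken from zeekscript's code base: `format_elements` and its parameters are hypothetical names invented for this example, and it works on plain strings rather than on the formatter's real data structures. Whether a trailing comma is wanted after the last element is left as an option, since that depends on the construct.

```python
def format_elements(opener, elements, closer, indent="    ", trailing_comma=False):
    """Render a bracketed list of elements.

    Hypothetical helper for illustration only; not part of zeekscript.
    With zero or one element the result stays on one line. With more
    than one element, every element goes on its own line, regardless
    of whether the whole list would fit within the line width.
    """
    if len(elements) <= 1:
        return opener + "".join(elements) + closer

    body = ",\n".join(indent + element for element in elements)
    if trailing_comma:
        body += ","
    return f"{opener}\n{body}\n{closer}"


codes = ['[0x01] = "ILLEGAL_FUNCTION"', '[0x02] = "ILLEGAL_DATA_ADDRESS"']
print("const exception_codes = "
      + format_elements("{", codes, "}", trailing_comma=True)
      + " &redef;")
```

Because the rule never consults the line width, the output shape depends only on the number of elements. That makes it uniform, at the cost of a few more lines for short lists.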

If you are interested in working on this, @anthonykasza, I can point you to the right places if needed.

> Note the function oneliner was formatted to a multi-line function definition.

This looks like another facet of #7.