efficios / barectf

Generator of ANSI C tracers which output CTF data streams
https://barectf.org
MIT License
65 stars 17 forks source link

Invalid trace (metadata) generated when disabling the `data-stream-type-id-field-type` feature #28

Closed jgalar closed 1 year ago

jgalar commented 1 year ago

Overview

I noticed that barectf (c4c4606) produces invalid traces - or at least, traces that babeltrace 1 & 2 can't decode - when the data-stream-type-id-field-type is disabled (set to false in the configuration file).

Reproducer

My reproducer is based on the barectf-tracepoint example.

  1. Convert the configuration file to the 3.x format
    $ barectf show-effective-cfg config.yaml > config.yaml.tmp && mv config.yaml.tmp config.yaml
  2. Disable the data-stream-type-id-field-type feature by setting data-stream-type-id-field-type to false (the resulting config file follows).
  3. Run the Makefile's default target
    $ make
  4. Generate a trace by running barectf-tracepoint-barectf-linux-fs
    $ ./barectf-tracepoint-barectf-linux-fs
  5. Attempt to read the resulting ctf-linux-fs trace using Babeltrace
    $ babeltrace ctf-linux-fs

config.yaml

%YAML 1.2
--- !<tag:barectf.org,2020/3/config>
options:
  code-generation:
    prefix:
      identifier: barectf_
      file-name: barectf
    header:
      identifier-prefix-definition: true
      default-data-stream-type-name-definition: true
trace:
  type:
    native-byte-order: little-endian
    uuid: auto
    clock-types:
      default:
        offset:
          seconds: 1434072888
        frequency: 1000000000
        $c-type: uint64_t
    $features:
      magic-field-type:
        class: unsigned-integer
        size: 32
        alignment: 32
      uuid-field-type:
        class: static-array
        length: 16
        element-field-type:
          class: unsigned-integer
          size: 8
      data-stream-type-id-field-type: false
    data-stream-types:
      default:
        $is-default: true
        $default-clock-type-name: default
        $features:
          packet:
            total-size-field-type:
              class: unsigned-integer
              size: 32
              alignment: 32
            content-size-field-type:
              class: unsigned-integer
              size: 32
              alignment: 32
            beginning-timestamp-field-type:
              class: unsigned-integer
              size: 64
              alignment: 64
            end-timestamp-field-type:
              class: unsigned-integer
              size: 64
              alignment: 64
            discarded-event-records-counter-snapshot-field-type:
              class: unsigned-integer
              size: 32
              alignment: 32
          event-record:
            type-id-field-type:
              class: unsigned-integer
              size: 16
              alignment: 16
            timestamp-field-type:
              class: unsigned-integer
              size: 64
              alignment: 64
        event-record-types:
          barectf_tp_simple_uint32:
            log-level: 2
            payload-field-type:
              class: structure
              members:
              - value:
                  field-type:
                    class: unsigned-integer
                    size: 32
                    alignment: 32
          barectf_tp_simple_int16:
            payload-field-type:
              class: structure
              members:
              - value:
                  field-type:
                    class: signed-integer
                    size: 16
                    alignment: 16
          barectf_tp_simple_float:
            payload-field-type:
              class: structure
              members:
              - value:
                  field-type:
                    class: real
                    size: 32
                    alignment: 32
          barectf_tp_simple_string:
            log-level: 4
            payload-field-type:
              class: structure
              members:
              - value:
                  field-type:
                    class: string
          barectf_tp_simple_enum:
            payload-field-type:
              class: structure
              members:
              - value:
                  field-type:
                    class: unsigned-enumeration
                    size: 8
                    alignment: 8
                    mappings:
                      NEW:
                      - 0
                      TERMINATED:
                      - 1
                      READY:
                      - 2
                      RUNNING:
                      - 3
                      WAITING:
                      - 4
          barectf_tp_a_few_fields:
            payload-field-type:
              class: structure
              members:
              - int32:
                  field-type:
                    class: signed-integer
                    size: 32
                    alignment: 32
              - uint16:
                  field-type:
                    class: unsigned-integer
                    size: 16
                    alignment: 16
              - dbl:
                  field-type:
                    class: real
                    size: 64
                    alignment: 64
              - str:
                  field-type:
                    class: string
              - state:
                  field-type:
                    class: unsigned-enumeration
                    size: 8
                    alignment: 8
                    mappings:
                      NEW:
                      - 0
                      TERMINATED:
                      - 1
                      READY:
                      - 2
                      RUNNING:
                      - 3
                      WAITING:
                      - 4
          barectf_tp_bit_packed_integers:
            log-level: 513
            payload-field-type:
              class: structure
              minimum-alignment: 8
              members:
              - uint1:
                  field-type:
                    class: unsigned-integer
                    size: 1
                    alignment: 1
              - int1:
                  field-type:
                    class: signed-integer
                    size: 1
                    alignment: 1
              - uint2:
                  field-type:
                    class: unsigned-integer
                    size: 2
                    alignment: 1
              - int3:
                  field-type:
                    class: signed-integer
                    size: 3
                    alignment: 1
              - uint4:
                  field-type:
                    class: unsigned-integer
                    size: 4
                    alignment: 1
              - int5:
                  field-type:
                    class: signed-integer
                    size: 5
                    alignment: 1
              - uint6:
                  field-type:
                    class: unsigned-integer
                    size: 6
                    alignment: 1
              - int7:
                  field-type:
                    class: signed-integer
                    size: 7
                    alignment: 1
              - uint8:
                  field-type:
                    class: unsigned-integer
                    size: 8
                    alignment: 1
...

Error messages produced by Babeltrace 1 & 2

Reading the resulting trace with Babeltrace 2 produces the following error

❯ babeltrace ctf-linux-fs
03-07 11:27:07.991 57675 57675 E PLUGIN/CTF/META/IR-VISITOR visit_stream_decl@visitor-generate-ir.cpp:3649 [auto-disc-source-ctf-fs] At line 137 in metadata stream: Stream class has a `id` attribute, but trace's packet header field class has no `stream_id` field.
03-07 11:27:07.991 57675 57675 E PLUGIN/CTF/META/IR-VISITOR ctf_visitor_generate_ir_visit_node@visitor-generate-ir.cpp:4671 [auto-disc-source-ctf-fs] At line 137 in metadata stream: Cannot visit stream class: ret=-22
03-07 11:27:07.991 57675 57675 E PLUGIN/CTF/META/DECODER ctf_metadata_decoder_append_content@decoder.cpp:338 [auto-disc-source-ctf-fs] Failed to visit AST node to create CTF IR objects: mdec-addr=0x55a7f5c55310, ret=-22
03-07 11:27:07.991 57675 57675 E PLUGIN/SRC.CTF.FS/META ctf_fs_metadata_set_trace_class@metadata.cpp:105 [auto-disc-source-ctf-fs] Cannot update metadata decoder's content.
03-07 11:27:07.992 57675 57675 E PLUGIN/SRC.CTF.FS ctf_fs_component_create_ctf_fs_trace_one_path@fs.cpp:1132 [auto-disc-source-ctf-fs] Cannot create trace for `/home/jgalar/EfficiOS/src/barectf/examples/barectf-tracepoint/ctf-linux-fs`.
03-07 11:27:07.992 57675 57675 W LIB/GRAPH add_component_with_init_method_data@graph.c:1055 Component initialization method failed: status=ERROR, comp-addr=0x55a7f5c547c0, comp-name="auto-disc-source-ctf-fs", comp-log-level=WARNING, comp-class-type=SOURCE, comp-class-name="fs", comp-class-partial-descr="Read CTF traces from the file sy", comp-class-is-frozen=1, comp-class-so-handle-addr=0x55a7f5c5c9e0, comp-class-so-handle-path="/usr/lib/babeltrace2/plugins/babeltrace-plugin-ctf.la", comp-input-port-count=0, comp-output-port-count=0
03-07 11:27:07.992 57675 57675 E CLI cmd_run_ctx_create_components_from_config_components@babeltrace2.c:2284 Cannot create component: plugin-name="ctf", comp-cls-name="fs", comp-cls-type=SOURCE, comp-name="auto-disc-source-ctf-fs"
03-07 11:27:07.992 57675 57675 E CLI cmd_run@babeltrace2.c:2470 Cannot create components.

ERROR:    [Babeltrace CLI] (babeltrace2.c:2470)
  Cannot create components.
CAUSED BY [Babeltrace CLI] (babeltrace2.c:2284)
  Cannot create component: plugin-name="ctf", comp-cls-name="fs", comp-cls-type=SOURCE, comp-name="auto-disc-source-ctf-fs"
CAUSED BY [libbabeltrace2] (graph.c:1055)
  Component initialization method failed: status=ERROR, comp-addr=0x55a7f5c547c0, comp-name="auto-disc-source-ctf-fs", comp-log-level=WARNING,
  comp-class-type=SOURCE, comp-class-name="fs", comp-class-partial-descr="Read CTF traces from the file sy", comp-class-is-frozen=1,
  comp-class-so-handle-addr=0x55a7f5c5c9e0, comp-class-so-handle-path="/usr/lib/babeltrace2/plugins/babeltrace-plugin-ctf.la", comp-input-port-count=0,
  comp-output-port-count=0
CAUSED BY [auto-disc-source-ctf-fs: 'source.ctf.fs'] (fs.cpp:1132)
  Cannot create trace for `/home/jgalar/EfficiOS/src/barectf/examples/barectf-tracepoint/ctf-linux-fs`.
CAUSED BY [auto-disc-source-ctf-fs: 'source.ctf.fs'] (decoder.cpp:338)
  Failed to visit AST node to create CTF IR objects: mdec-addr=0x55a7f5c55310, ret=-22
CAUSED BY [auto-disc-source-ctf-fs: 'source.ctf.fs'] (visitor-generate-ir.cpp:4671)
  At line 137 in metadata stream: Cannot visit stream class: ret=-22
CAUSED BY [auto-disc-source-ctf-fs: 'source.ctf.fs'] (visitor-generate-ir.cpp:3649)
  At line 137 in metadata stream: Stream class has a `id` attribute, but trace's packet header field class has no `stream_id` field.

Reading the resulting trace with Babeltrace 1.5 produces the following error

❯ babeltrace /home/jgalar/EfficiOS/src/barectf/examples/barectf-tracepoint/ctf-linux-fs
[error] ctf_stream_visit: missing stream_id field in packet header declaration, but stream_id attribute is declared for stream.
[error] ctf_visitor_construct_metadata: stream declaration error
[error] Error in CTF metadata constructor -1
[warning] Unable to open trace metadata for path "/home/jgalar/EfficiOS/src/barectf/examples/barectf-tracepoint/ctf-linux-fs".
[warning] [Context] Cannot open_trace of format ctf at path /home/jgalar/EfficiOS/src/barectf/examples/barectf-tracepoint/ctf-linux-fs.
[warning] [Context] cannot open trace "/home/jgalar/EfficiOS/src/barectf/examples/barectf-tracepoint/ctf-linux-fs" from /home/jgalar/EfficiOS/src/barectf/examples/barectf-tracepoint/ctf-linux-fs for reading.
[error] Cannot open any trace for reading.

[error] opening trace "/home/jgalar/EfficiOS/src/barectf/examples/barectf-tracepoint/ctf-linux-fs" for reading.

[error] none of the specified trace paths could be opened.

Cause

The stream and event classes declared by barectf in the metadata file reference the stream_id both implicitly (id) or explicitly (stream_id) which appears unexpected by both babeltrace versions.

stream {
        id = 0;
        packet.context := struct {
                integer {
                        signed = false;
                        size = 32;
                        align = 32;
                        byte_order = native;
                        base = 10;
                } packet_size;
...
event {
        stream_id = 0;
        id = 1;
        name = "barectf_tp_bit_packed_integers";
        loglevel = 513;
...

I have to check the CTF v1.8 spec, but I would be surprised if it addressed this case explicitly. Nonetheless, I think the easiest fix here is to omit those IDs when the feature is disabled and perhaps warn if the feature is disabled and the configuration contains more than one stream.

ebugden commented 1 year ago

perhaps warn if the feature is disabled and the configuration contains more than one stream

Aren't stream IDs needed to function correctly if there is more than one stream? (Underlying question: Would it make more sense to fail than to warn in this case?)


Currently, the configuration parsing will fail if the it has more than one stream, but the data stream ids are undefined (i.e. when data-stream-type-id-field-type is undefined).

config_parse_v3.py:641 Pseudo code:

if data_stream_id_field_type is None and data_stream_count > 1:
    raise _ConfigurationParseError()

Actual code:

if dst_id_ft is None and dst_count > 1:
    raise _ConfigurationParseError(f'`{dst_id_ft_prop_name}` property',
                                   'Data stream type ID field type feature is required because trace type has more than one data stream type')

More context For example, the following barectf 2 config:

version: '2.2'
metadata:
  type-aliases:
    uint16:
      class: int
      size: 16
  trace:
    byte-order: little-endian
  streams:
    stream1:
      $default: true
      packet-context-type:
        class: struct
        fields:
          packet_size: uint16
          content_size: uint16
      events:
        my_event:
          payload-type:
            class: struct
            fields:
              my_field:
                class: int
                size: 8
    stream2:
      $default: true
      packet-context-type:
        class: struct
        fields:
          packet_size: uint16
          content_size: uint16
      events:
        my_event:
          payload-type:
            class: struct
            fields:
              my_field:
                class: int
                size: 8

gives this error:

Configuration: Cannot create configuration from YAML file
Trace:
Trace type:
`$features` property:
`data-stream-type-id-field-type` property: Data stream type ID field type feature is required because trace type has more than one data stream type
jgalar commented 1 year ago

Aren't stream IDs needed to function correctly if there is more than one stream? (Underlying question: Would it make more sense to fail than to warn in this case?)

As far as I know, they are. Indeed, it would make more sense to fail in that case.

ebugden commented 1 year ago

Proposed fix.

The fix commit message has more details, but it looks like barectf is currently in a grey area in the CTF 1.8 spec and that this issue is a shared responsibility (between barectf and CTF readers). I haven't looked at the babeltrace code, so exactly why/where things go wrong in babeltrace is speculation, but the fix does allow babeltrace to read the trace.

I'm going to report an issue in babeltrace (even though it's not necessarily worth fixing) mainly to document it.


ebugden: Aren't stream IDs needed to function correctly if there is more than one stream? (Underlying question: Would it make more sense to fail than to warn in this case?)

jgalar: As far as I know, they are. Indeed, it would make more sense to fail in that case.

My updated understanding of the code is that configuration will fail if:

So it looks like this already covers what should happen.

The code below is for more context, if desired.

config_parse_v3.py:631

            if features_node is not None:
                magic_ft = self._feature_ft(features_node, 'magic-field-type', magic_ft)
                uuid_ft = self._feature_ft(features_node, 'uuid-field-type', uuid_ft)
                # The following line sets 'dst_id_ft' to None if it is configured false
                dst_id_ft = self._feature_ft(features_node, dst_id_ft_prop_name, dst_id_ft)

            dsts_prop_name = 'data-stream-types'
            dst_count = len(self._trace_type_node[dsts_prop_name])

            try:
                # Raise a configuration error if data stream IDs aren't specified and there is more than one data stream 
                if dst_id_ft is None and dst_count > 1:
                    raise _ConfigurationParseError(f'`{dst_id_ft_prop_name}` property',
                                                   'Data stream type ID field type feature is required because trace type has more than one data stream type')
ebugden commented 1 year ago

🌈 Fixed!

@eepp This issue can be closed :sunflower: