netobserv / flowlogs-pipeline

Transform flow logs into metrics
Apache License 2.0

Documentation issue #205

Closed innogrey closed 3 months ago

innogrey commented 2 years ago

Apologies if I missed something obvious, but are the individual modules documented in detail somewhere? For example, README.md doesn't contain any info on ingest methods other than json, and there's no info on protobuf, etc. Digging through the source code isn't a particularly effective course of action here.

jotak commented 2 years ago

Hey @innogrey, thank you for the feedback. There's more documentation here: https://github.com/netobserv/flowlogs-pipeline/blob/main/docs/api.md#ingest-collector-api , although it may need to be refreshed a little. The "ingest collector API" actually refers to listening for IPFIX or NetFlow v5 (UDP). The gRPC ingester uses the protobuf definition of our eBPF agent, which you can find here: https://github.com/netobserv/netobserv-ebpf-agent/blob/main/proto/flow.proto

@mariomac @eranra please correct me if I'm wrong.

@innogrey does this meet your needs?

innogrey commented 2 years ago

OK, so to be sure: using the ingest: collector method, I don't need an external instance of goflow2? That makes sense, but then what configuration.yaml format should I use? The following example gives me "FATAL failed to initialize pipeline ingest hostname not specified":

Again, I'm missing documentation for the individual pipeline segments ;)

jotak commented 2 years ago

Hey @innogrey, you're right that you don't need to install goflow2, as it is already used under the hood. I can't see how your YAML is indented above, but it's probably an indentation issue. Here's a small example that works for me:

test.yaml

log-level: info
parameters:
- ingest:
    collector:
      hostName: 0.0.0.0
      port: 2055
    type: collector
  name: ingest_collector
- decode:
    type: json
  name: decode_json
- name: stdout
  write:
    type: stdout
pipeline:
- name: ingest_collector
- follows: ingest_collector
  name: decode_json
- follows: decode_json
  name: stdout
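
Since the reported error was suspected to come from indentation, a quick structural check of the parsed configuration can help narrow things down. The following is only a minimal sketch: `check_config` is a hypothetical helper that is not part of flowlogs-pipeline, the rules it applies are inferred solely from the example above, and the config is written as a Python dict (what a YAML parser would be expected to return for test.yaml) to keep the sketch dependency-free.

```python
# Hypothetical sanity check, NOT part of flowlogs-pipeline.
# CONFIG mirrors test.yaml above as the dict a YAML parser would produce.
CONFIG = {
    "log-level": "info",
    "parameters": [
        {"name": "ingest_collector",
         "ingest": {"type": "collector",
                    "collector": {"hostName": "0.0.0.0", "port": 2055}}},
        {"name": "decode_json", "decode": {"type": "json"}},
        {"name": "stdout", "write": {"type": "stdout"}},
    ],
    "pipeline": [
        {"name": "ingest_collector"},
        {"name": "decode_json", "follows": "ingest_collector"},
        {"name": "stdout", "follows": "decode_json"},
    ],
}


def check_config(config):
    """Return a list of human-readable problems found in a parsed config."""
    problems = []
    params = {p.get("name"): p for p in config.get("parameters") or []}
    stages = config.get("pipeline") or []
    stage_names = {s.get("name") for s in stages}

    for stage in stages:
        name = stage.get("name")
        # Every pipeline stage should have a matching entry under parameters.
        if name not in params:
            problems.append(f"stage {name!r} has no entry in parameters")
        # Every 'follows' should point at another stage in the pipeline.
        follows = stage.get("follows")
        if follows is not None and follows not in stage_names:
            problems.append(f"stage {name!r} follows unknown stage {follows!r}")

    for name, param in params.items():
        ingest = param.get("ingest")
        if isinstance(ingest, dict) and ingest.get("type") == "collector":
            collector = ingest.get("collector")
            if not isinstance(collector, dict):
                collector = {}  # e.g. hostName/port were not nested under it
            for key in ("hostName", "port"):
                if key not in collector:
                    problems.append(f"{name!r}: ingest.collector.{key} is missing")
    return problems


if __name__ == "__main__":
    for problem in check_config(CONFIG) or ["no problems found"]:
        print(problem)
```

If `hostName` and `port` end up as siblings of `collector` instead of children (a typical indentation slip), this check reports both keys as missing under `ingest.collector`, which would be consistent with the "hostname not specified" error.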

Run it with:

./flowlogs-pipeline --config ./test.yaml 
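
To check that the collector is actually receiving flows, one option is to send it a hand-built NetFlow v5 datagram over UDP on the configured port. The sketch below is an illustration only: the function names, addresses, and counter values are made up for the test, and the packet follows the standard NetFlow v5 layout (24-byte header plus one 48-byte record). How the pipeline renders the resulting flow on stdout has not been verified here.

```python
import socket
import struct
import time


def build_netflow_v5_packet(src_ip="10.0.0.1", dst_ip="10.0.0.2",
                            src_port=12345, dst_port=80,
                            packets=10, octets=1500):
    """Build a NetFlow v5 datagram containing a single made-up TCP flow record."""
    now = time.time()
    # Header: version, count, sys_uptime(ms), unix_secs, unix_nsecs,
    # flow_sequence, engine_type, engine_id, sampling_interval
    header = struct.pack(
        "!HHIIIIBBH",
        5, 1, 60_000, int(now), int((now % 1) * 1e9), 0, 0, 0, 0,
    )
    # Record: srcaddr, dstaddr, nexthop, input, output, dPkts, dOctets,
    # first, last, srcport, dstport, pad1, tcp_flags, prot, tos,
    # src_as, dst_as, src_mask, dst_mask, pad2
    record = struct.pack(
        "!4s4s4sHHIIIIHHBBBBHHBBH",
        socket.inet_aton(src_ip), socket.inet_aton(dst_ip),
        socket.inet_aton("0.0.0.0"),
        0, 0, packets, octets, 1000, 2000,
        src_port, dst_port,
        0, 0x18, 6, 0,   # pad1, TCP flags (PSH+ACK), protocol 6 = TCP, ToS
        0, 0, 0, 0, 0,
    )
    return header + record


def send_packet(packet, host="127.0.0.1", port=2055):
    """Send one UDP datagram to the collector (port 2055 as in test.yaml)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(packet, (host, port))
```

With flowlogs-pipeline running against test.yaml, calling `send_packet(build_netflow_v5_packet())` from a Python shell on the same host should deliver one flow record to the collector.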

What decode: type should I use? I guess internally it is not json.

It has to be json at the moment, but that's definitely something that can be improved (cc @eranra @mariomac). It sounds like an easy performance win, so I created an issue: https://github.com/netobserv/flowlogs-pipeline/issues/212