Closed GeorgeMaged closed 7 months ago
Hello,
Thank you for evaluating GoFlow2.
Please specify which version you are using. There is a bug in v2.0.0 that is fixed in main.
You can check the number of flows received using the Prometheus endpoint.
How many flows per second are you emitting?
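To estimate the received rate, one can difference a Prometheus counter over a known interval. A minimal sketch with made-up sample values; in practice the two readings would come from GoFlow2's metrics endpoint (port 8080 in the compose setup, e.g. `curl -s http://localhost:8080/metrics | grep -i flow`), and the exact counter names vary by version:

```shell
# Estimate flows/second from two readings of a monotonically increasing
# Prometheus counter taken a fixed interval apart.
SAMPLE1=120000   # counter reading at t0 (hypothetical value)
SAMPLE2=125500   # counter reading at t0 + 10s (hypothetical value)
INTERVAL=10      # seconds between the two readings
echo $(( (SAMPLE2 - SAMPLE1) / INTERVAL ))   # flows per second
```

With the sample values above this prints 550 flows/second.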
I'm missing a lot of information about your environment (hardware, software used, ...) to be able to provide guidance. Have a look at the performance doc too.
The only thing that refers to CacheActiveTout is Cisco documentation, but I cannot find a description of what it does.
Actually, I am now using goflow v2. When running the containers, goflow2 can't start because it can't find format=bin, and its logs are:
2023-12-27 08:59:49 time="2023-12-27T06:59:49Z" level=fatal msg="Format bin not found"
2023-12-27T06:59:50.792073374Z time="2023-12-27T06:59:50Z" level=fatal msg="Format bin not found"
2023-12-27T06:59:52.514715087Z time="2023-12-27T06:59:52Z" level=fatal msg="Format bin not found"
2023-12-27T06:59:54.731755935Z time="2023-12-27T06:59:54Z" level=fatal msg="Format bin not found"
2023-12-27T06:59:56.286318676Z time="2023-12-27T06:59:56Z" level=fatal msg="Format bin not found"
2023-12-27T06:59:58.926916197Z time="2023-12-27T06:59:58Z" level=fatal msg="Format bin not found"
2023-12-27T07:00:02.831912213Z time="2023-12-27T07:00:02Z" level=fatal msg="Format bin not found"
When I comment out this line in the docker-compose file, it starts, but then there is a problem consuming the flows topic in ClickHouse.
Can you give me the whole command line and more information about your environment (Docker versions, etc.)?
Docker Version:

Client:
 Cloud integration: v1.0.35+desktop.5
 Version: 24.0.6
 API version: 1.43
 Go version: go1.20.7
 Git commit: ed223bc
 Built: Mon Sep 4 12:32:48 2023
 OS/Arch: windows/amd64
 Context: default

Server: Docker Desktop 4.24.0 (122432)
 Engine:
  Version: 24.0.6
  API version: 1.43 (minimum version 1.12)
  Go version: go1.20.7
  Git commit: 1a79695
  Built: Mon Sep 4 12:32:16 2023
  OS/Arch: linux/amd64
  Experimental: false
 containerd:
  Version: 1.6.22
  GitCommit: 8165feabfdfe38c65b599c4993d227328c231fca
 runc:
  Version: 1.1.8
  GitCommit: v1.1.8-0-g82f18fe
 docker-init:
  Version: 0.19.0
  GitCommit: de40ad0
The command for the goflow2 container:
goflow2:
  build:
    context: ../../
    dockerfile: Dockerfile
    args:
      VERSION: compose
      LDFLAGS: -X main.version=compose
  image: netsampler/goflow2
  depends_on:
    - kafka
  ports:
    - 8080:8080
    - 6343:6343/udp
    - 7779:2055/udp
  restart: always
  command:
    - -transport.kafka.brokers=172.29.80.1:6011
    - -transport=kafka
    - -transport.kafka.topic=flows
    - -format=bin
Can you run docker images? Then make sure you have the latest version of netsampler/goflow2 by doing docker pull netsampler/goflow2, or by adding :e4a14c2 after the image name in docker-compose.yml:
image: netsampler/goflow2:e4a14c2
That solved the issue! But there is one last problem regarding the format. I am using a dummy NetFlow generator for testing purposes, and here is the format of the flows inside ClickHouse:
And here is the format inside the flows topic of Kafka:
What am I missing? I am using the 'Protobuf' format in ClickHouse's create.sh.
For ClickHouse: some data is stored as raw binary for performance. I don't know DBeaver, but using direct SQL, the IP addresses in text form can be obtained using functions like the following:
SELECT if(etype = 0x800, IPv4NumToString(reinterpretAsUInt32(substring(reverse(src_addr), 13,4))), IPv6NumToString(src_addr)) as srcip
I would also recommend avoiding queries against flows, since it's a "transition" table that feeds flows_raw: data that is queried from it will be removed and not inserted.
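Putting the two points together, a sketch of a query against the persistent flows_raw table instead; the etype/src_addr column names follow the compose's ClickHouse schema, so adjust if yours differs:

```sql
-- Query the persistent table (flows_raw), not the Kafka "transition"
-- table (flows), and render source addresses as text.
SELECT
    if(etype = 0x800,
       IPv4NumToString(reinterpretAsUInt32(substring(reverse(src_addr), 13, 4))),
       IPv6NumToString(src_addr)) AS srcip,
    count() AS flows
FROM flows_raw
GROUP BY srcip
ORDER BY flows DESC
LIMIT 10
```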
For Kafka: the data is serialized as protobuf which is a binary format. You need other tools to explore this.
This won't be compatible with ClickHouse without modifications, but if you use -format=json or -format=text instead of -format=bin, you'll get a human-readable message inside Kafka, albeit with lower performance due to encoding and more traffic.
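For example, the command section of the goflow2 service shown earlier would become the following (same brokers and topic as above; only the format flag changes):

```yaml
command:
  - -transport=kafka
  - -transport.kafka.brokers=172.29.80.1:6011
  - -transport.kafka.topic=flows
  - -format=json
```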
Yes, it is readable, but my issue is that when I choose -format=json or -format=text, nothing is consumed into ClickHouse and the database is empty.
Yes: it's on purpose to optimize query performance and storage. Other behaviors won't be supported.
If you wish to use JSON, you need to edit the various flows_* tables from the Protobuf format into JSON and make sure the names are correctly mapped from the JSON payload into the columns.
This will also break the Grafana dashboards shipping within the compose.
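A sketch of what that change could look like for the Kafka-engine table; the column list is abbreviated and the column names and broker address are assumptions to be adapted to the actual create.sh schema:

```sql
CREATE TABLE flows
(
    -- abbreviated: mirror the full column list of your actual schema,
    -- with names matching the JSON field names
    time_received UInt64,
    src_addr      String,
    dst_addr      String
)
ENGINE = Kafka()
SETTINGS
    kafka_broker_list = 'kafka:9092',  -- assumption: broker as seen from ClickHouse
    kafka_topic_list  = 'flows',
    kafka_group_name  = 'clickhouse',
    kafka_format      = 'JSONEachRow'; -- instead of 'Protobuf'
```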
Closing as stale
Hello there, I am using GoFlow2 right now for a PoC, and I also have a NetFlow collector that already exists in my environment. My doubt is: how can I make sure that the GoFlow2 collector actually captures all the flows exported to it? How can I check whether any flows are missing? When I count the flows in a 5-minute window in ClickHouse, I find a difference between the total number of flows received by GoFlow2 and by my existing collector, even though I believe both receive the same flows exported from the router (the flow exporter). Where should we start investigating in that case, and what differences/configurations can be set on the collector side?
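One way to narrow such a discrepancy down is to aggregate the GoFlow2 side into fixed 5-minute buckets and compare bucket by bucket with the other collector. A sketch, assuming a time_received DateTime column as in the compose schema (check your create.sh; in newer ClickHouse the function may be spelled toStartOfFiveMinutes):

```sql
SELECT
    toStartOfFiveMinute(time_received) AS bucket,
    count() AS flows
FROM flows_raw
GROUP BY bucket
ORDER BY bucket
```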
Also, to get more accurate results, CacheActiveTout on the exporter is configured to 300 seconds to avoid cache saturation and loss of data. Is there a corresponding configuration to be done on the collector (GoFlow2) side?