certtools / intelmq

IntelMQ is a solution for IT security teams for collecting and processing security feeds using a message queuing protocol.
https://docs.intelmq.org/latest/
GNU Affero General Public License v3.0

Spamhaus CERT parser uses wrong field #2165

Open gethvi opened 2 years ago

gethvi commented 2 years ago

I believe this should be using source.port:

https://github.com/certtools/intelmq/blob/ce60fd4bf65b0ca2d98ac05c4db658890dbac1a3/intelmq/bots/parsers/spamhaus/parser_cert.py#L168

sebix commented 2 years ago

Do you have an example for it? The header of the file says "local_port". If in doubt, we can just ask them.

gethvi commented 2 years ago
; 1 - Infected IP
; 2 - ASN
; 3 - Country Code
; 4 - Lastseen Timestamp (in UTC)
; 5 - Bot Name
;   Command & Control (C&C) information, if available:
; 6 - C&C Domain
; 7 - Remote IP (connecting to)
; 8 - Remote Port (connecting to)
; 9 - Local Port
; 10 - Protocol
;   Additional fields may be added in the future without notice
;
; ip, asn, country, lastseen, botname, domain, remote_ip, remote_port, local_port, protocol
;
95.xxx.xxx.xxx,AS43708,CZ,1650418723,ranbyus,ewapgpmqnwneejqgo.org,216.218.185.162,80,46254,tcp
95.xxx.xxx.xxx,AS43708,CZ,1649399089,matsnu,managementpause.com,216.218.185.162,80,51646,tcp

When the event describes a connection (src -> dst), I always assumed that source.port means local_port. Is that wrong?
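To make the question concrete, here is a minimal sketch of the mapping being suggested, where `local_port` is read as `source.port`. It is not the code in `parser_cert.py`; the function names `parse_feed` and `to_event` are hypothetical, and the event is a plain dict rather than an IntelMQ event object.

```python
import csv

# Column order taken from the feed header quoted above.
COLUMNS = ["ip", "asn", "country", "lastseen", "botname",
           "domain", "remote_ip", "remote_port", "local_port", "protocol"]

SAMPLE = """\
; ip, asn, country, lastseen, botname, domain, remote_ip, remote_port, local_port, protocol
;
95.xxx.xxx.xxx,AS43708,CZ,1650418723,ranbyus,ewapgpmqnwneejqgo.org,216.218.185.162,80,46254,tcp
95.xxx.xxx.xxx,AS43708,CZ,1649399089,matsnu,managementpause.com,216.218.185.162,80,51646,tcp
"""


def parse_feed(text):
    """Yield one dict per data row, skipping blank lines and ';' comments."""
    data_lines = [line for line in text.splitlines()
                  if line.strip() and not line.lstrip().startswith(";")]
    for row in csv.reader(data_lines):
        yield dict(zip(COLUMNS, row))


def to_event(row):
    """Map a row to IntelMQ-style keys, assuming local_port is source.port.

    This mapping is the suggestion under discussion, not confirmed behaviour.
    """
    return {
        "source.ip": row["ip"],
        "source.port": int(row["local_port"]),
        "destination.ip": row["remote_ip"],
        "destination.port": int(row["remote_port"]),
        "destination.fqdn": row["domain"],
        "protocol.transport": row["protocol"],
    }


events = [to_event(row) for row in parse_feed(SAMPLE)]
for event in events:
    print(event["source.port"], "->", event["destination.port"])
```

With this reading, the infected host's high-range port ends up next to its IP address, and the C&C port stays on the destination side.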

sebix commented 2 years ago

> When the event describes a connection (src -> dst), I always assumed that source.port means local_port. Is that wrong?

That depends on what "local port" means. If, for example, you make a connection to a remote service, the local port is some high-range port that is used more or less only internally on the host and is not really useful on the outside. In those terms it is local, yes.

The second issue is source vs. destination, which is very tricky in IntelMQ: the data format uses these terms for historical reasons, but with a different meaning. Originally the terms were used as in network flow data (source and destination of a network connection). When describing IoCs that becomes tricky, because the relevant part (source or destination) varies by the type of the IoC (classification.type) and is not even consistent within the taxonomies. For that reason, the decision was that source is always the part we care most about, even if it is the destination from a network perspective (e.g. a C&C server, to which the clients usually connect, is the source). See also https://intelmq.readthedocs.io/en/maintenance/dev/data-format.html#meaning-of-source-and-destination-identities
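The convention can be illustrated with a small sketch. It assumes a connection is described by a client endpoint and a server endpoint, and that the caller states which of the two the report is about. The function `assign_identities` and its `subject` parameter are hypothetical and not part of IntelMQ.

```python
def assign_identities(client, server, subject):
    """Return an event dict where 'source' is the endpoint the report is about.

    client, server: dicts such as {"ip": ..., "port": ...} for the two ends
    of a connection that the client opened to the server.
    subject: "client" or "server", the side the report cares most about.
    """
    if subject == "client":
        source, destination = client, server
    elif subject == "server":
        source, destination = server, client
    else:
        raise ValueError("subject must be 'client' or 'server'")

    event = {}
    for prefix, endpoint in (("source", source), ("destination", destination)):
        for key, value in endpoint.items():
            event[f"{prefix}.{key}"] = value
    return event


# Documentation-range addresses, used purely as placeholders.
bot = {"ip": "192.0.2.10", "port": 46254}
c2 = {"ip": "198.51.100.7", "port": 80}

# Report about the infected machine: the connecting client is the source.
print(assign_identities(bot, c2, subject="client"))

# Report about the C&C server: the server is the source, although it is
# the destination of the connection from a network perspective.
print(assign_identities(bot, c2, subject="server"))
```

In the second call `source.port` is the listening port 80, which shows why `source.port` alone does not say which side opened the connection.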

gethvi commented 2 years ago

Could we introduce source.ephemeral_port and destination.ephemeral_port which would make it clear where the connection originated?

Example:

destination.ephemeral_port -> source.port (connection made from "destination" host)
source.ephemeral_port -> destination.port (connection made from "source" host)
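A rough sketch of how such fields could be filled in, assuming the parser knows which side opened the connection. `source.ephemeral_port` and `destination.ephemeral_port` are only the fields proposed here and do not exist in the IntelMQ data format; the helper `add_ports` is hypothetical as well.

```python
def add_ports(event, initiator, ephemeral_port, service_port):
    """Return a copy of event with the ports placed per the proposal.

    initiator: "source" or "destination", the side that opened the connection.
    The initiator gets the (proposed, not existing) '<side>.ephemeral_port';
    the other side gets the regular '<side>.port'.
    """
    if initiator not in ("source", "destination"):
        raise ValueError("initiator must be 'source' or 'destination'")
    responder = "destination" if initiator == "source" else "source"

    result = dict(event)  # leave the caller's dict untouched
    result[f"{initiator}.ephemeral_port"] = ephemeral_port
    result[f"{responder}.port"] = service_port
    return result


# Connection made from the "source" host (e.g. an infected client):
print(add_ports({}, "source", 46254, 80))

# Connection made from the "destination" host (e.g. the report is about a server):
print(add_ports({}, "destination", 46254, 80))
```

Either way, the presence of an `ephemeral_port` key marks the side where the connection originated.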

I would not consider this information "not really useful". When we send an abuse notice, the recipient often likes to verify what we are reporting in their own logging/monitoring system. This information makes that verification easier for them.