certtools / intelmq

IntelMQ is a solution for IT security teams for collecting and processing security feeds using a message queuing protocol.
https://docs.intelmq.org/latest/
GNU Affero General Public License v3.0

Expert Bot - Cymru has a parsing bug #35

Closed SYNchroACK closed 6 years ago

SYNchroACK commented 10 years ago

The Cymru expert bot returns two ASNs for a single IP.

SYNchroACK commented 10 years ago

Event:

feed=arbor, reported_ip=219.234.88.247, source_time=2014-07-11T09:48:18.020624, feed_url=http://atlas-public.ec2.arbor.net/public/ssh_attackers, ip=219.234.88.247, observation_time=2014-07-11T09:48:18.020645, source_ip=219.234.88.247, registry=apnic, taxonomy="Intrusion Attempts", bgp_prefix=219.234.80.0/20, allocated=2002-04-17, cymru_cc=CN, type=brute-force, asn="9395 17431"

Cymru Whois:

$ whois -h whois.cymru.com " -v 219.234.88.247"
[Querying whois.cymru.com]
[whois.cymru.com]
AS      | IP               | BGP Prefix          | CC | Registry | Allocated  | AS Name
9395    | 219.234.88.247   | 219.234.80.0/20     | CN | apnic    | 2002-04-17 | GUOXIN-BILIN BeiJing Guoxin bilin Telecom Technology Co.,Ltd,CN
17431   | 219.234.88.247   | 219.234.80.0/20     | CN | apnic    | 2002-04-17 | TONET Beijing TONEK Information Technology Development Company,CN

aaronkaplan commented 10 years ago

Recommendation for an algorithm: 1) take the line with the most specific (longest) BGP prefix match. 2) If the prefixes are the same, take the first entry (the first ASN).
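A minimal sketch of that recommendation, assuming the Whois rows have already been split into `(asn, bgp_prefix)` tuples in the order Cymru returned them. The function name `pick_row` is hypothetical and not part of IntelMQ.

```python
import ipaddress


def pick_row(rows):
    """Pick the row with the longest BGP prefix; on a tie, keep the first row.

    rows: list of (asn, bgp_prefix) tuples in the order they were returned.
    """
    # max() returns the first maximal element it encounters, so rows with
    # equal prefix lengths resolve to the earliest entry (the first ASN).
    return max(rows, key=lambda row: ipaddress.ip_network(row[1]).prefixlen)


# The two rows from the Whois output in this issue: same /20, so the first wins.
rows = [("9395", "219.234.80.0/20"), ("17431", "219.234.80.0/20")]
print(pick_row(rows))
```

With identical prefixes this reduces to "take the first ASN", so the first step only matters when the rows carry different prefix lengths.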

SYNchroACK commented 10 years ago

Thank you Aaron for your feedback.

Search for the following text at http://www.team-cymru.org/Services/ip-to-asn.html#dns

$ dig +short 31.108.90.216.peer.asn.cymru.com TXT
 "701 1239 3549 3561 7132 | 216.90.108.0/24 | US | arin | 1998-09-25"

It seems that the only difference from the normal case is that the ASN field holds multiple values in the DNS query output (the current Cymru bot uses DNS because it is faster than Whois).

Do you have any feedback regarding this new information? Should we just take the first ASN and ignore the others?

Again, thank you for your help ;)

EDITED: Check line 78 in https://github.com/certtools/intelmq/blob/master/src/bots/experts/cymru/cymrulib.py

aaronkaplan commented 10 years ago

peer.asn.cymru.com means: "The peer.asn.cymru.com zone is used to map an IP address or prefix to the possible BGP peer ASNs that are one AS hop away from the BGP Origin ASN's prefix."

So those are the ASNs that are one hop away. That's not the ASN of the IP! Be careful not to confuse the two.

SYNchroACK commented 10 years ago

Oh sorry... my mistake in choosing that example from the Cymru page. The question is still the same... look at the following example:

$ dig +short 247.88.234.219.origin.asn.cymru.com TXT
"9395 17431 | 219.234.80.0/20 | CN | apnic | 2002-04-17"

Which ASN should the bot choose?
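To make the shape of the problem concrete, here is a rough sketch of building the `origin.asn.cymru.com` query name for an IPv4 address and splitting a TXT answer like the one above into fields. It does no DNS lookup itself, and the function names and dictionary keys are illustrative assumptions, not the code in `cymrulib.py`.

```python
import ipaddress


def origin_query_name(ip):
    """Build the reversed-octet query name for an IPv4 address."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + ".origin.asn.cymru.com"


def parse_origin_txt(txt):
    """Split one TXT answer into fields; the ASN field may hold several values."""
    fields = [field.strip() for field in txt.strip().strip('"').split("|")]
    asns, prefix, cc, registry, allocated = fields
    return {
        "asns": asns.split(),  # a list, so multiple origin ASNs are kept
        "bgp_prefix": prefix,
        "cc": cc,
        "registry": registry,
        "allocated": allocated,
    }


print(origin_query_name("219.234.88.247"))
record = parse_origin_txt('"9395 17431 | 219.234.80.0/20 | CN | apnic | 2002-04-17"')
print(record["asns"])
```

Keeping the ASNs as a list leaves the decision about which one to use to a later step.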

aaronkaplan commented 10 years ago

Hmmm... I checked my BGP routing table and it says that this IP address range is reachable via AS 9395. I'd recommend asking Cymru.

SYNchroACK commented 10 years ago

Team Cymru's answer:

The reason you see two ASNs for your query is as follows:

We monitor route announcements from multiple locations and from multiple providers.
In some cases a network prefix will be announced by multiple, but disparate, networks
or autonomous systems. The most likely reason for this is something known as
"multihoming". This is perfectly normal. Depending on your view of the Internet topology
and the originating network's policies, one of those originating networks will be the
preferred path for sending and receiving traffic with the netblock in question. We
choose to show you the list of all those we know about.

What is your feedback, Aaron?

aaronkaplan commented 10 years ago

Hmmm... well, I'd take the first one for now. We then have to think about how to actually deal with multiple ASNs per IP in the future.

SYNchroACK commented 10 years ago

What do you think about creating multiple similar events, one per ASN? Does that make sense, or is it just stupid to duplicate? Or should we go back to the multiple-fields feature? And how should we handle generating CSV files from events with multiple values per key?

...lots of questions... I have had this issue in other similar projects and I never found a really good solution...

Note: abusehelper supports multiple values, but in the end, when it sends the information, it drops all ASNs except one. :neckbeard:
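For discussion, the "one event per ASN" option could look roughly like the sketch below. It assumes an event is a plain dict whose `asn` value is the space-separated string from the example event; `split_event` and `to_csv` are hypothetical helpers, not IntelMQ APIs.

```python
import csv
import io


def split_event(event):
    """Return one copy of the event per ASN found in the 'asn' field."""
    return [dict(event, asn=asn) for asn in event["asn"].split()]


def to_csv(events, fieldnames):
    """Render events as CSV text; each row then carries exactly one ASN."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(events)
    return buf.getvalue()


event = {"ip": "219.234.88.247", "asn": "9395 17431"}
print(to_csv(split_event(event), ["ip", "asn"]))
```

This keeps every CSV cell single-valued, at the cost of duplicating the rest of the event once per ASN.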

Rafiot commented 10 years ago

This is the tool we use for that: https://github.com/CIRCL/IP-ASN-history

My approach is the following: I get the bview file from RIPE, and for each object, I get the latest AS in the path.
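A small sketch of that last step, reading "the latest AS in the path" as the last (rightmost) entry of an AS path. That reading is an assumption, as is the input format: a space-separated string of AS numbers that has already been extracted from the dump. The sample path is made up for illustration.

```python
def origin_asn(as_path):
    """Return the last AS number in a space-separated AS path, or None."""
    tokens = as_path.split()
    if not tokens:
        return None
    last = tokens[-1]
    # Anything that is not a plain number (for example an AS set written
    # in braces) is left undecided instead of guessed.
    return int(last) if last.isdigit() else None


print(origin_asn("3333 1299 4837 9395"))
```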

For the ASN description, we use this one: https://github.com/CIRCL/ASN-Description-History

ghost commented 6 years ago

Closing this as a duplicate in favor of #543, as the other report has a more recent discussion, including the temporary (for some releases now) fix. Its title is also more meaningful.