mlsecproject / combine

Tool to gather Threat Intelligence indicators from publicly available sources
https://www.mlsecproject.org/
GNU General Public License v3.0

Looks like AS number and name enrichments are not working #67

Closed: alexcpsec closed this issue 10 years ago

alexcpsec commented 10 years ago

Maybe I broke it when I updated the files, maybe not. This needs investigating.

krmaxwell commented 10 years ago

What specifically did you observe?

alexcpsec commented 10 years ago

I sampled a few entries from crop.json to keep the processing time down, roughly the top 25 from yesterday.

After running winnower, all of them had countries (some had rhosts from DNSDB) but NONE had asnumber/asname. So I figured something could be wrong.

krmaxwell commented 10 years ago

Hrm, having trouble reproducing this.

alexcpsec commented 10 years ago

Try with this:

[
  [
    "27.159.210.82", 
    "IPv4", 
    "inbound", 
    "http://www.projecthoneypot.org/list_of_ips.php?rss=1", 
    "", 
    "2014-09-15"
  ], 
  [
    "120.33.245.248", 
    "IPv4", 
    "inbound", 
    "http://www.projecthoneypot.org/list_of_ips.php?rss=1", 
    "", 
    "2014-09-15"
  ], 
  [
    "62.210.148.172", 
    "IPv4", 
    "inbound", 
    "http://www.projecthoneypot.org/list_of_ips.php?rss=1", 
    "", 
    "2014-09-15"
  ], 
  [
    "46.39.255.195", 
    "IPv4", 
    "inbound", 
    "http://www.projecthoneypot.org/list_of_ips.php?rss=1", 
    "", 
    "2014-09-15"
  ], 
  [
    "110.89.36.219", 
    "IPv4", 
    "inbound", 
    "http://www.projecthoneypot.org/list_of_ips.php?rss=1", 
    "", 
    "2014-09-15"
  ], 
  [
    "91.200.12.14", 
    "IPv4", 
    "inbound", 
    "http://www.projecthoneypot.org/list_of_ips.php?rss=1", 
    "", 
    "2014-09-15"
  ]
]

krmaxwell commented 10 years ago

okay this is something I can work with!

krmaxwell commented 10 years ago

OK, major logic fail on my part. Every time we read a new row from data/GeoIPASNum2.csv, we assign the range in that row to the specified ASN. But this is wrong because many ASNs have multiple ranges, AS4134 Chinanet in particular.

So effectively the dict only contains the last range for each ASN. Working on fixing this now.
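
To illustrate the failure mode described above, here is a minimal sketch, not the project's actual code. It assumes each row carries a range start, a range end, and an "ASnnnn Name" label, and that ranges do not overlap. The function names, the sample ranges, and the AS64500 label are all hypothetical and chosen only for illustration.

```python
import ipaddress
from bisect import bisect_right


def ip_to_int(ip):
    """Convert a dotted-quad IPv4 string to its integer value."""
    return int(ipaddress.ip_address(ip))


# Hypothetical rows shaped as (range_start, range_end, "ASnnnn Name").
# The ranges are made up for illustration, not taken from the real file.
ROWS = [
    (ip_to_int("27.159.0.0"), ip_to_int("27.159.255.255"), "AS4134 Chinanet"),
    (ip_to_int("62.210.0.0"), ip_to_int("62.210.255.255"), "AS64500 ExampleNet"),
    (ip_to_int("120.33.0.0"), ip_to_int("120.33.255.255"), "AS4134 Chinanet"),
]


def build_buggy(rows):
    """Key the dict by ASN label: a later row overwrites an earlier one."""
    ranges = {}
    for start, end, label in rows:
        ranges[label] = (start, end)  # only the last range per ASN survives
    return ranges


def lookup_buggy(ranges, ip):
    n = ip_to_int(ip)
    for label, (start, end) in ranges.items():
        if start <= n <= end:
            return label
    return None


def build_index(rows):
    """Keep every range, sorted by start, so nothing gets overwritten."""
    index = sorted(rows)
    starts = [row[0] for row in index]
    return starts, index


def lookup(starts, index, ip):
    """Binary-search for the last range starting at or before the address."""
    n = ip_to_int(ip)
    i = bisect_right(starts, n) - 1
    if i >= 0:
        start, end, label = index[i]
        if n <= end:
            return label
    return None


buggy = build_buggy(ROWS)
starts, index = build_index(ROWS)

# The first Chinanet range was overwritten, so this address is missed.
print(lookup_buggy(buggy, "27.159.210.82"))
# Keeping all ranges lets the same address resolve.
print(lookup(starts, index, "27.159.210.82"))
```

Keying by range rather than by ASN keeps every row, and the sorted list of start values allows a binary search instead of a scan over all ranges. If the label is needed as separate asnumber and asname fields, `label.split(" ", 1)` is one way to split it, assuming the label always has that shape.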

alexcpsec commented 10 years ago

AH! Of course. Completely forgot about that as well.