manuelkasper / AS-Stats

A simple tool to generate per-AS traffic graphs from NetFlow/sFlow records
BSD 2-Clause "Simplified" License

generate-asinfo generates wrong information #57

Open twiddern opened 7 years ago

twiddern commented 7 years ago

I noticed this some months ago but didn't raise an issue here, and I also couldn't manage to make a fix.

Currently generate-asinfo.py generates broken output. The information seems OK if you inspect it manually, but when asinfo.txt is loaded, RIPE entries are often broken. Information from the other databases that Cymru collects also sometimes seems a bit "wrong".

I'm not sure whether the Python script needs to be adjusted or whether AS-Stats should be adjusted to display the information in a "new format" from Cymru. I'm sorry that I can't give an example right now.

For whoever fixes this: I also noticed that you can't query 300k ASNs at once via netcat to Cymru, so the query should probably be split up in future.
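One way the splitting could look is sketched below. It assumes the AS numbers are in a file with one bare number per line, and that the bulk interface takes the `begin` / `verbose` / `end` framing. The function name `make_batches`, the batch size, and the output file names are all made up for illustration.

```sh
#!/bin/sh
# Sketch only: split a list of AS numbers into bulk-whois batch files.
# make_batches INPUT_FILE BATCH_SIZE OUT_PREFIX   (hypothetical helper)
# Writes OUT_PREFIX.1, OUT_PREFIX.2, ... and prints the number of batches.
make_batches() {
    awk -v size="$2" -v prefix="$3" '
        function open_batch()  { n++; out = prefix "." n
                                 print "begin" > out; print "verbose" > out }
        function close_batch() { print "end" > out; close(out) }
        NF == 0 { next }                        # ignore blank lines
        count % size == 0 {                     # current batch is full (or first line)
            if (count > 0) close_batch()
            open_batch()
        }
        { print ("AS" $1) > out; count++ }      # one query line per AS number
        END { if (count > 0) close_batch(); print n + 0 }
    ' "$1"
}
```

Each batch file could then be sent on its own, for example with `nc whois.cymru.com 43 < batch.1`, with a short pause between batches. What batch size the server tolerates is not known here and would need testing.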

crazzy commented 7 years ago

I have spent some time writing a new script that follows the same format as the existing bundled file. My script is far from perfect, but it works. Some points that still need fixing are:

What my script does do is go directly to the source: it checks at IANA where each AS number is assigned, then queries the correct whois server for the data. It additionally downloads a file from the RIPE FTP server to do the ASN-to-country mapping for RIPE. My script also doesn't look up data for every possible ASN, only for the ASNs we've actually seen traffic from. So maybe this is something to refine and possibly include in AS-Stats under contrib?
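The first step, finding which whois server is responsible for an AS number, can be shown in isolation. This is a minimal sketch, not part of the script: it assumes the IANA CSV has the AS number or range in column 1 and the whois server in column 3, with no quoted commas before that column. The name `lookup_whois` is hypothetical.

```sh
#!/bin/sh
# Sketch only: find the whois server for one AS number in an IANA as-numbers CSV.
# lookup_whois ASN CSV_FILE   (hypothetical helper)
lookup_whois() {
    awk -F ',' -v asn="$1" '
        $1 !~ /^[0-9]+(-[0-9]+)?$/ { next }    # skip the header and malformed rows
        $3 == "" { next }                      # reserved/unallocated rows list no server
        {
            n  = split($1, r, "-")
            lo = r[1] + 0
            hi = (n == 2 ? r[2] : r[1]) + 0    # a single number is a range of one
            if (asn + 0 >= lo && asn + 0 <= hi) { print $3; exit }
        }
    ' "$2"
}
```

It could be called as `lookup_whois 3333 as-numbers-1.csv` against a downloaded copy of the registry. Empty output means no row with a whois server covers that number.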

#!/usr/bin/zsh

zmodload -m -F zsh/files b:zf_\*

rrd_path=/root/AS-Stats/rrd
asinfo_path=/tmp/asinfo.txt
asn16_tmp=""
asn32_tmp=""
asn_mapper_tmp=""
ripe_as_cc_tmp=""
curl=/usr/bin/curl
whois=/usr/bin/whois
sed=/bin/sed
grep=/bin/grep
awk=/usr/bin/awk
iana_as_alloc_16="http://www.iana.org/assignments/as-numbers/as-numbers-1.csv"
iana_as_alloc_32="http://www.iana.org/assignments/as-numbers/as-numbers-2.csv"
ripe_as_to_country="ftp://ftp.ripe.net/pub/stats/ripencc/delegated-ripencc-latest"

get_as_data() {
    case $asnum in # There are a number of special purpose AS numbers that we should handle manually
        0)
            return # Reserved RFC 7607
        ;;
        112)
            echo -e "112\tROOTSERV\tDNS-OARC,US\tUS" # RFC 7534
        ;;
        23456)
            echo -e "23456\tIANA-ASTRANS\tIANA,US\tUS" # RFC 6793
        ;;
        <64496-64511>)
            return # Reserved for documentation RFC 5398
        ;;
        <64512-65534>)
            echo -e "$asnum\tPRIVATE-AS-$asnum\tPRIVATE,US\tUS" # RFC 6996, could be valid for the purpose of AS-Stats
        ;;
        65535)
            return # Reserved last AS RFC 7300
        ;;
        <65536-65551>)
            return # Reserved for documentation RFC 5398
        ;;
        <4200000000-4294967294>)
            echo -e "$asnum\tPRIVATE-AS-$asnum\tPRIVATE,US\tUS" # RFC 6996, could be valid for the purpose of AS-Stats
        ;;
        4294967295)
            return # Reserved last AS RFC 7300
        ;;
        *)
            query_for_as $asnum
        ;;
    esac
}

handler_arin() {
    asnum="$1"
    whois_server="$2"
    rawdata=$($whois -h $whois_server AS$asnum 2>/dev/null)
    asname=$(echo "$rawdata" | $grep -m1 '^ASName:' | $awk '{print $NF;}')
    orgname=$(echo "$rawdata" | $grep -m1 '^OrgName:' | $awk -F : '{print $NF;}' | $sed 's,^\s*,,') # strip leading whitespace (a bare + is literal in basic regex)
    country=$(echo "$rawdata" | $grep -m1 '^Country:' | $awk '{print $NF;}')
    if [ "$asname" = "" ]; then
        return
    fi
    echo -e "$asnum\t$asname\t$orgname,$country\t$country"
}

handler_ripe() {
    asnum="$1"
    whois_server="$2"
    rawdata=$($whois -h $whois_server AS$asnum 2>/dev/null)
    asname=$(echo "$rawdata" | $grep -m1 '^as-name:' | $awk '{print $NF;}')
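    orgname=$(echo "$rawdata" | $grep -m1 '^org-name:' | $awk -F : '{print $NF;}' | $sed 's,^\s*,,') # strip leading whitespace
    # Delegated file fields: registry|cc|type|start|count|date|status, so the country code is field 2
    country=$($awk -F '|' -v asn="$asnum" '$3 == "asn" && $2 != "*" && $4+0 <= asn+0 && asn+0 < $4+$5 {print $2; exit}' $ripe_as_cc_tmp)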
    if [ "$asname" = "" ]; then
        return
    fi
    echo -e "$asnum\t$asname\t$orgname,$country\t$country"
}

handler_apnic() {
    asnum="$1"
    whois_server="$2"
    rawdata=$($whois -h $whois_server AS$asnum 2>/dev/null)
    echo "$rawdata" | $grep -q '^remarks:.*whois.nic.ad.jp'
    if [ $? -ne 0 ]; then # general apnic
        asname=$(echo "$rawdata" | $grep -m1 '^as-name:' | $awk '{print $NF;}')
        orgname=$(echo "$rawdata" | $grep -A1 -m1 '^as-name:' | $grep '^descr:' | $awk -F : '{print $NF;}' | $sed 's,^\s*,,') # strip leading whitespace
        country=$(echo "$rawdata" | $grep -m1 '^country:' | $awk '{print $NF;}')
    else #JPNIC
        rawdata=$($whois -h whois.nic.ad.jp AS$asnum/e 2>/dev/null)
        asname=$(echo "$rawdata" | $grep -m1 '^b\.' | $awk '{print $NF;}')
        orgname=$(echo "$rawdata" | $grep -m1 '^g\.' | $awk -F ']' '{print $NF;}' | $sed 's,^\s*,,') # strip leading whitespace
        country="JP"
    fi
    if [ "$asname" = "" ]; then
        return
    fi
    echo -e "$asnum\t$asname\t$orgname,$country\t$country"
}

handler_lacnic() {
    asnum="$1"
    whois_server="$2"
    rawdata=$($whois -h $whois_server AS$asnum 2>/dev/null)
    asname="UNSPECIFIED"
    orgname=$(echo "$rawdata" | $grep -m1 '^owner:' | $awk -F : '{print $NF;}' | $sed 's,^\s*,,') # strip leading whitespace
    country=$(echo "$rawdata" | $grep -m1 '^country:' | $awk '{print $NF;}')
    if [ "$orgname" = "" ]; then # asname is a fixed placeholder here, so test the owner field instead
        return
    fi
    echo -e "$asnum\t$asname\t$orgname,$country\t$country"
}

handler_afrinic() {
    asnum="$1"
    whois_server="$2"
    rawdata=$($whois -h $whois_server AS$asnum 2>/dev/null)
    asname=$(echo "$rawdata" | $grep -m1 '^as-name:' | $awk '{print $NF;}')
    orgname=$(echo "$rawdata" | $grep -m1 '^org-name:' | $awk -F : '{print $NF;}' | $sed 's,^\s*,,') # strip leading whitespace
    country=$(echo "$rawdata" | $grep -m1 '^country:' | $awk '{print $NF;}')
    echo -e "$asnum\t$asname\t$orgname,$country\t$country"
}

query_for_as() {
    prepare_rir_maps
    asnum="$1"
    . $asn_mapper_tmp
    case $as_alloc in
        *ARIN*)
            handler_arin $asnum $whois_server
        ;;
        *RIPE*)
            handler_ripe $asnum $whois_server
        ;;
        *LACNIC*)
            handler_lacnic $asnum $whois_server
        ;;
        *APNIC*)
            handler_apnic $asnum $whois_server
        ;;
        *AFRINIC*)
            handler_afrinic $asnum $whois_server
        ;;
        *)
            return # Unknown RIR
        ;;
    esac
}

prepare_rir_maps() {
    if [[ -z "${asn_mapper_tmp// }" ]]; then
        asn16_tmp==(:)
        asn32_tmp==(:)
        ripe_as_cc_tmp==(:) # =(:) yields a temp file name, like the two lines above
        $curl -sS -o $asn16_tmp $iana_as_alloc_16
        if [ $? -ne 0 ]; then
            echo "Failed to fetch AS number assignment plan from IANA" >&2
            exit 1
        fi
        $curl -sS -o $asn32_tmp $iana_as_alloc_32
        if [ $? -ne 0 ]; then
            echo "Failed to fetch AS number assignment plan from IANA" >&2
            exit 1
        fi
        $curl -sS -o $ripe_as_cc_tmp $ripe_as_to_country
        asn_mapper_tmp==(:)
        echo "case \$asnum in" > $asn_mapper_tmp
        (<$asn16_tmp <$asn32_tmp) | while read line; do
            parts=("${(@s/,/)line}")
            as_range=$parts[1]
            alloc=$parts[2]
            whois_server=$parts[3]
            if [[ -z "${whois_server// }" ]]; then
                continue # Non-interesting allocations do not have a whois server set
            fi
            if ! [[ $as_range =~ [0-9]+ ]]; then
                continue # Filters out the header at the top of the files
            fi
            if [[ $as_range = "0-65535" ]]; then
                continue # Only present in 32-bit registry, covers whole 16-bit registry referring to it
            fi
            if [[ $as_range =~ \- ]]; then
                is_range=1 # This is a range of AS numbers
                echo "<$as_range>)"
            else
                is_range=0 # This is a single AS number allocation
                echo "$as_range)"
            fi
            echo -e "\tas_alloc=\"$alloc\""
            echo -e "\twhois_server=\"$whois_server\""
            echo ";;"
        done >> $asn_mapper_tmp
        echo "esac" >> $asn_mapper_tmp
    fi
}

asinfo_tmp==(:)

for rrd in $(echo $rrd_path/??/*); do # We only update ASINFO for ASes we've seen traffic to/from
    bits=("${(@s,/,)rrd}")
    last_bit="${bits[${#bits}]}"
    bits=("${(@s,.,)last_bit}")
    asnum=$bits[1]
    get_as_data $asnum
done > $asinfo_tmp
zf_mv $asinfo_tmp $asinfo_path

twiddern commented 7 years ago

Thanks for this, but something isn't working correctly. Most country/flag information for RIPE countries is missing, and when a flag is displayed, it belongs to a region not serviced by RIPE.

crazzy commented 7 years ago

Yeah, I noticed there are still a few things to work out. I had to pause this project for a bit for other stuff (stuff that brings my employer money). But most of the data is there.

As can be seen in the script, I took the RIPE country data from ftp://ftp.ripe.net/pub/stats/ripencc/delegated-ripencc-latest, which I assumed would work out, but apparently it doesn't. Or I have a parsing bug. The thing is, RIPE doesn't reliably publish the country for AS numbers in whois like the rest of the RIRs do.
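On the possible parsing bug: in the RIR delegated statistics format, each record is `registry|cc|type|start|value|date|status`. The country code is therefore field 2, while field 4 holds the first AS number of the block and field 5 the number of consecutive ASNs. A minimal sketch of a lookup under that reading follows. The function name `ripe_country` is hypothetical, and whether this accounts for the missing flags has not been verified.

```sh
#!/bin/sh
# Sketch only: look up the country code for an AS number in a delegated-* file.
# ripe_country ASN DELEGATED_FILE   (hypothetical helper)
ripe_country() {
    awk -F '|' -v asn="$1" '
        $3 != "asn" || $2 == "*" { next }      # skip version, summary and non-ASN lines
        {
            first = $4 + 0
            last  = first + $5 - 1             # field 5 counts consecutive ASNs
            if (asn + 0 >= first && asn + 0 <= last) { print $2; exit }
        }
    ' "$2"
}
```

Comparing the fields numerically avoids the prefix matches a regular expression can produce, where a pattern for AS 123 also matches the record for AS 1234. Note that the code in this file is the country of registration and can be a non-country value such as `EU`.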