brunoerg / asmapy

Asmap stuff for Bitcoin Core
MIT License

Add custom data and augment the RIPE data with that #7

Open brunoerg opened 1 year ago

brunoerg commented 1 year ago

cc: @dunxen

dunxen commented 1 year ago

Thanks! Yeah so in addition to the RIPE data, we'd also want users to be able to supply their own router dump for creating maps (maybe via some CLI argument to the appropriate command(s)).
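To make the CLI idea concrete, here is a minimal sketch using Python's standard `argparse`. The program name `asmap-tool`, the positional `ripe_dir`, and the `--custom-dump` flag are all hypothetical names chosen for illustration; none of them are taken from asmapy's actual commands.

```python
import argparse


def build_parser():
    # Hypothetical CLI: the program name and both arguments are assumptions,
    # not asmapy's real interface.
    parser = argparse.ArgumentParser(prog="asmap-tool")
    parser.add_argument(
        "ripe_dir",
        help="directory holding the downloaded RIPE dumps",
    )
    parser.add_argument(
        "--custom-dump",
        action="append",   # allow the flag to be given more than once
        default=[],
        metavar="PATH",
        help="user-supplied router dump in the same format as the RIPE data",
    )
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(
    ["ripedump", "--custom-dump", "my-router-dump"]
)
print(args.ripe_dir, args.custom_dump)
```

Making the flag repeatable keeps the door open for more than one custom dump without changing the interface later.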

brunoerg commented 1 year ago

Nice!

Questions:

  1. In what format is the user's router dump? Is it the same format as the RIPE data we download, so that we can simply merge them?
  2. Does the user want to supply their whole router dump or only part of it?

dunxen commented 1 year ago

Yes, I was thinking of just being able to merge their whole dump (same format as RIPE) and using some strategy to resolve conflicts (maybe preferring the custom dump). If that makes sense?
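The conflict strategy could look roughly like the sketch below. It assumes both dumps have already been parsed into simple `prefix -> origin ASN` dictionaries, which is a simplification (a prefix can have several origins in real data). The function name `merge_routes` and the `prefer_custom` parameter are hypothetical.

```python
def merge_routes(ripe_routes, custom_routes, prefer_custom=True):
    """Merge two prefix -> ASN mappings and report conflicting prefixes."""
    merged = dict(ripe_routes)  # copy, so the RIPE mapping is left untouched
    conflicts = []
    for prefix, asn in custom_routes.items():
        if prefix in merged and merged[prefix] != asn:
            # Same prefix, different ASN: record it and apply the strategy.
            conflicts.append(prefix)
            if not prefer_custom:
                continue  # keep the RIPE value
        merged[prefix] = asn
    return merged, conflicts


# Illustrative data only (documentation prefixes and private-use ASNs).
ripe = {"198.51.100.0/24": "AS64496", "203.0.113.0/24": "AS64497"}
custom = {"198.51.100.0/24": "AS64500", "192.0.2.0/24": "AS64501"}

merged, conflicts = merge_routes(ripe, custom)
print(merged)
print(conflicts)
```

Returning the list of conflicting prefixes lets the caller log or review them instead of resolving them silently.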

brunoerg commented 1 year ago

Yes, makes sense! I'm gonna work on it!

Mistersx12 commented 1 year ago

Here's an example of how you might implement this, using a dictionary to store the custom data and a list to store the augmented routes and their AS paths:

import os
from bgpdumpy import BGPDump, TableDumpV2

def parse(dump_dir, custom_data, all_asn=False):
    routes = []
    for dump_file in os.scandir(dump_dir):
        if not dump_file.is_file():
            continue
        with BGPDump(dump_file.path) as bgp:
            print(f"Processing {dump_file.name}...")
            for entry in bgp:
                # Skip entries that are not TABLE_DUMP_V2 records.
                if not isinstance(entry.body, TableDumpV2):
                    continue

                prefix = f'{entry.body.prefix}/{entry.body.prefixLength}'
                # Text appended to every route whose prefix is in custom_data.
                extra = f' {custom_data[prefix]}' if prefix in custom_data else ''

                if not all_asn:
                    # Keep only the origin ASN (last hop of each AS path).
                    origins = {
                        route.attr.asPath.split()[-1]
                        for route in entry.body.routeEntries}
                    for asn in origins:
                        routes.append(f'{prefix} AS{asn}{extra}\n')
                else:
                    # Keep every distinct full AS path.
                    as_paths = {
                        route.attr.asPath
                        for route in entry.body.routeEntries}
                    for as_path in as_paths:
                        routes.append(f'{prefix}|{as_path}{extra}\n')

    # Write all collected routes to paths-<dump_dir>/routes.txt.
    out_dir = f'paths-{dump_dir}'
    os.makedirs(out_dir, exist_ok=True)
    with open(os.path.join(out_dir, 'routes.txt'), 'w') as w_file:
        w_file.writelines(routes)

# Load custom data from a file or database
custom_data = {
    '10.0.0.0/8': 'AS1 AS2 AS3',
    '172.16.0.0/12': 'AS4 AS5 AS6',
    # ...
}

# Call the parse function with the custom data
parse('ripedump', custom_data, all_asn=False)

The custom_data dictionary contains the prefixes and the corresponding AS paths that you want to add to the routes.

It's up to the user to decide whether to supply their whole router dump or only a part of it. If the user has a large router dump, it may be more efficient to process only a subset of the data that is relevant to their analysis. Alternatively, the user could use a tool such as 'bgpdumpy' to filter the data in their router dump before processing it.
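As one possible way to process only a subset, the sketch below filters already-extracted route lines rather than the raw dump, using the standard `ipaddress` module instead of bgpdumpy. It assumes lines in the `prefix ASN` form that the `all_asn=False` branch above writes; the function name `filter_routes` is hypothetical.

```python
import ipaddress


def filter_routes(lines, keep_networks):
    """Keep only route lines whose prefix lies inside one of keep_networks."""
    keep = [ipaddress.ip_network(n) for n in keep_networks]
    kept = []
    for line in lines:
        # Assumed line format: "<prefix> <ASN>", e.g. "10.1.0.0/16 AS64500".
        prefix = line.split()[0]
        net = ipaddress.ip_network(prefix, strict=False)
        # Compare only networks of the same IP version; subnet_of raises otherwise.
        if any(net.version == k.version and net.subnet_of(k) for k in keep):
            kept.append(line)
    return kept


# Illustrative data only.
lines = [
    "10.1.0.0/16 AS64500\n",
    "192.0.2.0/24 AS64501\n",
    "2001:db8::/32 AS64502\n",
]
print(filter_routes(lines, ["10.0.0.0/8"]))
```

Filtering after extraction is simpler to reason about, though it does not save the cost of reading the full dump in the first place.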

brunoerg commented 1 year ago

@Mistersx12 Good example, but I guess you're not considering that both files are in the same format. It can be a good start, though! :)