intelowlproject / IntelOwl

IntelOwl: manage your Threat Intelligence at scale
https://intelowlproject.github.io
GNU Affero General Public License v3.0

Maxmind analyzer adjustment #2162

Closed mlodic closed 5 months ago

mlodic commented 8 months ago

We need to add the download of the ASN database too. Right now we only download the City and Country DBs.

See https://dev.maxmind.com/geoip/geolite2-free-geolocation-data for more info

Plus, we should change the old URLs that we use to download the DBs. There is now an easier way to download them; see https://dev.maxmind.com/geoip/updating-databases#directly-downloading-databases

Info about these updates: https://support.maxmind.com/hc/en-us/articles/4408216129947-Download-and-Update-Databases

Screenshot from the user page: [image]

mlodic commented 8 months ago

It would also make sense to move from the library we currently use to the official one supported by MaxMind: https://pypi.org/project/geoip2/

error9098x commented 8 months ago

Hi @mlodic, I wanted to confirm that the newer direct download method requires HTTP Basic Authentication, which is different from the old method. My approach is to use the requests module in Python to get the direct download working; please tell me if I am wrong.

I created a repl to try this: https://replit.com/@Aviral1010/PristineAgonizingScriptinglanguages#main.py

import os

import requests

# Credentials are read from environment variables.
ACCOUNT_ID = os.environ["Account_ID"]
LICENSE_KEY = os.environ["License_Key"]

# The three GeoLite2 databases to fetch.
db_names = ["GeoLite2-Country", "GeoLite2-City", "GeoLite2-ASN"]

for db in db_names:
    download_url = f"https://download.maxmind.com/geoip/databases/{db}/download?suffix=tar.gz"
    # The direct download endpoint uses HTTP Basic Authentication (account ID + license key).
    response = requests.get(download_url, auth=(ACCOUNT_ID, LICENSE_KEY), timeout=60)
    if response.ok:
        filename = f"{db}.tar.gz"
        with open(filename, "wb") as file:
            file.write(response.content)
        print(f"Download completed: {filename}")
    else:
        print(f"Failed to download {db}: HTTP {response.status_code}")
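
The snippet above stops at saving the `.tar.gz` archive, while the analyzer ultimately needs the `.mmdb` file inside it. A minimal sketch of that unpacking step follows, using only the standard library. The helper names `build_download_url` and `extract_mmdb` are hypothetical, and the assumption that each archive contains a single `.mmdb` file inside a dated directory should be checked against a real download.

```python
import io
import tarfile
from typing import Optional, Tuple

# URL pattern taken from the snippet above.
BASE_URL = "https://download.maxmind.com/geoip/databases"


def build_download_url(edition: str) -> str:
    """Build the direct download URL for an edition such as 'GeoLite2-ASN'."""
    return f"{BASE_URL}/{edition}/download?suffix=tar.gz"


def extract_mmdb(archive_bytes: bytes) -> Optional[Tuple[str, bytes]]:
    """Return (file name, content) of the first .mmdb file in a tar.gz archive.

    Assumption: the archive wraps the database in a directory, so only the
    final path component is kept as the file name.
    """
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as tar:
        for member in tar.getmembers():
            if member.isfile() and member.name.endswith(".mmdb"):
                extracted = tar.extractfile(member)
                if extracted is not None:
                    return member.name.rsplit("/", 1)[-1], extracted.read()
    return None
```

With this, `response.content` from the download loop could be passed to `extract_mmdb` and the returned bytes written to the local database directory, instead of keeping the whole archive on disk.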

mlodic commented 8 months ago

yeah that can work but I would maybe opt to integrate their library to be sure that these methods won't change again under our nose if possible. It's easier for maintainability I think

error9098x commented 8 months ago

Yes, I was looking into their library. I am currently reading the documentation and looking for a way to integrate it with the code. If it works, it will definitely be better.

error9098x commented 8 months ago

> yeah that can work but I would maybe opt to integrate their library to be sure that these methods won't change again under our nose if possible. It's easier for maintainability I think

I have noticed that the new method requires an 'account_id' in addition to the 'license_key', but I am uncertain about how to obtain the 'account_id'. The geoip2 library's web service works only for the 'city' and 'country' databases. Unfortunately, for the 'asn' database, we must download it and then access it using geoip2.database. Therefore, I believe we can update our approach for 'city' and 'country' lookups, but we must keep downloading the database for the 'asn' data.
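
For reference, a hedged sketch of what reading the downloaded ASN database through `geoip2.database` could look like. `open_asn_reader` and `lookup_asn` are hypothetical helper names, and the database path is whatever location the analyzer stores the file in; this is not the code that IntelOwl uses.

```python
from typing import Any, Dict


def open_asn_reader(db_path: str) -> Any:
    """Open a local GeoLite2-ASN.mmdb file with the official geoip2 package."""
    # Imported lazily so the lookup helper below can be exercised without the
    # third-party dependency (pip install geoip2).
    import geoip2.database

    return geoip2.database.Reader(db_path)


def lookup_asn(reader: Any, ip: str) -> Dict[str, Any]:
    """Return the ASN fields for an IP address.

    `reader` is expected to behave like geoip2.database.Reader, whose asn()
    method returns a model exposing the two attributes read below.
    """
    response = reader.asn(ip)
    return {
        "asn": response.autonomous_system_number,
        "organization": response.autonomous_system_organization,
    }
```

Note that `Reader.asn()` raises `geoip2.errors.AddressNotFoundError` when the address is not in the database, so a caller would likely want to catch that and report an empty result.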

christophermluna commented 8 months ago

hey there! this is christopher luna, i'm the product manager at MaxMind who handles GeoIP stuff. just wanted to chime in to let y'all know that existing integration methods will still work to download databases. even though we have new endpoints and examples in our developer documentation, i don't think it's necessary to change the endpoint you're currently using.

are you updating your method because you're concerned about the new r2 presigned URLs that we're redirecting to?

mlodic commented 8 months ago

hey @christophermluna thanks for helping. Quite frankly, the integration that we have in IntelOwl with Maxmind is quite old and could use a refactor anyway. That was the main reason. I have not found any issue in production environments yet. Since you are in this thread, can we tag you in case we need clarifications? thanks again

mlodic commented 8 months ago

@error9098x I have now noticed that the library provides access to the web services, which are limited to the country and city DBs, as you mentioned. Because of this, I think we could just download the databases and keep them locally as we already do, add the ASN database, and update the endpoints to the new ones.

christophermluna commented 8 months ago

@mlodic sure thing! if y'all run into any issues or questions, i'm happy to help.

just want to let you know that if changing the endpoint creates downstream issues for your users there would be no need to do so.

the most recent changes around download links are:

please do feel free to ping me if y'all have any questions or confusion

mlodic commented 8 months ago

really appreciated the clarification!

> please do feel free to ping me if y'all have any questions or confusion

sure! :)

error9098x commented 8 months ago

> @error9098x I have now noticed that the library provides access to the web services, which are limited to the country and city DBs, as you mentioned. Because of this, I think we could just download the databases and keep them locally as we already do, add the ASN database, and update the endpoints to the new ones.

Sure. Just to clarify: changing the existing endpoint requires account_id as an additional parameter. I am thinking of adding a _get_account_id function similar to the _get_api_key function, which I think returns the license key. Is that the correct approach?

    @classmethod
    def _get_api_key(cls) -> Optional[str]:
        # Return the first non-empty secret stored under "api_key_name".
        for plugin in PluginConfig.objects.filter(
            parameter__python_module=cls.python_module,
            parameter__is_secret=True,
            parameter__name="api_key_name",
        ):
            if plugin.value:
                return plugin.value
        return None

    @classmethod
    def _get_account_id(cls) -> Optional[str]:
        # Proposed counterpart: the same lookup for the "account_id_name" secret.
        for plugin in PluginConfig.objects.filter(
            parameter__python_module=cls.python_module,
            parameter__is_secret=True,
            parameter__name="account_id_name",
        ):
            if plugin.value:
                return plugin.value
        return None
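
Since the two methods differ only in the parameter name, one option would be a single lookup parametrised by that name. The sketch below illustrates the idea outside Django: `StoredSecret` is a hypothetical in-memory stand-in for `PluginConfig` rows, `get_secret` is a hypothetical helper, and the module string is a placeholder, so none of this is IntelOwl's actual API.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class StoredSecret:
    """Hypothetical stand-in for a PluginConfig row and its parameter."""

    python_module: str
    name: str
    is_secret: bool
    value: Optional[str]


def get_secret(
    configs: Iterable[StoredSecret], python_module: str, name: str
) -> Optional[str]:
    """Return the first non-empty secret with the given name for a module."""
    for config in configs:
        if (
            config.python_module == python_module
            and config.is_secret
            and config.name == name
            and config.value
        ):
            return config.value
    return None


# Usage: both credentials go through the same helper.
rows = [
    StoredSecret("maxmind.Maxmind", "api_key_name", True, "license-key"),
    StoredSecret("maxmind.Maxmind", "account_id_name", True, "123456"),
]
api_key = get_secret(rows, "maxmind.Maxmind", "api_key_name")
account_id = get_secret(rows, "maxmind.Maxmind", "account_id_name")
```

In the real analyzer the filtering would stay in the `PluginConfig.objects.filter(...)` query; only the `parameter__name` value would become an argument.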

mlodic commented 5 months ago

solved by #2282