💡 Summary
I believe it would be beneficial to store IP geolocation information in its own collection in the database.
Motivation and context
We currently rely on a MaxMind GeoIP2 database being available anywhere we use cisagov/cyhy-core to provide IP geolocation data as needed. This data is read when adding hosts (with cyhy-ip or cyhy-tool) or when manually updating host geolocation data (with cyhy-geoip). This results in inconsistent data, since the result depends on how old the GeoIP2 database is for whoever is running the command, as well as stale data, since the geolocation recorded when an IP was added is not updated unless specifically requested.
If we store IP geolocation data in its own collection, we can pull from that collection whenever the data is needed. Because the collection is centralized, we can update it on a regular cadence in line with the source updates (MaxMind updates the databases every Tuesday and Friday).
Implementation notes
A new collection schema will need to be added to cisagov/cyhy-core. A script or Lambda will need to be created to populate this collection. We will need to use the CSV version of the database, which groups addresses under CIDR network blocks. I believe we will need to use the maxmind/geoip2-csv-converter tool to convert the CSV to a usable format for querying, such as the IP ranges the tool can emit for each CIDR block.
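To illustrate why a range form is easier to query than raw CIDR blocks, here is a minimal sketch using only the Python standard library. It is not the proposed cyhy-core schema: the field names (start, end, country_iso_code), the helper names, and the sample rows are all assumptions made for illustration, and it handles IPv4 only. In the real collection the same "largest start that is not above the address, then check end" lookup would presumably be an indexed query rather than an in-memory bisect.

```python
import bisect
import ipaddress


def cidr_to_document(network, country_iso_code):
    """Turn one CIDR block into a range document (field names are hypothetical)."""
    net = ipaddress.ip_network(network)
    return {
        "network": str(net),
        "start": int(net.network_address),  # first address as an integer
        "end": int(net.broadcast_address),  # last address as an integer
        "country_iso_code": country_iso_code,
    }


def find_block(documents, starts, ip):
    """Return the document whose range contains ip, or None if no block matches."""
    value = int(ipaddress.ip_address(ip))
    # Locate the last block whose start is <= the address.
    index = bisect.bisect_right(starts, value) - 1
    if index >= 0 and documents[index]["end"] >= value:
        return documents[index]
    return None


# Illustrative rows only: these are documentation ranges, not MaxMind data.
rows = [("192.0.2.0/24", "US"), ("198.51.100.0/24", "CA")]
documents = sorted(
    (cidr_to_document(network, code) for network, code in rows),
    key=lambda doc: doc["start"],
)
starts = [doc["start"] for doc in documents]
```

With data in this shape, find_block(documents, starts, "192.0.2.55") returns the first document, and an address outside every block returns None.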
Acceptance criteria
How do we know when this work is done?