ratibor78 / geostat

GeoStat, a Python script for parsing Nginx and Apache log files and getting GEO data from incoming IPs.
MIT License

GeoStat

Version 2.3

GeoStat is a Python-based script that parses Nginx and Apache log files and looks up GEO data for the incoming IPs found in them. The script converts the parsed data into JSON and sends it to an InfluxDB database, so you can use it to build Grafana dashboards. It supports both the older InfluxDB 1.8 and the current InfluxDB 2. The application runs as a SystemD service and parses log files in "tailf" style. You can also run it as a Docker container.
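As a rough illustration of the first parsing step, the sketch below pulls the client address out of one access-log line. It assumes the default "common" or "combined" log layout, where the remote address is the first space-separated field; a custom log format would need different handling. The function name client_ip is hypothetical and is not taken from geoparse.py.

```python
import ipaddress
from typing import Optional


def client_ip(log_line: str) -> Optional[str]:
    """Return the first field of an access-log line if it is a valid IP.

    Assumption: the remote address is the first space-separated field,
    as in the default Nginx/Apache common and combined log formats.
    """
    first_field = log_line.split(" ", 1)[0]
    try:
        # ip_address() accepts both IPv4 and IPv6 and rejects anything else.
        return str(ipaddress.ip_address(first_field))
    except ValueError:
        return None
```

Lines that do not start with a valid address return None, so the caller can simply skip them before doing any GEO lookup.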

New in version 2.3

New in version 2.2

New in version 2.0

Main Features:

The JSON that the script sends to InfluxDB looks like this:

[
    {
        'fields': {
            'count': 1
        },
        'measurement': 'geo_cube',
        'tags': {
            'host': 'cube',
            'website': 'website.com',
            'geohash': 'u8mb76rpv69r',
            'country_code': 'UA',
            'country_name': 'Ukraine',
            'city_name': 'Odessa'
        }
    }
]
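A point of this shape is an ordinary Python dictionary, so it can be assembled and aggregated with the standard library alone. The sketch below is only an illustration of the structure and of how a constant count of 1 adds up; make_point and hits_per_country are hypothetical helper names, not functions from geoparse.py, and in practice the summing would normally be done by an InfluxDB query behind a Grafana panel.

```python
from collections import Counter
from typing import Dict, List


def make_point(host: str, website: str, geohash: str, country_code: str,
               country_name: str, city_name: str,
               measurement: str = "geo_cube") -> dict:
    """Build one point with the same keys as the structure shown above."""
    return {
        "measurement": measurement,
        "tags": {
            "host": host,
            "website": website,
            "geohash": geohash,
            "country_code": country_code,
            "country_name": country_name,
            "city_name": city_name,
        },
        # Every hit carries count == 1, so totals are plain sums.
        "fields": {"count": 1},
    }


def hits_per_country(points: List[dict]) -> Dict[str, int]:
    """Sum the count field per country_code tag."""
    totals: Counter = Counter()
    for point in points:
        totals[point["tags"]["country_code"]] += point["fields"]["count"]
    return dict(totals)
```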

As you can see, the JSON output carries six tags, so you can build dashboards using the geohash (as a point on the map), the country code, or the country name and city name. You can also build dashboards with variables based on the host tag, or combine them all. The count for every point equals 1, so summing is easy. The script does not parse the log file from the beginning; it parses new lines one by one after it starts. Dashboards based on count therefore become useful only after some data has accumulated.
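The "start at the end, then read new lines" behaviour can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used by geoparse.py: the function name follow and its parameters poll_interval and max_idle_polls are hypothetical, and log rotation and partially written lines are not handled.

```python
import os
import time
from typing import Iterator, Optional


def follow(path: str, poll_interval: float = 1.0,
           max_idle_polls: Optional[int] = None) -> Iterator[str]:
    """Return an iterator over lines appended to `path` after this call.

    The file is opened and positioned at its end immediately, so lines
    that already exist are never yielded. `max_idle_polls` stops the
    iterator after that many consecutive empty reads; None means run
    until interrupted.
    """
    handle = open(path, "r", encoding="utf-8", errors="replace")
    handle.seek(0, os.SEEK_END)  # skip everything already in the file

    def new_lines() -> Iterator[str]:
        idle_polls = 0
        with handle:
            while max_idle_polls is None or idle_polls < max_idle_polls:
                line = handle.readline()
                if line:
                    idle_polls = 0
                    yield line.rstrip("\n")
                else:
                    # Nothing new yet: wait before polling again.
                    idle_polls += 1
                    time.sleep(poll_interval)

    return new_lines()
```

A caller would loop over follow(path) with the access.log path from its own configuration and handle each yielded line as it arrives.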

You can find an example Grafana dashboard in the geomap.json file or get it from grafana.com: https://grafana.com/dashboards/8342

Tech

GeoStat uses a number of open source libs to work properly:

Important

The GeoLite2-City database is no longer available for direct download; you now need to register on the maxmind.com website first. Once you have a maxmind.com account, you can find the required file at this link:

(https://www.maxmind.com/en/accounts/YOURACCOUNTID/geoip/downloads)

Don't forget to unzip the archive and put the GeoLite2-City.mmdb file in the same directory as the geoparse.py script, or put it anywhere else and set the path in settings.ini.
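The two placement options can be pictured with a small configparser sketch: a relative path is resolved against the script directory, while an absolute path is used as given. The section name GEOIP and the option name geoipdb are assumptions made for this illustration; check the settings.ini shipped with the repository for the real names. The helper resolve_mmdb_path is likewise hypothetical.

```python
import configparser
from pathlib import PurePosixPath


def resolve_mmdb_path(ini_text: str, script_dir: str) -> PurePosixPath:
    """Work out where the GeoLite2-City.mmdb file is expected to be.

    Assumption: the path lives in a [GEOIP] section under the option
    `geoipdb`; both names are placeholders for this sketch.
    """
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    configured = parser.get("GEOIP", "geoipdb",
                            fallback="GeoLite2-City.mmdb")
    path = PurePosixPath(configured)
    # Relative paths are taken relative to the script's directory.
    return path if path.is_absolute() else PurePosixPath(script_dir) / path
```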

Installation

You can install it in a few ways:

Using install.sh script: 1) Clone the repository.
2) Change into the directory and run install.sh; it will ask you to set the settings.ini parameters, such as the Nginx/Apache access.log path and the InfluxDB settings.
3) After the script finishes installing the application, copy the GeoLite2-City.mmdb file into the application's local directory and start the SystemD service with systemctl start geostat.service.

Manually: 1) Clone the repository, create a virtual environment, and install the requirements

$ cd geostat
$ python3 -m venv venv && source venv/bin/activate
$ pip3 install -r requirements.txt

2) Edit the settings.ini and geostat.service files and copy the service file to systemd.

$ cp settings.ini.back settings.ini
$ vi settings.ini
$ cp geostat.service.template geostat.service
$ vi geostat.service
$ cp geostat.service /lib/systemd/system/

3) Register on maxmind.com and download the latest GeoLite2-City.mmdb file

$ cp ./any_path/GeoLite2-City.mmdb ./

4) Then enable and start the service

$ systemctl enable geostat.service
$ systemctl start geostat.service

Using Docker image: 1) Build the Docker image using the Dockerfile inside the geostat repository directory:

$ docker build -t some-name/geostat .

2) Register on maxmind.com and download the latest GeoLite2-City.mmdb file

$ cp ./any_path/GeoLite2-City.mmdb ./

3) After the Docker image is built, you can run it with a properly edited settings.ini file. You also need to mount the Nginx/Apache log file inside the container:

$ docker run -d --name geostat -v /opt/geostat/settings.ini:/settings.ini -v /var/log/nginx_access.log:/var/log/nginx_access.log some-name/geostat

After the first metrics reach InfluxDB, you can create dashboards in Grafana.

Have fun!

License

MIT

Free Software, Hell Yeah!