allinurl / goaccess

GoAccess is a real-time web log analyzer and interactive viewer that runs in a terminal in *nix systems or through your browser.
https://goaccess.io
MIT License
18.61k stars 1.11k forks

Feature Request: Alias url paths #1288

Open anthonysomerset opened 6 years ago

anthonysomerset commented 6 years ago

We are running goaccess to display stats for our public mirror server: https://stats.mirror.liquidtelecom.net

For a few directories, e.g. Videolan, we also serve the subfolder via a dedicated subdomain, e.g.

http://videolan.mirror.liquidtelecom.com/ is equivalent to http://mirror.liquidtelecom.com/videolan/

Our access logs are combined, and we see the different vhosts in that stats panel, but we get what are effectively duplicate entries in the static files section:

[Screenshot, 2018-10-31 17:12:36]

It would be great if we could define a list of alias paths in the config file that are merged at processing time, e.g.

[aliases]
/videolan/vlc/ = /vlc/,/some/other/vlc/

goaccess would then merge all of those requests into one URL path, in the case above /videolan/vlc/
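
Until something like that exists, the same merge could be approximated by rewriting the paths before goaccess reads the log. The following is only a rough sketch, not a goaccess feature: `merge_aliases` is a made-up helper name, and it assumes a plain combined log with no leading vhost column, so that the request path is whitespace field 7. If the vhost is logged first, the path would be field 8 instead.

```shell
# Hypothetical helper: rewrite alias prefixes to one canonical prefix.
# Assumption: combined log format WITHOUT a leading vhost, so $7 is the path.
merge_aliases() {
  awk '
    BEGIN {
      canonical = "/videolan/vlc/"
      # Alias prefixes taken from the example in the request above.
      n = split("/vlc/,/some/other/vlc/", aliases, ",")
    }
    {
      for (i = 1; i <= n; i++) {
        # index() == 1 means the path starts with this alias prefix.
        if (index($7, aliases[i]) == 1) {
          $7 = canonical substr($7, length(aliases[i]) + 1)
          break
        }
      }
      print
    }
  '
}

# Possible usage (not run here):
#   merge_aliases < access.log | goaccess -a --log-format=COMBINED -
```

Note that this rewrites every request whose path starts with an alias prefix, regardless of which vhost served it, so it only fits if those prefixes are not used for anything else.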

allinurl commented 6 years ago

Thanks for the suggestion! A workaround that may do what you are looking for is the following.

It assumes your log contains the virtual host field, for instance:

vhost.com:80 10.131.40.139 - - [02/Mar/2016:08:14:04 -0600] "GET /shop/bag-p-20 HTTP/1.1" 200 6715 "-" "Apache (internal dummy connection)"

And that you would like to prepend the virtual host to the request, in order to see which virtual host the top URLs belong to:

awk '$8=$1$8' access.log | goaccess -a --log-format=VCOMBINED -
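
To make the effect of that awk expression concrete, here it is applied to the sample line above. This is just an illustration; `prefix_vhost` is a made-up wrapper name. With whitespace-separated fields, `$1` is the vhost and `$8` is the request path, so the assignment glues the vhost onto the front of the path, and awk prints the rebuilt line because the assigned value is a non-empty string.

```shell
# Hypothetical wrapper around the awk one-liner from the comment above.
prefix_vhost() {
  awk '$8 = $1 $8'
}

line='vhost.com:80 10.131.40.139 - - [02/Mar/2016:08:14:04 -0600] "GET /shop/bag-p-20 HTTP/1.1" 200 6715 "-" "Apache (internal dummy connection)"'

# The request in the printed line becomes: "GET vhost.com:80/shop/bag-p-20 HTTP/1.1"
printf '%s\n' "$line" | prefix_vhost
```

The vhost column itself stays in place, which is why the log is still fed to goaccess as VCOMBINED.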