allinurl / goaccess

GoAccess is a real-time web log analyzer and interactive viewer that runs in a terminal in *nix systems or through your browser.
https://goaccess.io
MIT License

OpenLiteSpeed Common log format not being detected #2117

Closed VentGrey closed 3 years ago

VentGrey commented 3 years ago

Hello and thanks for reading my issue :D

I've been having trouble reading my virtual host access log for analysis. I've scoured the web in search of answers, but I could not find a forum for this tool.

I changed my virtual host log format from NCSA extended/combined log format to Common Log Format (CLF), since my results didn't show any unique visitors, 404s, time distributions, or even HTTP status codes. With the NCSA extended/combined format I was using this command:

sudo goaccess vhost.access_log --log-format='"%h %l %u [%d:%T] "%r" %>s %b "%{Referer}i" "%{User-Agent}i" "%{Host}i""' --date-format=%d/%b/%Y --time-format=%T

Now, when trying to parse Common Log Format, things get weird. The OpenLiteSpeed documentation shows the exact same expression as the Apache documentation: "%h %l %u %t \"%r\" %>s %b" (note the \ used to escape the quotation marks).

But when trying to load the new common log format file I get this error:

Parsed 10 lines producing the following errors:

Token 'example.com' doesn't match specifier '%h'
Token 'example.com' doesn't match specifier '%h'

My log file looks like this (CLF):

"example.com 000.000.000.00 - - [21/May/2021:02:36:14 +0000] "GET / HTTP/2" 200 10540"

I'm using this command, which doesn't work:

sudo goaccess vhost.access_log --log-format='"%h %l %u %t "%r" %>s %b"' --date-format=%d/%b/:Y --time-format=%T

What am I doing wrong here? I've checked the documentation and tried a few variants of the log format, but no success yet :( Is there any way to solve the missing results issue, or am I just getting the log format wrong?

Thanks in advance for your response :)

allinurl commented 3 years ago

Hello,

Assuming no quotes around each line, then this should work:

sudo goaccess vhost.access_log --log-format='%v %h %^[%d:%t %^] "%r" %s %b' --date-format=%d/%b/%Y --time-format=%T

with quotes:

sudo goaccess vhost.access_log --log-format='"%v %h %^[%d:%t %^] "%r" %s %b"' --date-format=%d/%b/%Y --time-format=%T

VentGrey commented 3 years ago

The first command worked like a charm. I assume I missed the %v specifier and the %^ as well. Thanks a lot for helping me solve this issue :) I think I should have read the documentation again to catch those two.
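
For readers wondering how the pieces of that format line up with the log line, here is a rough Python sketch that mirrors '%v %h %^[%d:%t %^] "%r" %s %b' as a regular expression. It is only an illustration, not how GoAccess parses internally. The names LINE_RE and sample are made up for this example, and the IP 203.0.113.7 is a placeholder standing in for the redacted address in the original post.

```python
import re
from datetime import datetime

# Rough regex equivalent of: %v %h %^[%d:%t %^] "%r" %s %b
# Illustration only; GoAccess does not use this regex.
LINE_RE = re.compile(
    r'(?P<vhost>\S+) '        # %v   virtual host
    r'(?P<host>\S+) '         # %h   client IP
    r'[^\[]*'                 # %^   skip ident and user ("- - ")
    r'\[(?P<date>[^:]+):'     # [%d: date, up to the first colon
    r'(?P<time>\S+) '         # %t   time
    r'[^\]]*\] '              # %^]  skip the "+0000" offset
    r'"(?P<request>[^"]*)" '  # "%r" request line
    r'(?P<status>\d{3}) '     # %s   status code
    r'(?P<size>\d+|-)'        # %b   response size
)

# Sample line from the thread without the outer quotes; the IP is a placeholder.
sample = 'example.com 203.0.113.7 - - [21/May/2021:02:36:14 +0000] "GET / HTTP/2" 200 10540'

match = LINE_RE.match(sample)
fields = match.groupdict() if match else {}

# Mirrors --date-format=%d/%b/%Y and --time-format=%T (%T means %H:%M:%S).
timestamp = datetime.strptime(
    fields["date"] + " " + fields["time"], "%d/%b/%Y %H:%M:%S"
)
print(fields["vhost"], fields["host"], fields["status"], timestamp.isoformat())
```

The point of the sketch is that the virtual host comes first, so a format starting with %h tries to read 'example.com' as the client address, and the two %^ tokens skip the "- -" fields and the timezone offset.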

allinurl commented 3 years ago

Glad that solved the issue :)

Feel free to reopen it if needed.

VentGrey commented 3 years ago

My server logs changed (without prior notice, idk if this came with an OLS update) and now look like this:

"<ip-addr> - - [12/Jun/2021:06:10:42 +0000] "GET /route/ HTTP/1.1" 301 0"

I've tried modifying the commands above, but I get stuck at parsing the date with this error :(

Token '-' doesn't match specifier '%h'

Should I include the -'s in the log format?

VentGrey commented 3 years ago

Fixed it. Here is the solution in case someone else runs into the same problem:

--log-format='"%h - - %^[%d:%t %^] "%r" %s %b"' --date-format=%d/%b/%Y --time-format=%T
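
Since the log layout changed between the two posts (with and without a leading virtual host), a small helper could pick the matching format from a sample line. The following is a minimal sketch under stated assumptions: pick_log_format and build_command are hypothetical helpers, not part of GoAccess, the two format strings are the quoted-line variants from this thread, and the IP 203.0.113.7 is a placeholder. It only builds the command string and does not run anything.

```python
import ipaddress
import shlex

# Format strings copied from this thread (variants for quoted log lines).
FORMAT_WITH_VHOST = '"%v %h %^[%d:%t %^] "%r" %s %b"'
FORMAT_WITHOUT_VHOST = '"%h - - %^[%d:%t %^] "%r" %s %b"'


def is_ip(token: str) -> bool:
    """Return True if token parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(token)
        return True
    except ValueError:
        return False


def pick_log_format(sample_line: str) -> str:
    """Hypothetical helper: choose a format from the first field of a line."""
    first_field = sample_line.lstrip('"').split(" ", 1)[0]
    return FORMAT_WITHOUT_VHOST if is_ip(first_field) else FORMAT_WITH_VHOST


def build_command(log_path: str, sample_line: str) -> str:
    """Hypothetical helper: assemble a shell-safe goaccess command string."""
    args = [
        "goaccess",
        log_path,
        "--log-format=" + pick_log_format(sample_line),
        "--date-format=%d/%b/%Y",
        "--time-format=%T",
    ]
    return " ".join(shlex.quote(arg) for arg in args)


# Sample lines shaped like the ones in the thread; the IPs are placeholders.
old_line = '"example.com 203.0.113.7 - - [21/May/2021:02:36:14 +0000] "GET / HTTP/2" 200 10540"'
new_line = '"203.0.113.7 - - [12/Jun/2021:06:10:42 +0000] "GET /route/ HTTP/1.1" 301 0"'

print(build_command("vhost.access_log", new_line))
```

Checking only the first field keeps the sketch short; a log that mixes both layouts would need per-line handling, which this does not attempt.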