allinurl / goaccess

GoAccess is a real-time web log analyzer and interactive viewer that runs in a terminal in *nix systems or through your browser.
https://goaccess.io
MIT License

Cloudflare Argo Tunnel - Custom Log Format #2111

Closed by UpperCenter 3 years ago

UpperCenter commented 3 years ago

Hi all,

I've been using GoAccess for a few years now, and have recently started using Cloudflare Argo tunnels as part of a more positive security model.

Using the Argo tunnel breaks a few things, most notably that all inbound connections appear to come from localhost. I've fixed this on the nginx side with a custom log format, but I'm not sure how to translate that into a GoAccess log format. I use real-time HTML reports.

Log format:

# Custom Argo Tunnel Logging
log_format ArgoCustom '$http_x_forwarded_for $remote_user [$time_local] $request '
'"$status" $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_cf_ipcountry" ';

Example Log:

1.2.3.4 - [16/May/2021:08:56:48 -0400] GET /images/icons/discord.png HTTP/1.1 "200" 885 "https://example.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:88.0) Gecko/20100101 Firefox/88.0" "GB"
1.2.3.4 - [16/May/2021:08:56:48 -0400] GET /images/eggs/cactusEgg.png HTTP/1.1 "200" 339 "https://example.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:88.0) Gecko/20100101 Firefox/88.0" "GB"
1.2.3.4 - [16/May/2021:08:56:48 -0400] GET /images/icons/background2.png HTTP/1.1 "200" 1824900 "https://example.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:88.0) Gecko/20100101 Firefox/88.0" "GB"
1.2.3.4 - [16/May/2021:08:56:48 -0400] GET /images/icons/thecastles.png HTTP/1.1 "200" 129575 "https://example.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:88.0) Gecko/20100101 Firefox/88.0" "GB"
1.2.3.4 - [16/May/2021:08:56:48 -0400] GET /images/icons/patreon.png HTTP/1.1 "200" 736 "https://example.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:88.0) Gecko/20100101 Firefox/88.0" "GB"

I currently run GoAccess as a service using:

ExecStart=goaccess -f /var/log/nginx/example.com.access.log \
          --real-time-html --ws-url=wss://rt.example.com:443/ws \
          -o /srv/FoM/Reports/report.html --port=7890 \
          --log-format=COMBINED \
          --origin=https://rt.example.com

Any pointers would be appreciated. Cheers

allinurl commented 3 years ago

This should do it:

goaccess access.log --log-format='%h %^[%d:%t %^] %m %U %H "%s" %b "%R" "%u" %^' --date-format=%d/%b/%Y --time-format=%T

You could try enclosing the request in quotes in case the request is not properly encoded. So instead of looking like this:

GET /images/icons/discord.png HTTP/1.1 

it would look like

"GET /images/icons/discord.png HTTP/1.1"

If you do that, then you would use this format:

goaccess access.log --log-format='%h %^[%d:%t %^] "%r" "%s" %b "%R" "%u" %^' --date-format=%d/%b/%Y --time-format=%T
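To sanity-check a format string like the one above, it can help to spell out which part of a log line each specifier is meant to consume. The following is a minimal Python sketch, not GoAccess's actual parser: `LINE_RE` and `parse_line` are hypothetical names, and the regex only approximates the quoted-request format.

```python
import re

# Rough regex counterpart of the GoAccess format
#   %h %^[%d:%t %^] "%r" "%s" %b "%R" "%u" %^
# Illustration only; GoAccess itself does not use this regex.
LINE_RE = re.compile(
    r'(?P<host>\S+) '                      # %h   client address
    r'[^\[]*'                              # %^   skipped ($remote_user)
    r'\[(?P<date>[^:]+):(?P<time>\S+) '    # [%d:%t
    r'[^\]]+\] '                           # %^]  skipped timezone
    r'"(?P<request>[^"]*)" '               # "%r" quoted request line
    r'"(?P<status>\d{3})" '                # "%s" quoted status code
    r'(?P<size>\d+) '                      # %b   response size in bytes
    r'"(?P<referrer>[^"]*)" '              # "%R" referrer
    r'"(?P<agent>[^"]*)"'                  # "%u" user agent; the rest is ignored (%^)
)


def parse_line(line):
    """Return the captured fields as a dict, or None if the line does not fit."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None
```

A line with an unquoted request makes `parse_line` return `None`, which is the regex analogue of a token failing to match a specifier.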

UpperCenter commented 3 years ago

Thanks for that, I missed the request format error.

I'm able to run:

goaccess -f /var/log/nginx/example.com.access.log \
          --real-time-html --ws-url=wss://rt.example.com:443/ws \
          -o /srv/FoM/Reports/report.html --port=7890 \
          --log-format='%h %^[%d:%t %^] "%r" "%s" %b "%R" "%u" %^' --date-format=%d/%b/%Y --time-format=%T \
          --origin=https://rt.example.com

in the terminal with no issues, but when running it as a service I get the error below. Is there a flag I'm missing?

May 16 10:24:50 example goaccess[544]:  [PARSING /var/log/nginx/example.com.access.log] {0} @ {0/s}
May 16 10:24:50 example goaccess[544]: ==544== GoAccess - Copyright (C) 2009-2020 by Gerardo Orellana
May 16 10:24:50 example goaccess[544]: ==544== https://goaccess.io - <hello@goaccess.io>
May 16 10:24:50 example goaccess[544]: ==544== Released under the MIT License.
May 16 10:24:50 example goaccess[544]: ==544==
May 16 10:24:50 example goaccess[544]: ==544== FILE: /var/log/nginx/example.com.access.log
May 16 10:24:50 example goaccess[544]: ==544== Parsed 10 lines producing the following errors:
May 16 10:24:50 example goaccess[544]: ==544==
May 16 10:24:50 example goaccess[544]: ==544== Token '00"' doesn't match specifier '%s'
May 16 10:24:50 example goaccess[544]: ==544== Token '00"' doesn't match specifier '%s'
May 16 10:24:50 example goaccess[544]: ==544== Token '00"' doesn't match specifier '%s'
May 16 10:24:50 example goaccess[544]: ==544== Token '00"' doesn't match specifier '%s'
May 16 10:24:50 example goaccess[544]: ==544== Token '00"' doesn't match specifier '%s'
May 16 10:24:50 example goaccess[544]: ==544== Token '00"' doesn't match specifier '%s'
May 16 10:24:50 example goaccess[544]: ==544== Token '00"' doesn't match specifier '%s'
May 16 10:24:50 example goaccess[544]: ==544== Token '00"' doesn't match specifier '%s'
May 16 10:24:50 example goaccess[544]: ==544== Token '00"' doesn't match specifier '%s'
May 16 10:24:50 example goaccess[544]: ==544== Token '00"' doesn't match specifier '%s'
May 16 10:24:50 example goaccess[544]: ==544==
May 16 10:24:50 example goaccess[544]: ==544== Format Errors - Verify your log/date/time format
May 16 10:24:50 example systemd[1]: goaccess.service: Main process exited, code=exited, status=1/FAILURE

allinurl commented 3 years ago

Did you change the request to be within quotes? Seems like it's parsing lines that may not be within quotes. If you did and those are old lines, you could try using --num-tests=0

UpperCenter commented 3 years ago

I did change the request to be within quotes, logs now look like this:

1.2.3.4 - [16/May/2021:13:26:28 -0400] "GET /image/zefk7.png?1621185985 HTTP/1.1" "200" 1486 "https://example.com/hold" "Mozilla/5.0 (Linux; Android 10; SM-T290) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.210 Safari/537.36" "US"
1.2.3.4 - [16/May/2021:13:26:28 -0400] "GET /image/LDGp8.png?1621185985 HTTP/1.1" "200" 1001 "https://example.com/hold" "Mozilla/5.0 (Linux; Android 10; SM-T290) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.210 Safari/537.36" "US"
1.2.3.4 - [16/May/2021:13:26:28 -0400] "GET /image/HNIvt.png?1621185984 HTTP/1.1" "200" 1001 "https://example.com/hold" "Mozilla/5.0 (Linux; Android 10; SM-T290) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.210 Safari/537.36" "US"
1.2.3.4 - [16/May/2021:13:26:28 -0400] "GET /image/ULB2P.png?1621185985 HTTP/1.1" "200" 1001 "https://example.com/hold" "Mozilla/5.0 (Linux; Android 10; SM-T290) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.210 Safari/537.36" "US"
1.2.3.4 - [16/May/2021:13:26:28 -0400] "GET /image/iFqIu.png?1621185984 HTTP/1.1" "200" 1001 "https://example.com/hold" "Mozilla/5.0 (Linux; Android 10; SM-T290) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.210 Safari/537.36" "US"

Edit: It might be because I missed the quotes surrounding $body_bytes_sent

allinurl commented 3 years ago

For those five lines, this works for me:

goaccess access.log --log-format='%h %^[%d:%t %^] "%r" "%s" %b "%R" "%u" %^' --date-format=%d/%b/%Y --time-format=%T

UpperCenter commented 3 years ago

It's odd, I only get the issue when trying to restart the systemd service.

Running:

goaccess -f /var/log/nginx/example.com.access.log \
            --real-time-html --ws-url=wss://rt.example.com:443/ws \
            --exclude-ip=1.2.3.4 --exclude-ip=5.6.7.8 \
            -o /srv/FoM/Reports/report.html --port=7890 \
            --log-format='%h %^[%d:%t %^] "%r" "%s" %b "%R" "%u" %^' --date-format=%d/%b/%Y --time-format=%T \
            --origin=https://rt.example.com --daemonize

works fine, but as soon as I use that string in my service,

[Unit]
Description=GoAccess real-time web log analysis
After=network.target

[Service]
Type=simple

ExecStart=goaccess -f /var/log/nginx/example.com.access.log \
            --real-time-html --ws-url=wss://rt.example.com:443/ws \
            --exclude-ip=1.2.3.4 --exclude-ip=5.6.7.8 \
            -o /srv/FoM/Reports/report.html --port=7890 \
            --log-format='%h %^[%d:%t %^] "%r" "%s" %b "%R" "%u" %^' --date-format=%d/%b/%Y --time-format=%T \
            --origin=https://rt.example.com --daemonize

ExecStop=/bin/kill -9 ${MAINPID}
WorkingDirectory=/tmp

[Install]
WantedBy=multi-user.target

it doesn't work anymore and fails with the Token '00"' doesn't match specifier '%s' error.

allinurl commented 3 years ago

Have you upgraded to a different version, or is it possible you have two versions installed? Can you please try using the full path? e.g., /usr/local/bin/goaccess or wherever the path/location may be.

UpperCenter commented 3 years ago

I've only got one version installed, from the https://deb.goaccess.io repo. It's in /usr/bin/goaccess. It didn't make a difference though. I'm using GoAccess - 1.4.6.

allinurl commented 3 years ago

Odd. Somehow it sounds like it's trying to parse a mixed log, thus the error matching the 00" referring to "200". You could try using --num-tests=0 and see if that helps.
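One way to check whether the file really is a mixed log is to count how many lines have a quoted request and how many do not. This is only a sketch: `count_request_styles` is a hypothetical helper, and it assumes the timestamp is the first bracketed field on each line, as in the samples in this thread.

```python
def count_request_styles(lines):
    """Count lines whose request is quoted vs. unquoted.

    Assumes the request follows the first '] ' (the end of the timestamp).
    """
    quoted = unquoted = 0
    for line in lines:
        _, sep, rest = line.partition("] ")
        if not sep:
            continue  # no timestamp found; not a line we can classify
        if rest.startswith('"'):
            quoted += 1
        else:
            unquoted += 1
    return quoted, unquoted
```

Passing an open file object for the access log works, since the function only iterates over lines. If both counts are non-zero, the file contains lines written before and after the nginx format change.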

UpperCenter commented 3 years ago

Thanks, I tried that and get:

/etc/systemd/system/goaccess.service:13: Failed to resolve unit specifiers in --log-format=%h %^[%d:%t %^] "%r" "%s" %b "%R" "%u" %^: Invalid slot

allinurl commented 3 years ago

To be honest, this sounds like an issue isolated to your environment: there are no "Invalid slot" or "Failed to resolve" error messages anywhere in GoAccess' code base. I'd check for possible issues with systemd.
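For background: systemd expands `%`-prefixed specifiers in unit files (for example, `%h` is the user's home directory and `%u` the user name), and a literal percent sign has to be written as `%%`. That is consistent with the "Failed to resolve unit specifiers" message above. A minimal sketch of the escaping follows. `escape_systemd_percent` is a hypothetical helper, and this route is an assumption that was not tested in this thread.

```python
def escape_systemd_percent(arg):
    """Double every '%' so systemd passes it through as a literal percent sign."""
    return arg.replace("%", "%%")


# The GoAccess log format used in this thread.
log_format = '%h %^[%d:%t %^] "%r" "%s" %b "%R" "%u" %^'

# Print the form that could be pasted into an ExecStart= line.
print(escape_systemd_percent(log_format))
```

The same doubling would apply to the `--date-format` and `--time-format` values. Keeping the formats in a GoAccess config file avoids the escaping altogether.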

UpperCenter commented 3 years ago

You're right, it seems systemd doesn't like one of the characters used in the log format. I added these options to my global config file at /etc/goaccess/goaccess.conf, but I'm still getting a parse error:

######################################
# Time Format Options (required)
######################################
time-format=%T

######################################
# Date Format Options (required)
######################################
date-format=%d/%b/%Y

######################################
# Log Format Options (required)
######################################
log-format='%h %^[%d:%t %^] "%r" "%s" %b "%R" "%u" %^'

GoAccess - version 1.4.6 - Mar  1 2021 03:28:39
Config file: /etc/goaccess/goaccess.conf

Fatal error has occurred
Error occurred at: src/settings.c - parse_conf_file - 288
Malformed config key at line: 4

0bi-w6n-K3nobi commented 3 years ago

Hi @UpperCenter.

In goaccess.conf, the format options are written a little differently: use a space instead of = and no quotes. For example:

time-format %T
date-format %d/%b/%Y
log-format %h ...

I hope that helps.

UpperCenter commented 3 years ago

Thanks @0bi-w6n-K3nobi, I get this error now:

==7125== Token '/image/yhvMe.png' doesn't match specifier '%s'
==7125== Token '/image/ePiNq.png' doesn't match specifier '%s'
==7125== Token '/image/HORtR.png' doesn't match specifier '%s'
==7125== Token '/image/8lxdL.png' doesn't match specifier '%s'
==7125== Token '/image/nRpjT.png' doesn't match specifier '%s'
==7125== Token '/image/JySew.png?1621382010' doesn't match specifier '%s'
==7125== Token '/encyclopedia/specie/385' doesn't match specifier '%s'
==7125== Token '/' doesn't match specifier '%s'
==7125== Token '/view/GpEfz' doesn't match specifier '%s'
==7125== Token '/view/FZC6o' doesn't match specifier '%s'

0bi-w6n-K3nobi commented 3 years ago

Hi @UpperCenter

I didn't express myself well, sorry. "No quotes" means no ' characters around the format. For example:

log-format %h %^[%d:%t %^] "%r" "%s" %b "%R" "%u" %^
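To make the config-file syntax concrete, here is a small Python sketch that renders the three options in the form described above: key, a single space, then the value with no surrounding quotes. `render_goaccess_conf` is a hypothetical helper for illustration, not part of GoAccess.

```python
def render_goaccess_conf(options):
    """Render options as 'key value' lines: space separator, no '=' and no quotes."""
    return "\n".join(f"{key} {value}" for key, value in options.items()) + "\n"


conf = render_goaccess_conf({
    "time-format": "%T",
    "date-format": "%d/%b/%Y",
    # Double quotes inside the format are part of the format itself and stay.
    "log-format": '%h %^[%d:%t %^] "%r" "%s" %b "%R" "%u" %^',
})
print(conf, end="")
```

The double quotes around `%r`, `%s`, `%R` and `%u` stay because they match literal quotes in the log lines. Only the outer single quotes, which the shell needed, are dropped.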

UpperCenter commented 3 years ago

Thanks guys, it's all working fine now!