Open H1r0Sh1mA opened 9 months ago
Does this occur when you execute it directly on the machine, without using Docker?
Does this occur when you execute it directly on the machine, without using Docker?
Standalone goaccess working as intended (with minimum workload).
Got some updates:
In my initial question i didnt mention one important detail - my access log rotating every 1 hour. Its inode number does not changing Here are rotation options:
/var/log/docker/*.log {
hourly
rotate 336
copytruncate
dateext
datehourago
dateformat _%Y-%m-%d_%H%M%S
extension .log
missingok
compress
notifempty
sharedscripts
postrotate
docker exec `docker ps -q -f name=nginx_container` nginx -s reload > /dev/null 2>/dev/null || true
endscript
}
The 1st hour after goaccess started im getting an accurate stats.
At the moment the second hour ticks and the access.log has been rotated (access.log inode value is still the same) - then the deviations started to appears. Some of them: In Overall Analyzed Requests pane
On the Time Distribution pane
Hour before the access.log has been rotated: hits -> 849,513 visitors-> 48,750 TX Amount -> 11Gb
this stats are fine
this stats received in the first 20 mins of a hour and they are NOT correct
Just to know the correct values for the request count during the fresh hour: wc -l access.log
returns 213618 total requests
So im expecting to receive ~ 1,063,131 total requests instead of 6,981,924 in goaccess dashboard
Are you running this using tail -F
? Have you tried passing the log directly, for example?
# goaccess access.log -o /reports/access.log.html --real-time-html --origin=https://my.domain/ --ws-url=wss://my.domain/ws --log-format='%^ %^ %^ %^ %^ %^ %v %h %^[%d:%t %^] "%r" %s %b "%R" "%u" "%^" "%h:%^" "%T"' --time-format='%T' --date-format='%d/%b/%Y' --db-path=/internal_db --geoip-database=/geo_db/GeoIP2-Country.mmdb --persist --restore
Are you running this using
tail -F
? Have you tried passing the log directly, for example?# goaccess access.log -o /reports/access.log.html --real-time-html --origin=https://my.domain/ --ws-url=wss://my.domain/ws --log-format='%^ %^ %^ %^ %^ %^ %v %h %^[%d:%t %^] "%r" %s %b "%R" "%u" "%^" "%h:%^" "%T"' --time-format='%T' --date-format='%d/%b/%Y' --db-path=/internal_db --geoip-database=/geo_db/GeoIP2-Country.mmdb --persist --restore
in my docker container im running goaccess like this (as stated in my docker-compose file):
command: >
/app/logs/access.log
-o /app/report/access.log.html
--real-time-html
--origin=https://my.domain
--ws-url=wss://my.domain/ws
--log-format='%^ %^ %^ %^ %^ %^ %v %h %^[%d:%t %^] "%r" %s %b "%R" "%u" "%^" "%h:%^" "%T"'
--keep-last=30
--time-format='%T'
--date-format='%d/%b/%Y'
--db-path=/app/internal_db
--geoip-database=/app/geo_db/GeoIP2-Country.mmdb
--geoip-database=/app/geo_db/GeoLite2-ASN.mmdb
--ignore-panel=OS
--ignore-panel=BROWSERS
--persist
--restore
so the log goes directly to goaccess without piping
does it function properly outside of Docker when passing the log directly?
does it function properly outside of Docker when passing the log directly?
Yep, standalone goaccess (outside Docker env) working as intended (with minimum workload).
Got it. I'm just trying to pinpoint the problem as precisely as possible. So, it seems the issue arises when running via Docker or when piping data in using tail -F, right?
If you remove --persist and --restore, does the issue still occur with the real-time counters? Also, are you experiencing the same issues with terminal output? Any additional details would be appreciated so I can attempt to reproduce this on my end. Thanks!
The issue arises when running via Docker.
I did not use tail -F
as i read its more advisable to send logs direct to goaccess to avoid potential issues.
I will try to remove --persist
and --restore
options and report as soon as possible.
I will also try to check the TUI behavior.
I can provide you with the docker-compose file im using.
Thank you so much for your time.
Update
removing of --persist
and --restore
options does not help.
Here is the docker-compose file
parser:
image: allinurl/goaccess
logging:
driver: syslog
options:
tag: "goaccess_parser"
expose:
- "7890"
command: >
/app/logs/access.log
-o /app/report/acces.log.html
--jobs=2
--real-time-html
--origin=https://my.domain
--ws-url=wss://my.domain/ws
--log-format='%^ %^ %^ %^ %^ %v %h %^ %^[%d:%t %^] \"%r\" %s %b \"%R\" \"%u\" \"%^\" \"%^\" \"%T\"'
--keep-last=30
--time-format=%H:%M:%S
--date-format=%d/%b/%Y
--db-path=/app/internal_db
--geoip-database=/app/geo_db/GeoIP2-Country.mmdb
--geoip-database=/app/geo_db/GeoLite2-ASN.mmdb
--ignore-panel=OS
--ignore-panel=BROWSERS
volumes:
- ./logs:/app/logs:ro
- ./reports:/app/report
- ./parser/internal_db:/app/internal_db
- ./parser/geo_db:/app/geo_db
Unique Visitors Per Day Pane (wrong values):
CUM. T.S\MAX. T.S values looking odd to me
Time Distribution Pane (15h is ok, 16h is wrong. 16h values received in 11 minutes)
Update
Time Distribution Pane:
Hours from different days overlap each other: Say - Yesterday 13th hour has 513k hits. When 13th hour ticks today, then its value starting to add to yesterdays 13th hour (513k for yesterday + XXXk for today)
@H1r0Sh1mA, just to clarify, are you encountering two separate issues? One regarding the real-time problem with Docker and the other regarding the accuracy of the stats numbers? If so, I suggest opening a new issue for the stats so we can address them separately.
Regarding the discrepancy with the cum/max/avg time served, it seems like there might be an issue with your log format. Could you share the actual lines from your access log? That would help us investigate further. Thanks
Could you share the actual lines from your access log? That would help us investigate further.
Feb 17 12:00:04 nginx_hostname nginx_container[13971]: my.domain 109.124.233.55 - - [17/Feb/2024:12:00:04 +0000] "POST /orders/event HTTP/1.1" 200 0 "-" "-" "0.001" "10.0.4.163:80" "0.002"
the log format arguments:
--log-format='%^ %^ %^ %^ %^ %v %h %^ %^[%d:%t %^] \"%r\" %s %b \"%R\" \"%u\" \"%^\" \"%^\" \"%T\"'
escaping trails (like:
\"%r\"
) are using due to the docker-compose format style. Now im issuing it like this:command: [ "/app/logs/accesst.log", "-o", "/app/report/access.html", "--jobs=8", "--keep-last=3", "--real-time-html", "--origin=https://my.domain", "--ws-url=wss://my.domain/ws", "--log-format='%^ %^ %^ %^ %^ %v %h %^ %^[%d:%t %^] \"%r\" %s %b \"%R\" \"%u\" \"%^\" \"%^\" \"%T\"'", "--time-format=%H:%M:%S", "--date-format=%d/%b/%Y", "--date-spec=min", "--db-path=/app/internal_db", "--geoip-database=/app/geo_db/GeoIP2-Country.mmdb", "--geoip-database=/app/geo_db/GeoLite2-ASN.mmdb", "--ignore-panel=OS", "--ignore-panel=BROWSERS", "--persist", "--restore" ]
Didnt have a chance yet to test
--datetime-format='%d/%b/%Y:%H:%M:%S %z'
instead of--time-format=%H:%M:%S
and--date-format=%d/%b/%Y
just to clarify, are you encountering two separate issues? One regarding the real-time problem with Docker and the other regarding the accuracy of the stats numbers?
I am experiencing all these issues in docker environment with real-time parsing mode. That why i have put them all here.
To avoid confusion i will open issues one by one separately.
no updates so far?
Some questions here...
It seems like there might be an issue with your logrotate configuration. Can you verify how it's performing?
Can you also share a simple way to reproduce this issue? I need the most basic command that triggers this issue so that we can narrow down the parameters that might not be causing it. e.g.,
/app/logs/access.log
-o /app/report/access.log.html
--real-time-html
--log-format='%^ %^ %^ %^ %^ %^ %v %h %^[%d:%t %^] "%r" %s %b "%R" "%u" "%^" "%h:%^" "%T"'
--time-format='%T'
--date-format='%d/%b/%Y'
tail -F access.log | docker run -p 7890:7890 --rm -i -e LANG=$LANG allinurl/goaccess -a -o html --log-format COMBINED --real-time-html - > report.html
GoAccess ver 1.9.1 Env: docker container Command:
Problem: During the real-time parsing goaccess re-parsing the entire log file every time a new record is added, causing it to count all the requests and stats again, including those that were already counted in previous parsing iteration.
Describing in more details: we have 1 record in access log (total 1 line) -> report returns 1 total/valid request (all stats seems fine) adding 1 record to access log (total 2 lines) -> report returns 3 total/valid/ requests (goaccess read the log file from the beginning and adding the 1st and the 2d records to the previous iteration) adding 1 record to access log (total 3 lines) -> report returns 6 total/valid requests (goaccess read the log file from the beginning again and adding all the 3 records to previous iteration) etc.....
The one shot (no real-time monitoring) execution works as intended.