allinurl / goaccess

GoAccess is a real-time web log analyzer and interactive viewer that runs in a terminal in *nix systems or through your browser.
https://goaccess.io
MIT License

SUGGESTION: goaccess 2.0 #2027

Closed by danilort 3 years ago

danilort commented 3 years ago

GoAccess is very nice, but the HTML report is too static.

I suggest some changes:

How would it work? There are two components.

A process that runs periodically (via crontab?) to parse the Apache log and store the data in a database. FIELDS:

ip_address
datetime
user_agent
status_code
os
....

An application (PHP?) that reads the data from the database and builds the HTML report. I could refine the report using filters, which would make it easy to extract the data that interests me from the database and display it; only the SQL query changes. FILTERS:

DATE
    today 
    last week
    last month
    from date to date

STATUS CODE
    all
    only 200
    all except 200
    select
        [X] 200
        [_] 204
        [X] 404
        [_] 500

What do you think about it?
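A minimal sketch of the two-component idea above. The table layout, field names, and sample rows are all hypothetical, and SQLite stands in for the MySQL/PHP stack mentioned, just to keep the example self-contained:

```python
# Hypothetical sketch: a cron-fed table plus a filtered report query.
# SQLite stands in for MySQL; all names and sample data are made up.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE access_log (
        ip_address  TEXT,
        datetime    TEXT,
        user_agent  TEXT,
        status_code INTEGER,
        os          TEXT
    )
""")
# Rows the periodic (cron) log parser would have inserted.
conn.executemany(
    "INSERT INTO access_log VALUES (?, ?, ?, ?, ?)",
    [
        ("203.0.113.7",  "2021-05-01 10:00:00", "Mozilla/5.0", 200, "Linux"),
        ("203.0.113.7",  "2021-05-01 10:00:05", "Mozilla/5.0", 404, "Linux"),
        ("198.51.100.2", "2021-05-02 11:30:00", "curl/7.68",   500, "Other"),
    ],
)

def report(start, end, statuses):
    """Filtered report: each filter choice only changes the WHERE clause."""
    marks = ",".join("?" * len(statuses))
    sql = ("SELECT ip_address, datetime, status_code FROM access_log "
           f"WHERE datetime BETWEEN ? AND ? AND status_code IN ({marks})")
    return conn.execute(sql, (start, end, *statuses)).fetchall()

# DATE: from date to date; STATUS CODE: select [X] 200, [X] 404
rows = report("2021-05-01 00:00:00", "2021-05-01 23:59:59", [200, 404])
```

Each DATE or STATUS CODE filter in the proposal maps onto one condition in the WHERE clause, which is why only the SQL query would need to change between report views.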

0bi-w6n-K3nobi commented 3 years ago

Hi @danilort .

What I can tell you is that we are thinking about something with a DB. Well... your CRON idea may be cool, but it does not work for real-time monitoring or for large sites. I myself monitor websites with 800 million requests per day.

The filter idea is not new, but it is always welcome. It is something I believe would be both practical and ideal, and I hope it can be implemented in the future.

Well... I am not the right person to speak to this. Let's hope @allinurl weighs in on it himself.

Thank you for your attention and your willingness to help.

allinurl commented 3 years ago

Thanks for posting these suggestions, certainly appreciated! As for filters, #117 should address that suggestion; please take a look at it for more details (it is probably the most requested feature). It is in the works and, as you said, it would be great for a v2.0.

Like @0bi-w6n-K3nobi mentioned, as of now we're looking to add an additional storage/DB backend with greater insertion capacity. Last year I ran some tests with SQLite, but as more records were inserted, things started to slow down. I mentioned to @0bi-w6n-K3nobi that LSM-based storage engines are performing pretty well so far, so most likely the additional storage will be based on an LSM.

As far as SQL goes, right now we can use filtering tools such as awk, sed, and grep to get a filtered static report. However, I want to make sure that users can filter and achieve similar results without the extra complexity of needing to know SQL or any of those tools.
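For illustration, the kind of awk/grep-style prefilter described above can be mimicked in a few lines. The log lines are made up, and the field position assumes the common combined log format:

```python
# Minimal stand-in for a `grep ' 404 '`-style prefilter over a combined-format
# access log, the kind of filtered static report awk/sed/grep can produce.
# Sample lines are invented for illustration.
log_lines = [
    '203.0.113.7 - - [01/May/2021:10:00:00 +0000] "GET / HTTP/1.1" 200 512',
    '203.0.113.7 - - [01/May/2021:10:00:05 +0000] "GET /x HTTP/1.1" 404 196',
    '198.51.100.2 - - [02/May/2021:11:30:00 +0000] "GET /y HTTP/1.1" 500 0',
]

def filter_by_status(lines, wanted):
    """Keep lines whose status field (9th whitespace field in the
    combined format) is in `wanted`, like awk '$9 == 404' would."""
    return [line for line in lines if line.split()[8] in wanted]

only_404 = filter_by_status(log_lines, {"404"})
```

The filtered lines could then be fed to the analyzer to produce a static per-status report; the point in the comment above is exactly that users should not have to build such pipelines themselves.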

danilort commented 3 years ago

Thanks (to both of you) for the answers and the work.

I found that Apache can write its log directly to MySQL (or another DB); see https://escapequotes.net/save-apache-log-in-mysql-database/.

I want to try it. If it works, I only have to write a PHP script (with some search fields) to display the data, which is not complicated.
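One way to wire this up is Apache's standard piped-log mechanism, where `CustomLog "|/path/to/script" combined` feeds each log line to a long-running script. A rough sketch of such an ingest script; the script name and table layout are assumptions, and SQLite stands in for MySQL so the example is self-contained:

```python
# Hypothetical ingest script for Apache piped logging. In httpd.conf it
# would be attached with something like:
#   CustomLog "|/usr/local/bin/ingest.py" combined
# SQLite stands in for MySQL; table name and columns are made up.
import sqlite3

def ingest(lines, conn):
    """Parse each combined-format line and store the fields of interest."""
    conn.execute("CREATE TABLE IF NOT EXISTS access_log "
                 "(ip_address TEXT, status_code INTEGER)")
    for line in lines:
        fields = line.split()
        if len(fields) > 8:  # skip malformed/short lines
            conn.execute("INSERT INTO access_log VALUES (?, ?)",
                         (fields[0], int(fields[8])))
    conn.commit()

# In production this would read the piped log from stdin, e.g.:
#   ingest(sys.stdin, sqlite3.connect("/var/db/access_log.db"))
conn = sqlite3.connect(":memory:")
ingest(['203.0.113.7 - - [01/May/2021:10:00:00 +0000] '
        '"GET / HTTP/1.1" 200 512'], conn)
```

The PHP reporting script mentioned above would then just query this table with whatever WHERE clause the search fields produce.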

I am not interested in real time, and I have few accesses. I would use the statistics out of curiosity (200), but also to spot bots (404) and server errors (500), and to see how well my blocking strategies work (403). Filtering at parse time (awk, sed, grep) is not good for me.

allinurl commented 3 years ago

Awesome. I'm closing this just because both features that you suggested are being worked on and have been suggested before.

Feel free to reopen it if needed.