allinurl / goaccess

GoAccess is a real-time web log analyzer and interactive viewer that runs in a terminal in *nix systems or through your browser.
https://goaccess.io
MIT License

Would GoAccess be capable of analyzing webmail logs generated with RoundCube? #1210

Open agam8414 opened 5 years ago

agam8414 commented 5 years ago

Hi everyone! I have a webmail log generated with RoundCube, and I need to work out which concrete actions were performed in each session by a given IP accessing a certain email account.

Would GoAccess be able to do the job?

Many thanks in advance, and Best Regards, Alex

P.S.: By creating a new dummy email account "test", whose log is initially empty, and performing some basic tasks, I was able to capture the lines freshly written to the log for each action. For instance:


Writing a new mail:

94.X.Y.84 - test%40domain.com [08/21/2018:19:45:55 -0000] "GET /cpsess1192113653/3rdparty/roundcube/?_task=mail&_mbox=INBOX&_action=compose HTTP/1.1" 302 0 "https://cp5024.company.eu:2096/cpsess1192113653/3rdparty/roundcube/?_task=mail&_mbox=INBOX" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36" "s" "-" 2096
94.X.Y.84 - test%40domain.com [08/21/2018:19:46:10 -0000] "GET /cpsess1192113653/3rdparty/roundcube/?_task=mail&_action=compose&_id=1594220085b7c6c0280ed1 HTTP/1.1" 200 0 "https://cp5024.company.eu:2096/cpsess1192113653/3rdparty/roundcube/?_task=mail&_mbox=INBOX" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36" "s" "-" 2096

Accessing a mail:

94.X.Y.84 - test%40domain.com [08/21/2018:19:47:30 -0000] "GET /cpsess1192113653/3rdparty/roundcube/?_task=mail&_caps=pdf%3D1%2Cflash%3D0%2Ctiff%3D0%2Cwebp%3D0&_uid=2&_mbox=INBOX&_action=show HTTP/1.1" 200 0 "https://cp5024.company.eu:2096/cpsess1192113653/3rdparty/roundcube/?_task=mail&_mbox=INBOX" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36" "s" "-" 2096
94.X.Y.84 - test%40domain.com [08/21/2018:19:47:45 -0000] "GET /cpsess1192113653/3rdparty/roundcube/?_task=addressbook&_action=photo&_email=test%40domain.com HTTP/1.1" 200 0 "https://cp5024.company.eu:2096/cpsess1192113653/3rdparty/roundcube/?_task=mail&_caps=pdf%3D1%2Cflash%3D0%2Ctiff%3D0%2Cwebp%3D0&_uid=2&_mbox=INBOX&_action=show" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36" "s" "-" 2096
94.X.Y.84 - test%40domain.com [08/21/2018:19:47:46 -0000] "GET /cpsess1192113653/3rdparty/roundcube/?_task=mail&_action=pagenav&_uid=2&_mbox=INBOX&_remote=1&_unlock=loading1534880865585&_=1534880865305 HTTP/1.1" 200 0 "https://cp5024.company.eu:2096/cpsess1192113653/3rdparty/roundcube/?_task=mail&_caps=pdf%3D1%2Cflash%3D0%2Ctiff%3D0%2Cwebp%3D0&_uid=2&_mbox=INBOX&_action=show" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36" "s" "-" 2096
94.X.Y.84 - test%40domain.com [08/21/2018:19:47:46 -0000] "GET /cpsess1192113653/3rdparty/roundcube/?_task=mail&_action=getunread&_remote=1&_unlock=0&_=1534880865306 HTTP/1.1" 200 0 "https://cp5024.company.eu:2096/cpsess1192113653/3rdparty/roundcube/?_task=mail&_caps=pdf%3D1%2Cflash%3D0%2Ctiff%3D0%2Cwebp%3D0&_uid=2&_mbox=INBOX&_action=show" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36" "s" "-" 2096

Erasing a mail:

94.X.Y.84 - test%40domain.com [08/21/2018:19:49:00 -0000] "POST /cpsess1192113653/3rdparty/roundcube/?_task=mail&_action=move HTTP/1.1" 200 0 "https://cp5024.company.eu:2096/cpsess1192113653/3rdparty/roundcube/?_task=mail&_mbox=INBOX" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36" "s" "-" 2096

allinurl commented 5 years ago

Question: are you essentially trying to group these actions? You probably need to preprocess the log with tools such as awk, grep, sed, etc. and then create specific reports from goaccess to display that filtered data.

agam8414 commented 5 years ago

For sure, I can preprocess the log with awk, grep, sed, etc. In fact, this is the first thing I did, in order to narrow down the number of columns and isolate just the columns that help me identify which IP, what date/time, and which operating system. But after analyzing that, the goal is slightly different: I need to determine what actions were performed by a concrete actor (i.e., a concrete IP). The data for such actions is inside the GET requests. And it seems that one standard action for us humans, like "accessing an email", becomes 4 sub-actions inside the log.

So it would be difficult for me to identify each standard action if I have to do it myself with commands and by eye.

Another thing that further complicates the goal is that each GET request is almost unreadable to the naked eye.
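To make those requests easier to read, one option is to pull the RoundCube query parameters out of each line before doing anything else. The following is only a minimal sketch, not something GoAccess provides: it assumes every line follows the layout of the samples posted above, the names `LINE_RE` and `parse_line` are made up for illustration, and the timezone offset is ignored for simplicity.

```python
import re
from datetime import datetime
from urllib.parse import parse_qs, unquote, urlsplit

# Assumed layout, taken from the sample lines in this thread:
# IP - user [MM/DD/YYYY:HH:MM:SS zone] "METHOD URL PROTO" status ...
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ (?P<user>\S+) \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" (?P<status>\d{3}) '
)

def parse_line(line):
    """Return a dict with the readable parts of one log line, or None."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    # parse_qs also decodes percent-escapes in the parameter values.
    query = parse_qs(urlsplit(m["url"]).query)
    # Keep only date and time (first 19 characters); the offset is dropped.
    when = datetime.strptime(m["ts"][:19], "%m/%d/%Y:%H:%M:%S")
    return {
        "ip": m["ip"],
        "user": unquote(m["user"]),
        "time": when,
        "method": m["method"],
        "status": int(m["status"]),
        "task": query.get("_task", [""])[0],
        "action": query.get("_action", [""])[0],
        "uid": query.get("_uid", [""])[0],
        "mbox": query.get("_mbox", [""])[0],
    }

# The "Erasing a mail" line from the first post, unchanged.
SAMPLE = (
    '94.X.Y.84 - test%40domain.com [08/21/2018:19:49:00 -0000] '
    '"POST /cpsess1192113653/3rdparty/roundcube/?_task=mail&_action=move HTTP/1.1" '
    '200 0 "https://cp5024.company.eu:2096/cpsess1192113653/3rdparty/roundcube/'
    '?_task=mail&_mbox=INBOX" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 '
    '(KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36" "s" "-" 2096'
)

if __name__ == "__main__":
    print(parse_line(SAMPLE))
```

With the fields separated like this, the `_task` and `_action` values can be inspected or filtered directly instead of reading the raw URL.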

The thing is that I found this tool and thought it could do the hard work for me. So, can it help me out somehow? Basically, what I need GoAccess to identify for me is something like this:

94.X.Y.84 did 7 sessions.
Session '#1' was at this Date/Time, and performed these actions: open email#1, download attached documents, marked email#1 as unread, ....., finally close the session at this Date/Time
Session '#2' was at .....
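A report of that shape could be approximated outside GoAccess with a small script. The sketch below is an assumption-laden illustration, not an existing tool: it takes already-parsed events as `(ip, time, status, task, action)` tuples, starts a new session after 30 minutes of inactivity (an arbitrary threshold), skips 3xx redirects, and uses a hypothetical `LABELS` table built only from the three sample actions shown earlier in this thread. Requests without a label (such as `photo`, `pagenav` or `getunread`) are treated as helper sub-actions and dropped.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumption: 30 minutes without any request ends a session.
SESSION_GAP = timedelta(minutes=30)

# Hypothetical labels, derived only from the sample lines in this thread.
LABELS = {
    ("mail", "compose"): "compose mail",
    ("mail", "show"): "open mail",
    ("mail", "move"): "move mail (e.g. to Trash)",
}

def sessions_by_ip(events, gap=SESSION_GAP):
    """Group (ip, time, status, task, action) tuples into sessions per IP.

    Returns {ip: [[(time, label), ...], ...]} with one inner list per session.
    """
    per_ip = defaultdict(list)
    last_seen = {}
    for ip, when, status, task, action in sorted(events, key=lambda e: (e[0], e[1])):
        # Every request counts as activity, even if it is not reported.
        if ip not in last_seen or when - last_seen[ip] > gap:
            per_ip[ip].append([])
        last_seen[ip] = when
        if status // 100 == 3:
            continue  # redirect; the follow-up request carries the action
        label = LABELS.get((task, action))
        if label is None:
            continue  # helper sub-action with no human-level meaning
        per_ip[ip][-1].append((when, label))
    # Drop sessions that ended up containing only helper requests.
    return {ip: [s for s in sess if s] for ip, sess in per_ip.items()}

if __name__ == "__main__":
    day = datetime(2018, 8, 21)
    events = [
        ("94.X.Y.84", day.replace(hour=19, minute=45, second=55), 302, "mail", "compose"),
        ("94.X.Y.84", day.replace(hour=19, minute=46, second=10), 200, "mail", "compose"),
        ("94.X.Y.84", day.replace(hour=19, minute=47, second=30), 200, "mail", "show"),
        ("94.X.Y.84", day.replace(hour=19, minute=47, second=45), 200, "addressbook", "photo"),
        ("94.X.Y.84", day.replace(hour=19, minute=49, second=0), 200, "mail", "move"),
    ]
    for ip, sessions in sessions_by_ip(events).items():
        print(f"{ip} did {len(sessions)} session(s).")
        for n, session in enumerate(sessions, start=1):
            actions = ", ".join(label for _, label in session)
            print(f"Session #{n} started at {session[0][0]}: {actions}")
```

Splitting on inactivity is only one possible definition of a session; the `cpsess...` token in the URL might be a better boundary, but that is a guess about how those tokens behave and would need checking against real logs.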

allinurl commented 5 years ago

Got it. Unfortunately, this is not possible at the moment. However, I plan to address this (or something close to it) in #117. Thanks for the explanation though, I'll keep this in mind.

agam8414 commented 5 years ago

Yeah, thank you so much for your time anyway. Please keep me posted as you make progress on this issue, and please don't hesitate to ask me for help in accomplishing this goal; I'll be glad to help. Just in case you need it, here is the last command I used to filter the massive RoundCube log file (perhaps you already know how to do this, but if not, here is an example). Below, you'll find what it does (remove the "(#)" markers if you plan to use it):

sed -n '/rip=[0-9]/p' log_maillog.txt (#1) | sed 's/\/var\/log\/maillog-[0-9]\{8\}:// (#2) ; s/imap-login:/imap-login/; s/pop3-login:/pop3-login/; s/rip=// (#3) ; s/\(\([0-9]\{1,3\}\.\)\{3\}[0-9]\{1,3\}\),/\1/' (#4) | awk '{print $10 "\t" $1, $2, $3, $6}' (#5) > mail_accesses_list.tx

I strongly believe this RoundCube log analysis would be an interesting thing to accomplish, because I don't think there is any software out there that does this.

allinurl commented 5 years ago

Thanks for posting that. Please keep this open so I can take a look at it.