linagora / tmail-backend

GNU Affero General Public License v3.0
42 stars 22 forks source link

[MU] List inactive users #1298

Open quantranhong1999 opened 2 weeks ago

quantranhong1999 commented 2 weeks ago

Why

MU requires us the ability to report the inactive users list for example the users who have not used our services in the last 3 months.

We need to record the users's login activity.

Today we have user login activity using AuditTrail with login logs stored in Loki. However upon large data size (TMail produces a lot of logs), Loki strips a part of the logs to give an in-time response, which makes the users activity analysis not reliable.

We need to store that piece of information somewhere else rather than Loki.

How

Storage

Redis seems to be the suitable storage for the need: famous for session data use case, fast write, AOF seems reasonable for data backup (could accept a bit of data lag for user activity use case).

Code change

We likely could override Authenticator e.g. write a TMailUserRepositoryAuthenticator that injects UserRepositoryAuthenticator with the Redis part to publish the user login information.

For Redis data structure, we can likely use SET:

We would need a webadmin route to list:

We would need to have a module chooser to enable the user login recording module.

DoD

Integration tests + docs.

Could be another task to deploy the work on MU after careful testing.

chibenwa commented 2 weeks ago

key could be the lastConnectionDate/{username}

chibenwa commented 2 weeks ago

=> We would need a webadmin route

Please detail the webadmin route

chibenwa commented 2 weeks ago

Cc @guimard

quantranhong1999 commented 2 weeks ago

Please detail the webadmin route

Could be something like:

curl -XGET /reports/users/usernameToBeUsed?status=inactive
would list the users who have never accessed TMail service

curl -XGET /reports/users/usernameToBeUsed?status=inactive&duration=3months
would list the users who have not accessed TMail service for the last 3 months
chibenwa commented 2 weeks ago

No lets actually just expose what is stored.

We can add further per user info as: ip, mail user agent

Often that's very usefull for audit.

We shall separate IMAP and SMTP

How about:

curl -XGET /reports/users/connection
[
 {
   "username": "btellier@linagora.com",
   "imap": {
       "lastConnectionDate": "XXXX",
       "ipAddress": "XXX",
       "userAgent": "XXX"
   },
   "smtp": {
       "lastConnectionDate": "XXXX",
       "ipAddress": "XXX",
       "userAgent": "XXX"
   }
  },
...
]

Then filters:

curl -XGET /reports/users/connection?activitySince=3month

would return a list of user active during this duration

And of course per user queries:

curl -XGET /reports/users/connection/btellier@linagora

 {
   "username": "btellier@linagora.com",
   "imap": {
       "lastConnectionDate": "XXXX",
       "ipAddress": "XXX",
       "userAgent": "XXX"
   },
   "smtp": {
       "lastConnectionDate": "XXXX",
       "ipAddress": "XXX",
       "userAgent": "XXX"
   }
  }
chibenwa commented 1 week ago

I move this into TODO as this is needed for MU

Cc @guimard for review.

Cc @PatrickPereiraLinagora for review.

hungphan227 commented 1 day ago

If the data structure is just key-value, redis could be good option. However, the json document show that it is quite more complicated. Therefore, I suggest we save data in a new cassandra table.

Arsnael commented 1 day ago

Nope no Cassandra table for this, it's the worst option really. We dont want to create yet an other table.

The description of the ticket seems clear enough to me about Redis choice.

Arsnael commented 1 day ago

For Redis data structure, we can likely use SET:

With the fact that we need to store different kind of infos (protocol, last connection date, ip, user agent), better to use either I think HSET https://redis.io/docs/latest/commands/hset/ or JSON.SET https://redis.io/docs/latest/commands/json.set/

chibenwa commented 1 day ago

I prefer json.set : will look less hacky at describing an object

Arsnael commented 1 day ago

Alright I reviewed the ticket again as the initial proposition was relying on storing only the last connection date, and not extra data like protocol, ip address or user agent. So I feel like by overriding the Authenticator we would not have those info, thus the description is outdated IMO.

However this is what I believe we could do (and if I'm wrong please correct me):

WDYT?

Also for storing last login info on Redis, do we make it mandatory (of course if we use Redis), or configurable (enable/disable)?

chibenwa commented 1 day ago

JMAP (cause I guess we need to track that too?)

No lemon do.

chibenwa commented 1 day ago

I'm against code specific to Tmail code.


Proposal 1

What's wrong with the proposal of getting those infos through the MDC / context?

I bet it would be doable. A bit hacky but we coult still rely on Authenticator

Code snippet:

org.slf4j.MDC.get("ip")
clock.instant()
org.slf4j.MDC.get("protocol")
org.slf4j.MDC.get("user-agent")

James side i might just cost os to propagate the MDC upon line countinuations and login.


Proposal 2

How about defining an API triggered upon authentication succees / failures?

We could then make it configuration-loadable with user defined code as an extension mechanism.

We could then

Arsnael commented 9 hours ago

Not too sure honestly to fully get the 2nd proposal but I suggest we talk about this and have the final decision with the team on the kanban meeting next monday, I think it fits?