apache / flagon-distill

Apache Flagon Distill is a Python package to support and analyze Flagon UserAle.js logs
https://flagon.apache.org/
Apache License 2.0

Add function to segment logs into user sessions #27

Closed. EandrewJones closed this issue 2 months ago

EandrewJones commented 8 months ago

What are "User Sessions"?

Most user behavior services provide some definition of a user "session" and then segment the log stream into sessions for further analysis. For example, LogRocket defines a session as:

A session is a series of user interactions on your site, beginning with the first page they visit and ending with either: a.) a period of inactivity lasting longer than 30 minutes, or b.) after the user has navigated away from your app for more than 2 minutes. This includes closing the tab or navigating to a different domain on the tab.

"Activity" is defined as any user mouse movement, clicks, or scrolls.

As an example, if your user visits your landing page, then your app, and then refreshes the page all within 30 minutes of each other without closing the tab, the entire experience is recorded in a single session. If the user returns back to your site after another hour, a new session recording starts from the moment that they do the first action.

LogRocket sessions also support recording across multiple tabs, so a user opening a link in your app in a new tab will count as the same session. This means that if your app is running in multiple tabs, each tab would need to be navigated away from in order to end a session after 2 minutes. Otherwise, it wouldn't end until a period of inactivity across all tabs lasting longer than 30 minutes.
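To make the inactivity rule concrete, here is a minimal sketch of rule (a) only: a new session starts whenever more than 30 minutes pass between consecutive events. The function name `split_on_inactivity` and the use of plain `datetime` values are illustrative assumptions, not part of Distill or LogRocket, and the 2-minute navigate-away rule and multi-tab handling are deliberately left out.

```python
from datetime import datetime, timedelta

# Rule (a) from the definition above: 30 minutes of inactivity ends a session.
INACTIVITY_LIMIT = timedelta(minutes=30)


def split_on_inactivity(event_times, limit=INACTIVITY_LIMIT):
    """Group event timestamps into sessions (hypothetical helper).

    A new session starts whenever the gap since the previous event
    is strictly longer than `limit`.
    """
    sessions = []
    for t in sorted(event_times):
        if sessions and t - sessions[-1][-1] <= limit:
            sessions[-1].append(t)   # still within the current session
        else:
            sessions.append([t])     # first event, or gap exceeded the limit
    return sessions


# The scenario from the quoted example: three events within 30 minutes
# of each other, then a return to the site about an hour later.
events = [
    datetime(2024, 1, 1, 9, 0),    # landing page
    datetime(2024, 1, 1, 9, 10),   # app
    datetime(2024, 1, 1, 9, 25),   # refresh
    datetime(2024, 1, 1, 10, 30),  # returns after more than an hour
]
sessions = split_on_inactivity(events)
```

With these inputs the first three events fall into one session and the later return starts a second one.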

Why do we need "User Sessions"?

Sessions are a particularly useful unit by which to analyze user behavior since they represent a logical clustering of activity. Answers to simple questions such as:

Proposed change

We should add a method that segments the entire log stream into session buckets according to some definition of a "user session." It need not be the LogRocket definition shared above; however, I propose that as a reasonable starting point.

Jyyjy commented 7 months ago

A key function for this already exists: detect_deadspace.

https://github.com/apache/flagon-distill/blob/9aafe6aace33d602cfcc9e509a6eb2126d1fa8a9/distill/segmentation/segment.py#L408

Doing a simple URL filter followed by a detect_deadspace call should produce user session segments as described.
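As a rough illustration of the "filter by URL, then split on dead space" idea, the stand-alone sketch below does both steps in plain Python. It does not call detect_deadspace and makes no claim about that function's signature or return value. The helper name `sessions_for_url`, the 30-minute default, and the assumption that each log is a dict with `pageUrl` and millisecond `clientTime` fields are all hypothetical choices for the example.

```python
THIRTY_MINUTES_MS = 30 * 60 * 1000  # assumed dead-space limit, in milliseconds


def sessions_for_url(logs, url_prefix, gap_ms=THIRTY_MINUTES_MS):
    """Filter logs by URL prefix, then split them on idle gaps (hypothetical helper).

    Assumes each log is a dict with a "pageUrl" string and a numeric
    "clientTime" in milliseconds; both field names are assumptions here.
    """
    # Step 1: URL filter. Keep only logs recorded on the site of interest.
    kept = sorted(
        (log for log in logs if (log.get("pageUrl") or "").startswith(url_prefix)),
        key=lambda log: log["clientTime"],
    )

    # Step 2: dead-space split. A gap longer than gap_ms starts a new session.
    sessions = []
    for log in kept:
        if sessions and log["clientTime"] - sessions[-1][-1]["clientTime"] <= gap_ms:
            sessions[-1].append(log)
        else:
            sessions.append([log])
    return sessions


logs = [
    {"pageUrl": "https://example.com/app", "clientTime": 0},
    {"pageUrl": "https://other.example/", "clientTime": 60_000},
    {"pageUrl": "https://example.com/app/settings", "clientTime": 600_000},
    {"pageUrl": "https://example.com/app", "clientTime": 4_200_000},
]
result = sessions_for_url(logs, "https://example.com/")
```

Here the log from the other domain is dropped by the filter, the first two remaining logs share a session, and the last one starts a new session because it follows a gap of an hour.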