secondlife / jira-archive

[BUG-229298] Internal detection of anomalies regarding concurrent residents #7198

Open sl-service-account opened 4 years ago

sl-service-account commented 4 years ago

How would you like the feature to work?


# Detects sudden drops in concurrent users by comparing against a threshold.
# 2000 seems to be the sweet spot; false positives may appear below that point.
def isUnusualDropSudden(last, now, test = 2000):
    # A drop means "now" is lower than "last", so last - now is positive for
    # drops and negative for gains. Comparing it against a positive threshold
    # therefore flags only unusual drops, never unusual gains.
    drop = last - now
    if drop > test:
        print("Sudden drop greater than {} detected. Difference: -{}".format(test, drop))
        return True
    return False

The "test" value should be derived from statistics: the typical drop in users when there are no issues, for example the maximum drop seen over the past 24 hours. This avoids having to reconfigure the script. One method would be to find the last point where logins began to drop, sample until they begin to rise again, and then put the data through something like:


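def getMaxDrops(samples):
    # Largest drop between consecutive samples; 0 if the count never falls.
    last = None
    drops = [0]
    for sample in samples:
        # Skip the first sample: there is nothing to compare it against yet.
        if last is not None and sample < last:
            drops.append(last - sample)
        last = sample
    return max(drops)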
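
As a rough illustration of how the two pieces might fit together, the sketch below derives the threshold from a history of samples and feeds it to the sudden-drop check. The sample values are made up for the example, and the drop is computed as `last - now` so that a fall in residents comes out positive. This is only one possible wiring, not a tested monitoring setup.

```python
def getMaxDrops(samples):
    """Largest fall between consecutive samples; 0 if the count never falls."""
    drops = [0]
    for previous, current in zip(samples, samples[1:]):
        if current < previous:
            drops.append(previous - current)
    return max(drops)


def isUnusualDropSudden(last, now, test=2000):
    """True when the count fell by more than `test` between two samples."""
    return (last - now) > test


# Hypothetical concurrency samples from a quiet period (made-up numbers).
history = [41000, 40200, 39500, 39900, 38700, 39100]

# The largest normal drop in the history becomes the threshold.
threshold = getMaxDrops(history)

normal = isUnusualDropSudden(39100, 38500, threshold)   # fell by 600
unusual = isUnusualDropSudden(39100, 35000, threshold)  # fell by 4100
print(threshold, normal, unusual)
```

With these numbers the largest normal drop is 1200, so a fall of 600 passes and a fall of 4100 is flagged.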

Ideally, a mechanism that detects unusual gradual drops would also be in place (I tried, but couldn't figure out how to get signed differences to work). The sudden detection code above would only detect network issues, whereas gradual detection would catch login server outages or similar. Both are needed for the best coverage, since sudden drop detection would not catch a slow but unusual logout rate of residents.
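
One possible way to approach the gradual case, offered only as a sketch: keep a short window of recent samples and compare the newest against the oldest using a signed difference (old minus new), so a slow decline adds up to a positive value while gains come out negative. The `makeGradualDropDetector` name, the `window` size, and the `test` threshold below are all hypothetical and untuned.

```python
from collections import deque


def makeGradualDropDetector(window=6, test=3000):
    """Return a function that flags a slow decline across `window` samples.

    `window` and `test` are illustrative defaults, not tuned values.
    """
    recent = deque(maxlen=window + 1)  # oldest retained sample sits at index 0

    def check(now):
        recent.append(now)
        if len(recent) <= window:
            return False  # not enough history yet
        # Signed difference: positive when the count fell, negative when it rose.
        drop = recent[0] - now
        return drop > test

    return check


check = makeGradualDropDetector(window=4, test=3000)
# Made-up samples: each step falls by only 1000, which a 2000 sudden-drop
# threshold would not flag on its own.
samples = [40000, 39000, 38000, 37000, 36000, 35000]
flags = [check(s) for s in samples]
print(flags)
```

Here no single step exceeds the sudden-drop threshold, but the cumulative fall of 4000 across the window is flagged from the fifth sample onward. As with the sudden check, the threshold could be drawn from history rather than hard-coded.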

Why is this feature important to you? How would it benefit the community?

Something like this, if not already implemented, can work as a warning system if something has gone wrong with the network, login server, or otherwise.

Example of the code in action: statistics.png (2000x256). Blue is "pre-sample" data, used to emulate a history. White is normal change. Red is "unusual" change.

Original Jira Fields

| Field | Value |
| ------------- | ------------- |
| Issue | BUG-229298 |
| Summary | Internal detection of anomalies regarding concurrent residents |
| Type | New Feature Request |
| Priority | Unset |
| Status | Accepted |
| Resolution | Unresolved |
| Reporter | Chaser Zaks (chaser.zaks) |
| Created at | 2020-09-01T14:33:54Z |
| Updated at | 2020-09-02T18:48:09Z |

```
{
  'Build Id': 'unset',
  'Business Unit': ['Platform'],
  'Date of First Response': '2020-09-02T13:48:09.448-0500',
  'How would you like the feature to work?': '{code:python}\r\n# This works, it detects sudden drops in users by watching a threshold.\r\n# 2000 seems to be the sweet spot, may see false positives below that point.\r\ndef isUnusualDropSudden(last, now, test = 2000):\r\n # This works because subtracting a high value from the low value results\r\n # in a negetive value, but a low value from a high value results in positive\r\n # therefore, we will only see unusual drops, not unusual gains.\r\n if now - last > test:\r\n print("Sudden drop greater than {} detected. Difference: -{}".format(test, now-last))\r\n return True\r\n return False\r\n{code}\r\nStatistics should be passed as the "test" value. That being, average drop in users when there are not issues, such checking the max drop in the past 24 hours. This will prevent needing to reconfigure the script.\r\nExample method would be finding the last location where logins begin to drop, and sampling until they begin to rise again, then putting the data through something like:\r\n{code:python}\r\ndef getMaxDrops(samples):\r\n last = 0\r\n drops = [last]\r\n for sample in samples:\r\n if sample < last:\r\n drops.append(last-sample)\r\n last = sample\r\n return max(drops)\r\n{code}',
  'ReOpened Count': 0.0,
  'Severity': 'Unset',
  'Target Viewer Version': 'viewer-development',
  'Why is this feature important to you? How would it benefit the community?': 'Something like this, if not already implemented, can work as a warning system if something has gone wrong with the network, login server, or otherwise.\r\n\r\nExample of the code in action:\r\n!statistics.png!',
}
```

sl-service-account commented 4 years ago

Kyle Linden commented at 2020-09-02T18:48:09Z

Hi Chaser,

We definitely monitor concurrency but new ideas are always welcome.

Thanks!