trackme-limited / trackme-report-issues

The purpose of this repository is to allow the Splunk community to report issues and enhancement requests

feature - Machine Learning Outliers - allow using a custom MLTK algorithm #577

Closed chrishylanduk closed 5 months ago

chrishylanduk commented 5 months ago

Is your feature request related to a problem? Please describe.

Currently TrackMe uses the MLTK DensityFunction algorithm for outlier detection. This works well in some instances, but in others it does not. Changes that users might want to make, and which aren't currently possible, include:

  1. Segmenting time by working day vs non-working-day (weekends + local bank holidays)
  2. Accounting for local clock changes
  3. Fitting a Poisson distribution
  4. Modifying how BoundaryRanges are calculated for beta and gaussian KDE distributions, as DensityFunction currently has a bug which means the lower boundary can be above the upper boundary
  5. Using raw percentiles in the training period, rather than fitting a model
  6. Using time series forecasting
  7. Only flagging outliers if x periods in a row are outside the boundaries
  8. (No doubt many more!)
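As a rough illustration of items 5 and 7 above, here is a minimal sketch that takes boundaries straight from percentiles of the training period and only flags an outlier once several periods in a row are out of bounds. The function names (`percentile_bounds`, `flag_sustained_outliers`) and the default thresholds are hypothetical, chosen for this example; this is not TrackMe or MLTK code.

```python
from statistics import quantiles


def percentile_bounds(training, lower_pct=5, upper_pct=95):
    """Return (lower, upper) read directly from the training values.

    No distribution is fitted. As long as lower_pct <= upper_pct,
    the lower boundary cannot end up above the upper one.
    """
    # quantiles(n=100) returns the 99 cut points for percentiles 1..99.
    cuts = quantiles(training, n=100, method="inclusive")
    return cuts[lower_pct - 1], cuts[upper_pct - 1]


def flag_sustained_outliers(values, lower, upper, min_run=3):
    """Flag a period only once min_run consecutive periods are out of bounds."""
    flags = []
    run = 0  # length of the current out-of-bounds streak
    for v in values:
        run = run + 1 if (v < lower or v > upper) else 0
        flags.append(run >= min_run)
    return flags


# Hypothetical data: a training period of 1..100, then new observations.
lower, upper = percentile_bounds(list(range(1, 101)))
flags = flag_sustained_outliers(
    [50, 200, 200, 50, 200, 200, 200, 200, 50], lower, upper, min_run=3
)
```

In this example the first two spikes are ignored because the streak is broken after two periods; only the third and fourth periods of the longer streak are flagged.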

Describe the solution you'd like

TrackMe trying to meet all the above use cases, and more, doesn't seem realistic. However, if TrackMe allowed using a custom MLTK algorithm, rather than DensityFunction, then users could build exactly the models that meet their requirements. I presume there would need to be a specification of what inputs / outputs the algorithm would need to accept / return, so TrackMe could interpret them correctly.
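To make the idea of an input / output specification more concrete, one possible shape is sketched below, using a Poisson fit (item 3 in the list above) as the custom model. Everything here is an assumption for illustration: the `OutlierModel` protocol, the `PoissonOutlierModel` class and the `lower` / `upper` / `is_outlier` keys are hypothetical names, not part of TrackMe or MLTK.

```python
from __future__ import annotations

import math
from statistics import fmean
from typing import Protocol, Sequence


class OutlierModel(Protocol):
    """Hypothetical contract a custom algorithm could be asked to satisfy."""

    def fit(self, training: Sequence[float]) -> None: ...

    def apply(self, values: Sequence[float]) -> list[dict]: ...


def poisson_quantile(lam: float, q: float) -> int:
    """Smallest k with P(X <= k) >= q for X ~ Poisson(lam)."""
    if not 0 < q < 1:
        raise ValueError("q must be strictly between 0 and 1")
    if not 0 < lam <= 700:
        # exp(-lam) underflows for very large lam, so this sketch refuses it.
        raise ValueError("lam must be in (0, 700] for this simple sketch")
    k = 0
    pmf = math.exp(-lam)  # P(X = 0)
    cdf = pmf
    while cdf < q:
        k += 1
        pmf *= lam / k  # P(X = k) from P(X = k - 1)
        cdf += pmf
    return k


class PoissonOutlierModel:
    """Fits a Poisson rate to training counts and returns boundary records."""

    def __init__(self, alpha: float = 0.05) -> None:
        self.alpha = alpha
        self.lower: int | None = None
        self.upper: int | None = None

    def fit(self, training: Sequence[float]) -> None:
        lam = fmean(training)  # the Poisson rate estimate is the sample mean
        # Quantiles are monotone in q, so lower <= upper whenever alpha <= 1.
        self.lower = poisson_quantile(lam, self.alpha / 2)
        self.upper = poisson_quantile(lam, 1 - self.alpha / 2)

    def apply(self, values: Sequence[float]) -> list[dict]:
        if self.lower is None or self.upper is None:
            raise RuntimeError("fit() must be called before apply()")
        return [
            {
                "value": v,
                "lower": self.lower,
                "upper": self.upper,
                "is_outlier": v < self.lower or v > self.upper,
            }
            for v in values
        ]


# Hypothetical event counts per period.
model = PoissonOutlierModel(alpha=0.05)
model.fit([3, 4, 5, 4, 4])
results = model.apply([0, 4, 9, 8])
```

With a fixed record shape like this, the caller only needs the boundaries and the outlier flag for each period, whatever model produced them.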

Describe alternatives you've considered

guilhemmarchand commented 5 months ago

Hi @chrishylanduk

Thank you very much for raising these requirements, ideas and feature requests. This is extremely valuable, and I have taken them into account carefully.

In TrackMe 2.0.89, the Outliers framework has therefore been extended with the following new options (all options must be set at the system level but can also be modified at the model level):

System level:

[Screenshot: CleanShot 2024-04-21 at 19 56 46@2x]

Models update level:

[Screenshot: CleanShot 2024-04-21 at 20 00 59@2x]

Adding new model:

[Screenshot: CleanShot 2024-04-21 at 20 01 54@2x]

As far as I am concerned, these new capabilities should reflect your thinking and address your requirements properly ;-)

If you have any further comments, please do not hesitate. If they do not make it into the next release, they will be taken into account for a later release.

chrishylanduk commented 5 months ago

This looks brilliant, thanks a lot @guilhemmarchand - looking forward to trying it out!

guilhemmarchand commented 5 months ago

Hi @chrishylanduk

TrackMe 2.0.89 is live on Splunkbase - if you have any new requirements, or if there is anything I've missed, please let me know and we'll address these in the next release ;-)

Guilhem