ADS is the Oracle Data Science Cloud Service's Python SDK, supporting model ops (train/eval/deploy) along with running workloads on Jobs and Pipeline resources.
Made the following changes to optimise report loading for the anomaly detector:
If there are more than 1000 non-anomalous data points, they are downsampled to 1000. I chose 1000 as the threshold because a plot with more than 1000 data points is visually too crowded.
All anomalous data points are included.
The report also showed the whole dataset, which led to a very large file size. It now shows at most 1000 data points there as well.
The optimisation can be turned off by passing optimize_report = False in the spec.
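For reviewers, here is a minimal sketch of the downsampling rule described above. It is not the code in this PR: the function name `downsample_for_report`, its parameters, and the use of uniform random sampling are assumptions made for illustration. Only the 1000-point threshold, the rule that every anomalous point is kept, and the `optimize_report` switch come from the description.

```python
import random

MAX_NORMAL_POINTS = 1000  # threshold stated in the description


def downsample_for_report(points, is_anomaly, optimize_report=True,
                          max_normal=MAX_NORMAL_POINTS, seed=0):
    """Return the points to plot (hypothetical helper, for illustration only).

    All anomalous points are kept. Non-anomalous points are capped at
    `max_normal` by uniform random sampling, which is an assumption here;
    the PR may use a different sampling strategy.
    """
    if len(points) != len(is_anomaly):
        raise ValueError("points and is_anomaly must have the same length")

    anomalous = [i for i, flag in enumerate(is_anomaly) if flag]
    normal = [i for i, flag in enumerate(is_anomaly) if not flag]

    # Downsample only when the optimisation is on and the cap is exceeded.
    if optimize_report and len(normal) > max_normal:
        normal = random.Random(seed).sample(normal, max_normal)

    # Restore the original order so the plot still reads left to right.
    keep = sorted(anomalous + normal)
    return [points[i] for i in keep]


# Synthetic data with the same length as the NAB file used in the results.
values = list(range(18051))
flags = [v % 500 == 0 for v in values]  # made-up anomaly labels
kept = downsample_for_report(values, flags)
unoptimised = downsample_for_report(values, flags, optimize_report=False)
```

With `optimize_report=False` the helper returns every point unchanged, which mirrors the opt-out described above.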
Results: I used cpu_utilization_asg_misconfiguration.csv from NAB, which has 18051 data points.
Report without optimisation: Loading time - 15 seconds Size - 6.2MB
Report with optimisation: Loading time - 2/3 seconds Size - 411KB