Open caltonji opened 3 years ago
Thanks for your feedback, and thanks for using our tool. We are working on a multivariate version that considers combinations of metrics to better identify the root cause of an issue. We will add it as soon as the extension is ready.
Thanks, Ashkan
I am using the feature ranking within MCT for automated Root Cause Analysis of incidents in our REST API (e.g. an increase in InternalServerError responses from our service). Our dataset is our IncomingRequests (uri, datacenter, responsecode, latency, requestId, etc.), merged with OutgoingRequests (target, responsecode, latency, requestId, etc.). If, for example, we return InternalServerError because we received a 429 on our first call to DocDB, then our metric column, ResponseCode, will equal InternalServerError and a feature column, DocDB_GetThead_ResponseCode, will equal 429. We expect our automated Root Cause Analysis tool to tell us that the reason for the InternalServerError increase is DocDB_GetThead_ResponseCode == 429.
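To make the dataset shape concrete, here is a minimal sketch of the merge described above, in plain Python. The function name `merge_requests`, the sample rows, and the `<target>_ResponseCode` naming rule are assumptions for illustration; our real pipeline and MCT's input format may differ.

```python
from collections import defaultdict


def merge_requests(incoming, outgoing):
    """Left-join outgoing calls onto incoming requests by requestId.

    Each outgoing call becomes a feature column named
    "<target>_ResponseCode" on the matching incoming row (assumed naming).
    """
    calls_by_request = defaultdict(list)
    for call in outgoing:
        calls_by_request[call["requestId"]].append(call)

    merged = []
    for request in incoming:
        row = dict(request)  # copy so the input rows are not mutated
        for call in calls_by_request.get(request["requestId"], []):
            row[f"{call['target']}_ResponseCode"] = call["responsecode"]
        merged.append(row)
    return merged


# Hypothetical sample data: r1 fails on its first DocDB call, r2 succeeds.
incoming = [
    {"requestId": "r1", "uri": "/threads/1", "ResponseCode": "InternalServerError"},
    {"requestId": "r2", "uri": "/threads/2", "ResponseCode": "OK"},
]
outgoing = [
    {"requestId": "r1", "target": "DocDB_GetThead", "responsecode": 429},
    {"requestId": "r2", "target": "DocDB_GetThead", "responsecode": 200},
    {"requestId": "r2", "target": "DocDB_UpdateThead", "responsecode": 200},
]

rows = merge_requests(incoming, outgoing)
```

In this sketch the failed request `r1` has no DocDB_UpdateThead_ResponseCode column at all, because the second call never happened.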
Ours is a situation of multicollinearity. If the first call to DocDB fails, none of the subsequent calls happen, so for every failure another column, say DocDB_UpdateThead_ResponseCode, will be empty. We would like auto clustering so that, instead of producing some 200 "Features Explaining Metric Difference" with the actual root cause buried beneath the noise, the tool produces a handful of combinations of features that are correlated with our metric.
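A rough sketch of the kind of grouping we have in mind, not a description of how MCT works: treat every (feature, value) pair as a condition, put conditions that hold on exactly the same rows into one group, and rank the groups by how well they line up with the bad metric value. The function name, the precision-times-recall score, and the sample rows are all assumptions; a real version would need approximate rather than exact co-occurrence.

```python
from collections import defaultdict


def group_collinear_conditions(rows, metric_column, bad_value):
    """Group (feature, value) conditions that hold on exactly the same rows,
    then rank the groups by precision * recall against the bad metric value."""
    features = sorted({key for row in rows for key in row} - {metric_column})

    # Row indices on which each condition holds; None stands for "empty".
    support = defaultdict(set)
    for index, row in enumerate(rows):
        for feature in features:
            support[(feature, row.get(feature))].add(index)

    # Conditions with identical support are perfectly collinear: one group.
    groups = defaultdict(list)
    for condition, indices in support.items():
        groups[frozenset(indices)].append(condition)

    bad_rows = {i for i, row in enumerate(rows) if row[metric_column] == bad_value}
    ranked = []
    for indices, conditions in groups.items():
        hits = len(indices & bad_rows)
        precision = hits / len(indices)
        recall = hits / len(bad_rows) if bad_rows else 0.0
        ranked.append((precision * recall, sorted(conditions, key=lambda c: c[0])))
    ranked.sort(key=lambda item: item[0], reverse=True)
    return ranked


# Hypothetical sample: two failures caused by a 429 on the first DocDB call.
rows = [
    {"ResponseCode": "InternalServerError", "DocDB_GetThead_ResponseCode": 429,
     "DocDB_UpdateThead_ResponseCode": None},
    {"ResponseCode": "InternalServerError", "DocDB_GetThead_ResponseCode": 429,
     "DocDB_UpdateThead_ResponseCode": None},
    {"ResponseCode": "OK", "DocDB_GetThead_ResponseCode": 200,
     "DocDB_UpdateThead_ResponseCode": 200},
    {"ResponseCode": "OK", "DocDB_GetThead_ResponseCode": 200,
     "DocDB_UpdateThead_ResponseCode": 200},
]

ranked = group_collinear_conditions(rows, "ResponseCode", "InternalServerError")
```

With this sample, the top-ranked entry is a single group containing both DocDB_GetThead_ResponseCode == 429 and the empty DocDB_UpdateThead_ResponseCode, rather than two separate features competing for the top spots.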
Thank you for your awesome work with this tool!