clinthuffman / PAL

Performance Analysis of Logs (PAL) tool
MIT License

Standard Deviations and Outliers Removed #52

Open dgunkjr opened 5 years ago

dgunkjr commented 5 years ago

I am evaluating the Physical Disk Avg. Disk sec / Write counter. Performance Monitor says the average is .065 but PAL says .028 under overall counter instance statistics. Min / Max are the same between the two tools.

I've never considered precisely what the Standard Deviations and Outliers Removed (10, 20, 30% etc.) mean. What do they mean and how do I "reconcile" them when interpreting the results?

Thanks!

clinthuffman commented 5 years ago

Hello,

After a lot of research years ago, I concluded that “Standard Deviation” is, roughly speaking, the average distance from the average. (Strictly, it is the square root of the average squared distance from the average, so values far from the average count more heavily, but the intuition is the same.) If all of the values in an array are 5 (i.e. 5,5,5,5,5,5,5,5), then the Standard Deviation is 0 because every value is 0 away from the average. In the case of 0,10,0,10,0,10,0,10, the average is 5, but every value is 5 away from it, so the Standard Deviation is 5. Basically, the lower the Standard Deviation, the more stable/“trustworthy” the values are. The higher the Standard Deviation, the more erratic/chaotic the values.
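The two arrays above can be checked with a short Python sketch. This is only an illustration using the standard `statistics` module, not PAL's own code. The helper `mean_abs_deviation` and the `skewed` sample are hypothetical additions: they compute the literal "average distance from the average" so it can be compared with the population standard deviation. The two measures agree on the arrays above but differ in general.

```python
import statistics


def mean_abs_deviation(values):
    """Literal 'average distance from the average' (hypothetical helper)."""
    avg = statistics.fmean(values)
    return statistics.fmean(abs(v - avg) for v in values)


steady = [5, 5, 5, 5, 5, 5, 5, 5]
erratic = [0, 10, 0, 10, 0, 10, 0, 10]
skewed = [0, 0, 0, 20]  # made-up sample where the two measures disagree

# Population standard deviation: sqrt of the mean squared distance from the mean.
steady_sd = statistics.pstdev(steady)
erratic_sd = statistics.pstdev(erratic)
skewed_sd = statistics.pstdev(skewed)

# Mean absolute deviation for the same data.
erratic_mad = mean_abs_deviation(erratic)
skewed_mad = mean_abs_deviation(skewed)

print(steady_sd, erratic_sd, erratic_mad)
print(skewed_sd, skewed_mad)
```

For `steady` the deviation is zero, for `erratic` both measures come out at 5, and for `skewed` the standard deviation is larger than the mean absolute deviation because the single value of 20 is weighted more heavily.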

Regarding the outliers, I couldn’t figure out the proper mathematical term, so I came up with my own. In my case, “10% outliers removed” means that the 10% of values furthest from the average are removed. Sometimes counter values can be extremely wrong, say, 5,5,5,5,999999999999,5,5,5,5. The 999999999999 is extreme compared to the other values and is probably a corrupted value. 10% outliers removed drops extreme values like 999999999999 and then recalculates the average from the values that remain.
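A minimal Python sketch of that idea follows. It is an assumption about how such a calculation could work, not PAL's actual implementation: the function name `average_with_outliers_removed` is hypothetical, and rounding the number of dropped values down is a guess. The sample uses ten values (one more 5 than in the example above) so that 10% is exactly one value.

```python
def average_with_outliers_removed(values, percent):
    """Drop the `percent`% of values furthest from the mean, then re-average.

    Hypothetical helper: the count of dropped values is rounded down,
    which is an assumption rather than documented PAL behaviour.
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= percent < 100:
        raise ValueError("percent must be in the range [0, 100)")

    avg = sum(values) / len(values)
    drop = len(values) * percent // 100  # how many values to discard

    # Sort by distance from the original average, nearest first,
    # then keep everything except the `drop` furthest values.
    by_distance = sorted(values, key=lambda v: abs(v - avg))
    kept = by_distance[:len(by_distance) - drop]
    return sum(kept) / len(kept)


samples = [5, 5, 5, 5, 999999999999, 5, 5, 5, 5, 5]

plain_average = average_with_outliers_removed(samples, 0)
trimmed_average = average_with_outliers_removed(samples, 10)

print(plain_average)
print(trimmed_average)
```

With the one extreme value discarded, the recalculated average falls back to 5, while the plain average is dominated by the corrupted sample.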

In any case, .028 is a high value. Because the counter is itself an average, this figure is an average of averages, which means the disk was in this state for some time, and anything above 0.025 (25 ms) greatly affects the responsiveness of the system.

Thank you,


From: dgunkjr notifications@github.com Sent: Monday, September 23, 2019 9:42:25 AM To: clinthuffman/PAL PAL@noreply.github.com Cc: Subscribed subscribed@noreply.github.com Subject: [clinthuffman/PAL] Standard Deviations and Outliers Removed (#52)


dgunkjr commented 5 years ago

Thanks for the explanation.