niklas-heer / speed-comparison

A repo which compares the speed of different programming languages.
https://niklas-heer.github.io/speed-comparison
MIT License
508 stars, 79 forks

The accuracy metric is flawed #78

Closed giordano closed 2 years ago

giordano commented 2 years ago

If I'm reading https://github.com/niklas-heer/speed-comparison/blob/8cdc564ad5e346e5ef830cf633366c4d29fbee9e/scmeta/src/scmeta.cr#L107-L112 correctly, the accuracy is computed not as the numerical distance from a reference value, but as a string distance between the textual representations. This is flawed because it ignores both where in the string the difference occurs and how large it is. For example, according to this formula computed_pi = "87.415926535897" would have an accuracy of 0.8, the same as computed_pi = "3.1415926535123", which of course doesn't make much sense. Oh, 0.8 is also the accuracy of computed_pi = "abc415926535897", which isn't even a number.

Conversely, computed_pi = "3.14" or even computed_pi = "3" would have an accuracy of 1, because the comparison is limited to the number of characters in the input string, even though these are pretty coarse approximations of $\pi$.

A better metric would be based on something like the relative error $|1 - x / \pi|$, where $x$ is the computed numerical value of $\pi$, or some variation of this formula.
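For illustration, here is a minimal Python sketch of the two approaches. The `string_accuracy` function is an assumption: a position-by-position character comparison that is consistent with the values quoted above (0.8 and 1), not a transcription of the code in scmeta.cr. The names `REFERENCE_PI`, `string_accuracy`, and `relative_error` are hypothetical.

```python
import math

# Reference string, truncated to the digits used in the examples above.
REFERENCE_PI = "3.1415926535897"


def string_accuracy(computed: str, reference: str = REFERENCE_PI) -> float:
    """Fraction of characters that match position by position.

    Assumption: only as many characters as `computed` contains are compared,
    which is why very short inputs can still score 1.0.
    """
    n = min(len(computed), len(reference))
    if n == 0:
        return 0.0
    # zip() stops at the shorter string, so exactly n positions are compared.
    matches = sum(1 for a, b in zip(computed, reference) if a == b)
    return matches / n


def relative_error(x: float) -> float:
    """Relative error |1 - x / pi| of a numerical estimate of pi."""
    return abs(1 - x / math.pi)
```

Under these assumptions, the three 15-character strings above all score 0.8 and "3.14" scores 1.0. The relative error separates them clearly: it is about 27 for 87.415926535897 but below 1e-10 for 3.1415926535123.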

francescoalemanno commented 2 years ago

I propose this metric based on the calculation of @giordano:

accuracy(x) = -log10(abs(1-x/pi))

This is more in line with the idea that Niklas wanted to implement, but it is mathematically sound. In fact, this metric estimates the number of correct digits, without the pitfalls of the "string metric". Examples:

accuracy(3.14) = 3.29
accuracy(3.145) = 2.96
accuracy(3.1415) = 4.53
accuracy(3.1415926535897) = 13.5
accuracy(pi) = Inf

Adversarial Example:

accuracy(2.1415926535897) = 0.49
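
The proposed metric can be checked numerically with a short Python sketch. The function name `accuracy` follows the formula above; the explicit zero check is an addition, needed because Python's `math.log10(0)` raises `ValueError` instead of returning infinity.

```python
import math


def accuracy(x: float) -> float:
    """Approximate number of correct digits: -log10(|1 - x / pi|)."""
    rel_err = abs(1 - x / math.pi)
    if rel_err == 0.0:
        # An exact match has no error, so the score is unbounded.
        return math.inf
    return -math.log10(rel_err)


if __name__ == "__main__":
    for value in (3.14, 3.145, 3.1415, 3.1415926535897, math.pi, 2.1415926535897):
        print(value, accuracy(value))
```

The results agree with the values listed above when those are read as truncated rather than rounded (for example, 3.14 gives roughly 3.295). Inputs whose relative error exceeds 1, such as 87.415926535897, get a negative score.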

niklas-heer commented 2 years ago

@giordano yes, you read that correctly. Thanks for pointing that out. It is true that it doesn't consider where in the string the difference is.

@francescoalemanno that would be an interesting metric, so basically the higher the number, the closer it is. 🤔 I don't know, I'd prefer to express it as a percentage.

That would be my current solution: https://github.com/niklas-heer/speed-comparison/blob/a28eb60ce52caa157f32f884eef759df7457480d/scmeta/src/scmeta.cr#L103-L114

Which would express it like this: [screenshot]

niklas-heer commented 2 years ago

I've now switched to what @francescoalemanno suggested:

accuracy(x) = -log10(abs(1-x/pi))

niklas-heer commented 2 years ago

Should be fixed by #79