This PR fixes the a bug in the tests. In the previous PR I switched the top-k to not sort the results (since they are sorted later, and that particular operation was one of the most expensive operations during inference). But since our tests are poorly configured (where there are many ties for best result, more than the options we return), I had to overwrite the expected results. Since this is becoming common, I rewrote the overwrite functionality to just change the results which are different (and for floating point numbers this means not changing them if they are within the error tolerance). This way, we can easily see exactly what changed about the tests when we use --overwrite.
This PR fixes the a bug in the tests. In the previous PR I switched the top-k to not sort the results (since they are sorted later, and that particular operation was one of the most expensive operations during inference). But since our tests are poorly configured (where there are many ties for best result, more than the options we return), I had to overwrite the expected results. Since this is becoming common, I rewrote the overwrite functionality to just change the results which are different (and for floating point numbers this means not changing them if they are within the error tolerance). This way, we can easily see exactly what changed about the tests when we use
--overwrite
.