official-stockfish / fishtest

The Stockfish testing framework
https://tests.stockfishchess.org/tests

Fix red color `pre` before approving #2089

Closed peregrineshahin closed 2 days ago

peregrineshahin commented 3 days ago

Currently, for fixed-games tests, the test_view `results_pre` is shown in red even when the test has not been approved yet. Fix that, since the state isn't really "rejected".

e.g. https://dfts-0.pigazzini.it/tests/view/66818436b72c7bf0c0059a07

vdbergh commented 2 days ago

This is due to me. I changed the coloring algorithm for fixed-length tests to be in line with SPRT tests: red = rejected (i.e. the opposite of passed). However, this means that no colors should be shown while a fixed-games test is running, which is not what people expect.

In retrospect, fixed-length tests have a different function from SPRT tests: their primary use case is measuring Elo. So it is not surprising that the coloring algorithm is different.

I will revert.

PS. In the future Fishtest will no longer store the results widget in the db. This means that it will be possible to change the coloring algorithm retroactively.

peregrineshahin commented 2 days ago

> I will revert.

Okay. There is also the case of yellow appearing, which is novel, I think. I will close this in the meantime.

vdbergh commented 2 days ago

I would suggest leaving this issue open until the revert has actually happened.

peregrineshahin commented 2 days ago

Well, this is a PR, not an issue.

vdbergh commented 2 days ago

Ah, sorry, I had not noticed this.

dubslow commented 2 days ago

> However this means that no colors should be shown while the fixed games test is running, which is not what people expect.

This is exactly what people expect, and how it's always been. Or, more precisely: if Elo is within the error bar of 0, then no color; if the error bar is above and excludes 0, green; and otherwise, red. But the default has always been no color.
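The rule described above could be sketched roughly as follows. This is only an illustration, not Fishtest's actual code: the function name `fixed_games_color`, its parameters, and the returned color strings are hypothetical, and treating an interval that merely touches 0 as "no color" is an assumption.

```python
from typing import Optional


def fixed_games_color(elo: float, error: float) -> Optional[str]:
    """Pick a display color for a fixed-games result (hypothetical helper).

    elo   -- estimated Elo difference
    error -- half-width of the error bar around that estimate
    """
    if error < 0:
        raise ValueError("error bar half-width must be non-negative")

    lower, upper = elo - error, elo + error
    if lower > 0:
        # The whole error bar lies above 0.
        return "green"
    if upper < 0:
        # The whole error bar lies below 0.
        return "red"
    # 0 is inside (or on the edge of) the error bar: no color.
    return None
```

For example, `fixed_games_color(2.5, 1.0)` gives `"green"`, `fixed_games_color(-2.5, 1.0)` gives `"red"`, and `fixed_games_color(0.5, 1.0)` gives `None`.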

vdbergh commented 2 days ago

> > However this means that no colors should be shown while the fixed games test is running, which is not what people expect.
>
> This is exactly what people expect, and how it's always been. Or, more precisely, if elo is within error bar of 0, then no color; if error bar is above and excludes 0, green, and otherwise, red; but the default has always been no color.

You are contradicting yourself. People do expect colors to be shown while the test is running, as you state in the last sentence (note that such colors are statistically meaningless).

But I already said I would revert, so we should not discuss this further.

dubslow commented 2 days ago

Yes, my bad, I committed a sign error while reading the original sentence :facepalm: