Closed LeaVerou closed 5 months ago
I think it is a matter of efficiency and testing what matters. In my library, I can test all relevant tests pretty fast, and Python is not known for its blazing fast speed 🙃 .
➜ coloraide git:(main) ✗ python3 -m pytest tests/test_hsluv/test_snapshot.py
========================================================================================== test session starts ==========================================================================================
platform darwin -- Python 3.11.3, pytest-7.1.3, pluggy-1.0.0
rootdir: /Users/facelessuser/Code/github/coloraide
plugins: xonsh-0.13.4
collected 1 item
tests/test_hsluv/test_snapshot.py . [100%]
=========================================================================================== 1 passed in 1.28s ===========================================================================================
My comment from the test:
The snapshot contains a number of unnecessary tests, at least for our purposes. It is sufficient to verify that
our path from sRGB to HSLuv and HPLuv is correct, both forward and backwards, as our conversion path is noted
below:
srgb -> srgb-linear -> xyz-d65 -> luv -> lchuv -> hsluv
srgb -> srgb-linear -> xyz-d65 -> luv -> lchuv -> hpluv
```
Verifying each step in the chain makes no sense. If we are able to get through this entire chain and the sRGB
to HSLuv checks out, both forward and backwards through the path, then we can consider our implementation
successful. Any failure in the chain would cause errors to propagate up that would cause the test to fail. It
would only be useful to test each point in the chain if we were trying to identify where in the chain breakage
occurred.
You can see what I actually test here: https://github.com/facelessuser/coloraide/blob/main/tests/test_hsluv/test_snapshot.py
I would advise testing only the forward and backward conversion from HSLuv <-> sRGB and HPLuv <-> sRGB. Don't test XYZ as it is tested in the chain already. I wanted to formally pass the HSLuv and HPLuv tests to claim it was compliant, so I did not test all their data points for Luv and LCh, because again, if I can pass the HSLuv and HPLuv tests, I'm already passing LChuv and Luv. You can do a small set of Luv/LCh tests for completeness if you like.
One thing I'm uncertain of is whether part of the problem is that the test suite generates individual tests or each case generating excessive overhead? If so, I would consider maybe having a snapshot test, one test that validates the snapshot, so no additional overhead per case in the JSON. I imagine, even if I were to test every data point in the JSON in my own test suite, I could only see it increasing the snapshot test time by a little over double. So, I suspect some excessive overhead is generated by the test suite per test case in the JSON. But maybe I'm wrong.
It’s great that we have so many tests for these new color spaces (16K tests!) but that means that now 99% of our testsuite is testing these two color spaces, and
npm run test
now takes several minutes (!).
That was the reason I initially had the tests in the hsluv
directory but somebody suggested moving them to the top level test
directory. :smiley:
I propose we pick a few characteristic tests to run with a regular testsuite run, and the rest can be run explicitly (via e.g.
htest test/hpluv/all.js
), when we really need to thoroughly test correctness for these color spaces.
I'm fine with that.
(also if there's anything on the hTest side that could improve that experience, I'm all ears!)
The majority of the time is spent updating the console with the countdown of the number of tests remaining.
More specifically converting result
to a string in this code causes the majority of the slowdown:
And I think this line in the toString()
method could be problematic because ret.children
might have 4096 tests (x4 if you're running the whole test suite)
I would advise testing only the forward and backward conversion from HSLuv <-> sRGB and HPLuv <-> sRGB. Don't test XYZ as it is tested in the chain already. I wanted to formally pass the HSLuv and HPLuv tests to claim it was compliant, so I did not test all their data points for Luv and LCh, because again, if I can pass the HSLuv and HPLuv tests, I'm already passing LChuv and Luv. You can do a small set of Luv/LCh tests for completeness if you like.
The tests only check sRGB to/from HSLuv and HPLuv.
Note that the test times reported by htest
are misleading and don't account for any overhead added by htest
.
Yeah, that overhead was what I was afraid was happening.
I hacked in some throttling to the node runner and the results look pretty good:
I did make some modifications to the hsluv tests to not parse color strings or create Color
objects which speed up the test times reported by htest
by around 3x.
Changes I made to src/env/node.js
:
done: (result, options) => {
let tree;
// Get the tree every 16ms or when there are no pending tests
if (result.lastUpdate === undefined || performance.now() - result.lastUpdate > 16 || result.stats.pending === 0) {
result.lastUpdate = performance.now();
let messages = result.toString({ format: options.format ?? "rich" });
tree = getTree(messages).toString();
tree = format(tree);
logUpdate(tree);
}
...
I also had to comment out this line in the same file because I was getting a "console.log is not a function" error.
//import console from "./console.js";
Changing from 16ms to 100ms made things slightly faster but less smooth.
Keep in mind that my PC is around 10 years old so those are probably close to worst case results
Good catch, throttling on the hTest side would probably be a good thing regardless. But even if perf was not an issue, I'm not sure having 8K tests testing a single color space is a good thing.
Note that the test times reported by htest are misleading and don't account for any overhead added by htest.
That is intentional, since you typically care more about how long your code takes to run rather than the test runner. They just don 't help in this case 😁
I'm not sure having 8K tests testing a single color space is a good thing.
I generally agree. I think it is a bit excessive. I assume it was done for the same reason I did, you are implementing a space, it has a test suite, so you verify against the test suite. If overhead is not a problem, it is completed fairly quickly.
But, I think at this point, we've verified the spaces, we can probably pick some data points at random as a sanity check, and call that sufficient maybe.
The conversion tests already contain a few tests for HSLuv and HPLuv so I'll just move the comprehensive test suite back to the hsluv directory.
It’s great that we have so many tests for these new color spaces (16K tests!) but that means that now 99% of our testsuite is testing these two color spaces, and
npm run test
now takes several minutes (!).I propose we pick a few characteristic tests to run with a regular testsuite run, and the rest can be run explicitly (via e.g.
htest test/hpluv/all.js
), when we really need to thoroughly test correctness for these color spaces.(also if there's anything on the hTest side that could improve that experience, I'm all ears!)