Closed nishihatapalmer closed 1 year ago
Also, have defined some test constants which were hard-coded before to control how much testing is done for quick and full tests. Makes it much easier to adjust the test load as more tests are added.
Have run the new test -all under valgrind. No memory allocation or invalid read/writes in the test code. Plenty in some of the algorithms though.
A common issue is reading one char before the start of the text, particularly for algorithms with q-grams.
Might be worth extending the pattern at start test to have some room before the text buffer where I can put patterns that should not be accessed. This would be more reliable than just copying 1..m-1 bytes, although that does detect some failures. This would then mirror the pattern past end test.
Found another problem copying data over part of the text in the fixed string tests - forgot to update commit message, so that is wrong.
Was copying data past end of text at position m instead of position n.
OK, still a few issues I can see. Setting as draft until I nail them all.
Definitely something odd going on. I get the occasional seg fault running smart test -all -q -rs 1671026029
.
When I get the seg fault, the tests are not repeating randomly the same way they do with a random seed set. Something is getting corrupted it seems. When all is well, setting a random seed with the same parameters guarantees the same tests are being run.
Also interesting that two runs that crashed have almost identical test results to each other. Only one result for ASKIP is slightly different, all others are the same. But they are completely different to the runs without a crash.
So without a crash we get deterministic tests. With a crash we get kind of deterministic tests compared to each other, but not to the without a crash tests.
So I managed to get my system to actually record core dumps (was configured with a max size of 0!), and look at what was going on when it crashed.
The actual seg fault is in the FS-W2 algorithm. The S2 parameter gets hopelessly negative and seg faults trying to access the text at a bad position. So the test code itself is not seg faulting, we just have a buggy algorithm.
That doesn't explain why the test runs (which should be randomized deterministically from the same seed) vary sometimes. These are two different problems it seems.
When it doesn't crash, it's using a different set of tests. When the results vary, it just accidentally managed to trigger a seg fault in the FS-W2 algorithm due to the perfectly valid parameters it was passed.
So we have two issues:
1) Seg fault in FS-W2 algorithm. 2) Randomised tests are not always deterministically repeated with the same seed.
I have seen seg faults in these algorithms before in fact due to their parameters going wildly out and accessing text at bad positions. So I'm fairly convinced that is an algo bug.
Some more data points:
1) I have failed to reproduce the seg fault in the FS-W2 algorithm with the random seed directly. It only manifests when I run -all the tests, so the preceding algorithms run too. This indicates it is getting a different pattern/text/length to what the random seed would indicate.
2) The random seed variable has not been corrupted. It has the correct value in the core dump.
It's actually quite easy to crash FS-W2, the specific random seed doesn't matter.
The test it fails on is the new partial_pattern_at_start_test, which puts a partial copy of a real pattern (minus the first char) at the start of the text, to see if we can trick the algorithm into reading too far back.
No algorithms seg fault now. Well, none have for a while.
Many algorithms still have bugs in them, but at least all the tests complete and they are deterministic.
What do you mean by deterministic? Is there a seed?
Some algorithms read memory that they shouldn't. If the text and pattern buffers are not initialized to zero, they exhibit different behavior even though they have the same random seed, as they are uninitialized values.
Previously, I would only copy the pattern or text data that was required for a particular test, so there could be bits of those buffers at the end with uninitialized values.
OK. I swear that's the last change I'll make, it's fine to merge.