StanfordPL / stoke

STOKE: A stochastic superoptimizer and program synthesizer
http://stoke.stanford.edu
Apache License 2.0
737 stars 75 forks source link

stoke testcase labels all Testcases "Testcase 0" #944

Open travisdowns opened 7 years ago

travisdowns commented 7 years ago

When creating test cases using stoke testcase, all testcases are labeled Testcase 0::

Testcase 0:

SIGNAL 0 [normal exit]

%rax     00 00 00 00 00 00 00 00
%rcx     00 00 00 00 00 00 00 00
...

I assume that they are supposed to be generated with incrementing IDs, like Testcase 0:, Testcase 1:, etc. It doesn't seem to actually affect their use, as I suppose the testcase reader just numbers them internally itself.

stefanheule commented 7 years ago

Yeah, that seems like a bug, but also seems like it's not affecting anything.

travisdowns commented 7 years ago

Yeah it's mostly a readability issue. As I'm tuning my test cases, I'm often looking in the testcase file and would like to include an additional test case, but I have no idea what index I'm actually looking at. OTOH, if the testcases were correctly labeled, it would be an open question whether the consumers should continue internally numbering them based on file position versus using the label. If you use the label, you have a lot of issues like what about duplicate labels, what about if the file skips labels, and you can't concat two .tc files to create a new valid one anymore.

I think a reasonable approach is just to use a hash of the testcase state to generate an informational label like DJFU1345, and then when validation fails, you can output that. Right now, if you have 10s or 100s or more testcases, it's sometimes a pain to figure out where they came from. Or just drop the label entirely from the .tc format?

stefanheule commented 7 years ago

Yeah, I agree, all of these suggestions seem better than what we have right now. Having a label in the testcase file actually seems like a bad idea, because it's easy for that label to get out of sync (whether it's a number, or a hash, or whatever else), and you have to be careful about duplicates, etc. So maybe it would be best to get rid of the label, and maybe report errors as "test case #n (where n is the n-th test case in the file, without being labeled as such) on line X". This means no labels to get out of sync, and errors are still reasonably easy to understand thanks to the line number.

travisdowns commented 7 years ago

Well the hashes can't really get out of sync, since they are deterministically created from the testcase state, and they don't relate to position in the file. They are correct as-is. With enough characters in the label, the chance of a collision approaches zero.

Now you might have trivial collisions when two test cases are actually identical, but it seems fine and correct to give them the same label in that case. Just so you can actually find that case in the file. Reporting the nth case in the file doesn't really solve the issue since who knows where the nth case in the file is?

You mentioned line number though - that would be awesome. Then you could jump directly to the testcase...

stefanheule commented 7 years ago

The hash gets out-of-sync as soon as you start changing test-cases manually, with no way for the user to fix the hash manually in any easy way.

travisdowns commented 7 years ago

Yeah, correct. I was always too lazy to do that, but it's a good use-case since as I see it the idea of stoke is to keep the tools decoupled - i.e., you don't need to use stoke extract to generate the test cases, I guess.

I like the idea of line numbers...

stefanheule commented 7 years ago

Correct, and even if you originally do, you may want to change those test-cases later (I certainly have changed tests by hand many times).