immunant / IA2-Phase2

Improve Testing Infrastructure #154

Open rinon opened 2 years ago

rinon commented 2 years ago

Our test infra for this project is pieced together and requires a significant amount of duplication to add new tests. We should probably automate the bulk of this to avoid error-prone duplication between tests.

rinon commented 2 years ago

We discussed moving tests around to condense header-rewriter tests and functional tests. Main goal is to make adding new tests easier, rather than having to replicate an entire existing directory.

fw-immunant commented 1 year ago

Cross-referencing a Slack comment that summarized the discussion the last time this came up:

Frances and I were talking about how to improve our testing framework last week and here are some notes from the conversation. They're a bit unorganized, but listed roughly in order of importance. The main issue is that writing tests is cumbersome since no one has taken the time to improve how we write tests, so we usually end up copying old tests and changing them.

  • we don't have a good way to test small variations of a program's runtime behavior and currently use ad-hoc arguments to change each test program's behavior
    • we should factor argument parsing out of our tests
    • we should write macros to let us reuse one set of source files for various related tests
    • should each test variation be a separate binary? it may be easier to build just one binary like we do now
  • we should treat variations of each program as separate tests
    • the LIT commands in each file's header show up as one test, so if one variation of a program fails you have to run it manually to see which one failed
    • if LIT always treats each file as one test, we could either generate files with the LIT commands at build time or use something other than LIT for testing the runtime
  • we should consider switching from diff to FileCheck for comparing stdout from tests
    • this would let us write tests that print addresses
    • pointer comparisons should still be done by the test program at runtime, but this would provide quicker ways to sanity-check a program
  • we don't distinguish between rewriter tests and runtime tests
    • the older tests are mainly for the rewriter and have simple runtime behavior
    • the newer tests have more complex runtime behavior, but we're sometimes still adding the LIT annotations for rewriter checks even though they don't add to test coverage
    • could consider separating runtime/rewriter tests into subdirectories or at least stop adding LIT annotations that don't increase coverage
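
As a rough illustration of the "factor argument parsing out of our tests" and "write macros" points above, here is a minimal sketch of one way a single source file could list its variations once and share the argument handling. Every name in it (`VARIANTS`, `run_variant`, the two example variations) is hypothetical and not part of the existing test code.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical sketch: none of these names exist in the current tests. */
typedef int (*variant_fn)(void);

struct variant {
    const char *name;
    variant_fn run;
};

/* X-macro: each variation of this test program is listed exactly once. */
#define VARIANTS(X) \
    X(read_shared)  \
    X(write_shared)

/* Declare one function per variation. */
#define DECLARE(name) static int variant_##name(void);
VARIANTS(DECLARE)
#undef DECLARE

/* Build the name-to-function table from the same list. */
#define ENTRY(name) { #name, variant_##name },
static const struct variant variants[] = { VARIANTS(ENTRY) };
#undef ENTRY

static int shared_value = 41;

/* Placeholder bodies; a real test would exercise runtime behavior here. */
static int variant_read_shared(void) {
    return shared_value == 41 ? 0 : 1;
}

static int variant_write_shared(void) {
    shared_value = 42;
    return shared_value == 42 ? 0 : 1;
}

/* Shared argument handling: expects exactly one argument, the variation name.
   Returns the variation's result, or 2 on a usage error. */
int run_variant(int argc, char **argv) {
    size_t count = sizeof(variants) / sizeof(variants[0]);
    if (argc == 2) {
        for (size_t i = 0; i < count; i++) {
            if (strcmp(argv[1], variants[i].name) == 0) {
                return variants[i].run();
            }
        }
        fprintf(stderr, "unknown variant: %s\n", argv[1]);
    }
    fprintf(stderr, "available variants:\n");
    for (size_t i = 0; i < count; i++) {
        fprintf(stderr, "  %s\n", variants[i].name);
    }
    return 2;
}
```

With something like this, each test program's `main` would reduce to `return run_variant(argc, argv);`, and each variation name could become its own test invocation, which also speaks to the "separate tests" point. It keeps the current one-binary-per-program layout rather than building a binary per variation.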
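
For the FileCheck idea, a small sketch of what a test that prints an address might look like. The `RUN` line and the `%test_binary` substitution are assumptions for illustration, as is the function name; only the `CHECK`/`CHECK-NEXT` directives and the `{{...}}` regex syntax are standard FileCheck.

```c
// RUN: %test_binary | FileCheck %s
// (%test_binary is a hypothetical lit substitution for the built program.)
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

static char buffer[16];

// Hypothetical helper; the test program's main (omitted) would call it.
int print_buffer_report(void) {
    // The address differs between runs, so the CHECK line matches it with a
    // regex instead of a literal value, which a plain diff cannot do.
    // CHECK: buffer at 0x{{[0-9a-f]+}}
    printf("buffer at 0x%" PRIxPTR "\n", (uintptr_t)buffer);
    // Stable output can still be matched literally.
    // CHECK-NEXT: buffer size 16
    printf("buffer size %zu\n", sizeof(buffer));
    return 0;
}
```

As noted in the list, this only sanity-checks the shape of the output; any actual pointer comparison would still happen inside the test program at runtime.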