nim-works / nimskull

An in-development, statically typed systems programming language with sustainability at its core. We, the community of users, maintain it.
https://nim-works.github.io/nimskull/index.html

Clean up and reorder tests #41

Open haxscramper opened 2 years ago

haxscramper commented 2 years ago

Moved to documentation https://nim-works.github.io/nimskull/contributing.html#writing-or-improving-tests

Original issue description

Todo list/guide for cleaning up existing tests:

- [ ] Check whether the test name makes sense - `t123123_b.nim` does not; change it to something matching what is being tested
- [ ] Reduce the amount of `echo`-based error testing. If you see a direct `echo` in a test, consider changing it to a `doAssert` check instead
- [ ] If possible, provide an explanation of the test logic
- [ ] Link relevant issues in the test description (`description:` field) or in comments
- [ ] Add test labels [link](https://github.com/haxscramper/nimskull/blob/test-naming-cleanup-1/doc/testament.rst#labels)

--------------

- [ ] Get rid of completely illegible file names like `t10489_a.nim`, `t10489_b.nim` - if you want to help someone get to a concrete issue more easily, you can make a *comment in a file* with a *proper link*. And that's where you can get a proper file name - the original issue title.
- [ ] Structure tests around categories, not around some barely navigable mountain of folders where people won't ever be able to figure out which tool/language/stdlib a given part belongs to. Top-level tool categories should be:
  - [ ] language (https://github.com/nim-works/nimskull/projects/2)
  - [ ] stdlib
  - [ ] compiler - CLI interface, input handling
  - [ ] other tools - nimsuggest, nim doc
- [ ] Provide a `description:` field for the test, more documentation comments, and a general explanation of the features tested.
- [ ] Reduce usage of `echo` in tests - `doAssert` and `==` are more than sufficient in a lot of cases. Not *all* of them, but using them increases the locality of the code - there is no need to look at a header in a 289-line file when it could have been replaced with `return "some string"` that is later checked with `doAssert()` (a sketch of the preferred style follows at the end of this description). [link](https://github.com/nim-works/nimskull/blob/b0933ba30ebb5760190471202bc3529b60bd8264/tests/casestmt/tcasestmt.nim#L46-L64) shows how things should *not* be tested:

```nim
discard """
output: '''
Not found!
Found!
1
compiles for 1
i am always two
default for 3
set is 4
not 5
array is 6
not 7
default for 8
an identifier
OK
OK
OK
ayyydd
'''
"""
```

Tests should be easy to understand for a new contributor, because you change something, the test fails, and you have *no idea* what this thing was even testing - what could be more frustrating than having to figure out not only *why* a test fails but *what* it is testing in the first place?

Note that this is a parallel task to #10 - the language specification most likely will not supersede existing tests, because it is meant to *specify the language*, not to check whether an `of` branch works with a `when` expression, with `const`, with a proc call, with set join, with set intersect, and so on.
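For illustration, here is a minimal sketch of the `doAssert`-based style advocated above. The `description:` text and the `classify` proc are made up for the example, and the exact testament spec fields may differ:

```nim
discard """
  description: '''
    Hypothetical example of the preferred style: the behaviour under test
    is explained here, and results are checked locally with `doAssert`
    instead of being matched against a big `output:` block.
  '''
"""

proc classify(x: int): string =
  ## Example proc standing in for whatever feature a real test exercises.
  case x
  of 1: "one"
  of 2, 3: "two or three"
  else: "other"

doAssert classify(1) == "one"
doAssert classify(3) == "two or three"
doAssert classify(9) == "other"
```

Each assertion sits right next to the value it checks, so a failure points directly at the behaviour that broke instead of at a diff against a large expected-output header.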
saem commented 2 years ago

Reasoning for why this is important:

There are many ideas of what we could change or where to go next, but those ideas are often poorly thought out. The sheer volume of what we don't know about the existing system, good or bad, is so large that we are working in the dark. This is a blind spot induced by the current state of the code base.

What we do know, with great certainty, is that we need tests that not only test things but also:

Going through the exercise of improving tests will address that blind spot and teach all of us where to go next and which ideas are more likely to get us there.

haxscramper commented 2 years ago

Another example of tests that we do not want to see - https://github.com/nim-lang/Nim/blob/788cf426123e00a7879dbd9fc765cefb08bd3ddc/tests/overload/tprefer_tygenericinst.nim#L25-L46

```nim
# bug #2219
template testPred(a: untyped) =
  block:
    type A = object of RootObj
    type B = object of A
    type SomeA = A|A # A hack to make "A" a typeclass.

    when a >= 3:
      proc p[X: A](x: X) =
        echo "This has the highest precedence."
    when a == 2:
      proc p[X: SomeA](x: X) =
        echo "This has the second-highest precedence."
    when a >= 1:
      proc p[X](x: X) =
        echo "This has the lowest precedence."

    p(B())

testPred(3)
testPred(2)
testPred(1)
```

It uses templates, generics, `when`, typeclasses, compile-time evaluation, and `echo`. The `when` code is also irregular: it does `>=`, `==`, `>=`, so it took a while to understand how it was going to be expanded. Ideally, this should be rewritten into:

```nim
type
  A = object of RootObj
  B = object of A
  SomeA = A | A

block:
  proc p[X: A](x: X): string = "highest"
  proc p[X](x: X): string = "lowest"

  doAssert p(B()) == "highest"

block:
  proc p[X: SomeA](x: X): string = "second-lowest"
  proc p[X](x: X): string = "lowest"

  doAssert p(B()) == "second-lowest"

block:
  proc p[X](x: X): string = "lowest"

  doAssert p(B()) == "lowest"
```

No templates, no `echo`, no `when`, etc. It is a question whether this piece of code tests exactly the same thing that was tested originally, but that is hard to answer because the previous test is all over the place.

saem commented 2 years ago

There is a chance they were attempting to ensure that compile-time comparisons work, but that is better covered by testing the VM more directly, imo.
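For illustration, if the intent really was to check that compile-time evaluation agrees with run-time behaviour, a `static:` block can exercise the VM directly without dragging templates and overload precedence into it. A hypothetical sketch, with made-up names:

```nim
# Hypothetical sketch: check the same behaviour at compile time (in the VM)
# and at run time, without involving templates or overload precedence.
proc pick(x: int): string =
  if x >= 3: "high" else: "low"

static:
  # Evaluated by the VM at compile time.
  doAssert pick(3) == "high"
  doAssert pick(1) == "low"

# Evaluated at run time; a divergence between the two would be caught here.
doAssert pick(3) == "high"
doAssert pick(1) == "low"
```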

haxscramper commented 2 years ago

The file is placed in `tests/overloads/...`, not `tests/language_parts_integration/...`. Either way, it is an example of what we need to remove, split into meaningful parts, or call an integration test rather than an "overloading" test.

Testing the VM separately is a good idea.

Instead of just "exactly" I should've said "exactly the same set of properties of the overload resolution"

saem commented 2 years ago

> The file is placed in `tests/overloads/...`, not `tests/language_parts_integration/...`. Either way, it is an example of what we need to remove, split into meaningful parts, or call an integration test rather than an "overloading" test.

Yeah, it's messy. I totally missed the directory structure; I'm just used to most tests not using it meaningfully. Can't wait for that habit to break.

> Testing the VM separately is a good idea.

I know someone who is interested in pursuing this; I'll see if I can get them excited, started, and unblocked.

> Instead of just "exactly" I should've said "exactly the same set of properties of the overload resolution"

From context and rereading I see what you meant, good point.

haxscramper commented 2 years ago