Dunbaratu opened this issue 10 years ago
We NEED regression testing. I have been looking at moving all of the code that doesn't directly depend on KSP into another assembly so we could add a unit testing framework.
While I like your idea of a separately compilable subset of the code that doesn't depend on KSP for the sake of testing the language, I imagine that would take a lot of work and redesign to set up. What I was thinking of is doable now, even with the KSP API calls intertwined with the rest of the code: just put an SCS module on the launchpad in-game and use it to run the test script and spit out a LOG in the archive.
Once you have the output as a LOG in the archive, then external programs can perform the diff checking against previous results and prepare a report on what changed.
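For example (the file name and test cases here are just placeholders, and the exact syntax may need tweaking for whatever kOS version is current), a test script run that way could log one line per check to a file in the archive:

    // hypothetical test script, run from the archive volume.
    // Each check logs one line so an external diff tool can compare runs.
    SET x TO 2 + 3.
    IF x = 5 {
        LOG "arithmetic: pass" TO kostest_out.
    } ELSE {
        LOG "arithmetic: FAIL, got " + x TO kostest_out.
    }

    SET s TO "ab" + "cd".
    IF s = "abcd" {
        LOG "string concat: pass" TO kostest_out.
    } ELSE {
        LOG "string concat: FAIL, got " + s TO kostest_out.
    }

After a run you'd pick up kostest_out from the archive folder on disk and diff it against the previous run's file.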
Sorry, I didn't mean to say that I didn't like your idea in the OP. In fact I think we should do both of these in the long run, and yours can be done now. I always vote YES! for regression testing :)
I started looking into this yesterday for a short while and I think this is the design I'm going to go with. Let me know what you think:
In the KSP-KOS/KOS project on git, I add a directory called kos_test. It contains this:
KOS
 |
 +-- kos_test
      |
      +-- script
      |    |
      |    +-- kostest_syn_all.txt
      |    +-- kostest_syn_ifelse.txt
      |    +-- kostest_syn_progcalls.txt
      |    +-- kostest_syn_onwhen.txt
      |    (etc for all test scripts I come up with)
      |
      +-- out
           |
           +-- kostest_syn_out.txt
To run a test: copy the scripts from the kos_test/script directory of your local git branch to your KOS archive, then run kostest_syn_all. It in turn will run all the other kostest_syn* programs, and the output ends up in kos_test/out/kostest_syn_out.txt.
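As a hypothetical sketch (none of this is written yet), kostest_syn_all itself would be little more than a driver that runs each sub-test, with every test logging its lines to the same output file:

    // kostest_syn_all: top-level test driver (hypothetical).
    // Run it from the archive so LOG writes a real disk file.
    // (Delete any previous kostest_syn_out first so the log starts fresh.)
    LOG "=== kOS regression test run ===" TO kostest_syn_out.
    RUN kostest_syn_ifelse.
    RUN kostest_syn_progcalls.
    RUN kostest_syn_onwhen.
    LOG "=== end of test run ===" TO kostest_syn_out.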
I was wondering how to diff the results versus the previous results and how to track a history of previous results when I had a head-slap moment and realized that everything needed for that is already part of the git tools, if I just make the test results into a git-tracked file.
The code was designed to be very modular, and as such you can execute a lot of parts from outside of KSP. As a matter of fact, I wrote a small program that emulates the console to debug the compiler and test the grammar. Given that, it wouldn't be too difficult to write test cases using a framework like NUnit to try to compile every instruction and compare the result with the expected output. That alone would prevent most issues resulting from modifications to the compiler. Since you can also execute the CPU from outside of KSP, it's possible to test the execution of more complex structures like loops, lists, waits, etc. And we could also set up a Continuous Integration system (like TravisCI) to run the test suite on every commit.
Of course a lot of other things have to be tested from inside of KSP, and for that we can have a couple of scripts, but I don't really like the idea of committing the results; it would create too much bloat.
I don't see why it's a bad idea to commit the results. I think it makes sense to use git's own tools to store "these are the results that are expected with this particular version of this particular branch."
I have two issues with committing results this way.
This is why I split the DOC project into its own thing, for better or worse.
I do agree that it would put things in the git log that are not actually code. But that very same feature you're seeing as a bad thing is what I see as a good thing. "This edit to the documentation refers to a new thing that didn't exist yet until release 0.13, that's why this particular edit to this file in the documentation is part of release 0.13's files." It also helps enforce the mentality that end-user docs and test data are part of the programming project, not side projects that can be ignored.
Whether there are other tools better suited to testing is a separate issue entirely from whether or not the testing data (regardless of what testing tool the data is for) should be in the git archive under revision control or not.
As for the documentation, the only reason it sort of makes sense to keep it separate is because of a limitation of the Middleman system it's written on - it insists upon putting docs in branches it wants to name its way. It's not because docs in the code repository are a bad idea in general - they're just bad in this particular case because of that artificial limitation of the tool we're using.
@marianoapp I have noticed that much of the code could be moved into another project with no dependence on UnityEngine, and that would be pretty cool. How do you think that will work with the CPU when the CPU has a reference to SharedObjects?
The biggest issue with TravisCI or any other CI tool is that we cannot redistribute the UnityEngine or Assembly-CSharp DLLs that we need to build it.
@Dunbaratu I now feel silly for conflating the Docs and the Test Results; I was apparently overtired. I wouldn't mind bringing DOCS back into the main project. It was likely a mistake to move it out. The middleman path/branch issue is not a huge deal to fix :)
A CI tool is not just a good place for testing, it is a better place for testing data. Crufting up the repo with hundreds of files is bad enough, but the real trouble with storing the results is that they would almost always be in a separate commit once we have CI. I would prefer to keep CI out of the business of adding anything other than tags to the repo.
So how do other collaborators on the project see whether or not their change caused an unexpected difference from how things worked before when you won't include the data for the testing tool in the repo? That's a fundamental question about the testing that is not dependent on which tool you use.
Through a website and/or an email. They could also always run the tests locally.
The tests would always be in the repo. The results would not.
What good is automating the testing when the testing tool is unaware of what the intended results are because they're not in the files downloaded from the repo?
I am saying that everything needed to run the tests and determine their fitness is in the repo. The actual pass/fail or scalar result from running the tests would stay on each box, and the CI environment would have a place for you to see, in an automated way, if something failed.
Are we talking about the same thing?
I think you misunderstood what I was asking for then. When I was talking about storing the results in the repo, it was because that was how the primitive method I was describing would record "this is the definition of a 'pass' result - it's a pass if the output matches this." The reason for it being in the repo is that the definition of a 'pass' can change when features change with future releases, so you put the definition of a 'pass' under revision control so it can match the software revision it's part of. As long as the software change being merged in wasn't supposed to change the expected test result, the test result files in question remain unaltered and therefore not part of the pull request. They only get updated when the developer deliberately changed them to match a new expected result, or when a new test case was added to them - just like any other source file works.
I don't see how using a different tool makes it a bad idea to store the "this is the expected result of this test" data so that it's recorded alongside the code changes in the repo. The format of such data will be different with a sophisticated automated tool, but it will still need some sort of data that says "you'd better expect this as the result", and that, whatever form it takes, should be under the repo control so it can be updated and versioned with everything else.
@Dunbaratu do you still have these tests? I was writing a similar test for the new quicksave stuff I'm working on and I remembered this conversation.
Somewhere. I can look for them late tonight.
Isn't it simpler to use a custom TELNET client?
Using it, we'd be able to test thrown exceptions too.
Did you find the scripts?
Is this something that might be worth pursuing as a regression/integration test for the KSP 1 to KSP 2 migration?
Please let me know what you think of this idea, especially @marianoapp and @erendrake
When I implemented the ELSE operator (issue #6), it was my first attempt to alter the syntax definition, and I was nervous about breaking something, so I checked it afterward with the following kOS script, designed to try to cover every possible permutation of IFs and ELSEs that I could think of, including trying to regression-test for things that shouldn't have changed about it:
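(The actual script isn't included here. Purely as an illustration of the kind of permutations meant, and not the original script, a few such cases might look like this, with the exact syntax depending on the kOS version.)

    // Illustrative only - not the original test script.
    SET x TO 2.

    // plain IF with no ELSE
    IF x = 2 { LOG "T01 pass" TO kostest_syn_out. }

    // IF/ELSE where the true branch should run
    IF x = 2 { LOG "T02 pass" TO kostest_syn_out. }
    ELSE { LOG "T02 FAIL" TO kostest_syn_out. }

    // IF/ELSE where the false branch should run
    IF x = 99 { LOG "T03 FAIL" TO kostest_syn_out. }
    ELSE { LOG "T03 pass" TO kostest_syn_out. }

    // chained ELSE IF, middle branch should run
    IF x = 1 { LOG "T04 FAIL (branch 1)" TO kostest_syn_out. }
    ELSE IF x = 2 { LOG "T04 pass" TO kostest_syn_out. }
    ELSE { LOG "T04 FAIL (fell through)" TO kostest_syn_out. }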
It occurs to me that expanding a thing like that into a very complex long script full of WHENs, LOCKs, FORs, UNTILs, Program calls, and so on might be a good idea for trying to exercise a bit of everything in the language, so that after a change you could run the test and compare the results to the results of a previous run, and look at the diff between them to make darn sure that any change you see is a change you meant to see. Furthermore, when a new bit of syntax is added, you could update the regression test script to add checks for your new thing. Then when you diff it, you expect those changes for your new syntax, but are also making sure nothing previously working became broken.
If people think this is a good idea I would be happy to try to work out a starting version of it. The idea would be that it would be added to the github repository and using it to check any language feature update would be a standard practice people should get in the habit of. It would be a thing you could do as part of a check to see whether or not to accept a pull request.
Here's a few design bullet-points about how I picture it working:
- The test script writes its output to a file (via LOG commands). This is so you'd have a file of the output on your computer that you can look at with an external diff tool.
- Run it from the archive volume, so the LOG command ends up giving you a disk file, rather than a bunch of lines in a section of your persistence file.
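Concretely (and hypothetically, since the script names are placeholders), kicking off a run from the archive would look something like:

    SWITCH TO 0.           // make the archive the active volume
    RUN kostest_syn_all.   // LOG output now lands in the archive folder on disk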