Closed martijnbastiaan closed 1 year ago
@Bodigrim @VictorCMiraldo I'm eager to get this PR merged, my aim is to fix all of the gotchas listed in Tasty's README regarding dependencies - these affect the main project I'm working on:
- If Test B depends on Test A, remember that either of them may be filtered out using the
--pattern
option. Collecting the dependency info happens after filtering. Therefore, if Test A is filtered out, Test B will run unconditionally, and if Test B is filtered out, it simply won't run.- Tasty does not currently check whether the pattern in a dependency matches anything at all, so make sure your patterns are correct and do not contain typos. Fortunately, misspecified dependencies usually lead to test failures and so can be detected that way.
- Dependencies shouldn't form a cycle, otherwise Tasty with fail with the message "Test dependencies have cycles." A common cause of this is a test matching its own dependency pattern.
Using dependencies may introduce quadratic complexity. Specifically, resolving dependencies is O(number_of_tests × number_of_dependencies), since each pattern has to be matched against each test name. As a guideline, if you have up to 1000 tests, the overhead will be negligible, but if you have thousands of tests or more, then you probably shouldn't have more than a few dependencies.
Additionally, it is recommended that the dependencies follow the natural order of tests, i.e. that the later tests in the test tree depend on the earlier ones and not vice versa. If the execution order mandated by the dependencies is sufficiently different from the natural order of tests in the test tree, searching for the next test to execute may also have an overhead quadratic in the number of tests.
This PR together with a followup PR should achieve that. I know it's quite a bit to review, so I've tried to add enough tests to hopefully convince you that it's at least somewhat sound. If you'd like a video call to talk this through, I'm open for that too! If there's anything else I can do, ask away.
@martijnbastiaan sorry, I don't have bandwidth for this right now. Could you possibly summarise all breaking changes here (both API and behaviour)?
CC @VictorCMiraldo @adamgundry @andreasabel as other maintainers.
No worries @Bodigrim, thanks for the reply.
The API changes are as follows:
afterTree
, sequentialTestGroup
.TreeFold.foldAfterTree
. This allows folding AfterTree
(a constructor not part of the public API). Given the existence of trivalFold
, I don't expect this to affect existing programs.TreeFold.foldSingle
takes an extra argument ExactPath
. I would actually like every function in TreeFold
to take an ExactPath
(its position in the TestTree
), but I decided against is to minimize API changes. Given its usefulness I'd like to ask the maintainers here to make the same consideration - and let me know if they'd opt for adding the ExactPath
everywhere.There are no behavioral changes to existing programs.
Thanks for the PR @martijnbastiaan! That is a large one! :)
Unfortunately, I don't have the bandwidth nor the familiarity with the mechanisms surrounding tasty
's test dependencies. Hence, I'm not sure I can give a good review on this anytime soon.
Understood!
Hmm. So to give some background: I wrote this patch to resolve some problems we face with our 4K large testsuite over at clash-lang/clash-compiler, a project me and my colleagues at QBayLogic maintain.
What I could do is ask a colleague for a review on this patch (and the follow-up one). Would that be enough -together with the listed API changes I posted earlier- to accept this patch? I'm of course willing to maintain the feature too, and deal with the fallout if any.
TreeFold.foldSingle takes an extra argument ExactPath.
Can this breaking change be avoided by introducing a new fold and expressing existing foldSingle via it?
Yes, we could add foldSingleExact
that will (by default) simply call foldSingle
.
@martijnbastiaan could you please share what kind of dependency graph is typical for Clash test suite? Is it chains of depending tests?
Sure. To give a bit of context: Clash is a Haskell to silicon compiler - meaning Haskell can be used to build chips that could be synthesized by say a TSMC (or more realistically; it's used to program a reconfigurable chip). Silicon is typically written in a Hardware Description Language (HDL) which is what Clash targets as a compiler output. There are currently three major HDLs: Verilog, SystemVerilog, and VHDL. Each of these languages have various simulators that can be used to predict chip behavior before actually burning it into silicon. Clash itself can also be simulate (simply by leveraging GHC!).
So, one test looks something like:
graph TD;
FooTest-->VHDL;
FooTest-->Verilog;
FooTest-->SystemVerilog;
VHDL-->Clash_VHDL;
VHDL-->Clash_Sim_VHDL;
Verilog-->Clash_Verilog;
Verilog-->Clash_Sim_Verilog;
SystemVerilog-->Clash_SV;
SystemVerilog-->Clash_Sim_SV;
Clash_SV-->ModelSim;
Clash_SV-->Verilator;
Clash_VHDL-->GHDL;
Clash_VHDL-->Vivado_VDHL;
Clash_Verilog-->Vivado_Verilog;
Clash_Verilog-->IcarusVerilog;
I'm glossing over some things, but each new (regression) test looks a bit like this. This makes the number of dependencies grow linearly with the number of tests, hence us running into this caveat mentioned in the README:
Using dependencies may introduce quadratic complexity. Specifically, resolving dependencies is O(number_of_tests × number_of_dependencies), since each pattern has to be matched against each test name. As a guideline, if you have up to 1000 tests, the overhead will be negligible, but if you have thousands of tests or more, then you probably shouldn't have more than a few dependencies.
With regards to:
Changing TreeFold is likely to break all third-party tasty ingredients.
I'm willing to help out patching them, of course.
I agree that the infrastructure for dependencies between tests is wanting, but I feel that your approach covers only a fraction of cases, but required changes are quite profound. My use case is dependencies between disjoint, potentially remote trees, see https://hackage.haskell.org/package/tasty-bench-0.3.2/candidate/docs/Test-Tasty-Bench.html#g:4
If the shape of your dependency graphs is relatively stable, might your use case be better served with a custom IsTest
provider?
The only real limitation is that you can't make a DAG, you can only make a tree. I've checked out the example of bcompare and the examples it links (chimera
, fast-digits
, unicode-data
). Correct me if I'm wrong, but it seems like none of those examples need the TestTree
to be a DAG. I think all of these examples just want to run a single test as a baseline (left part of afterTree
) and compare it against a bunch of other tests (right part of afterTree
). I'll respond to the other points too, but I'd like to understand the need for a DAG better before I do.
Correct, none of my examples needs a DAG. However, the shape of TestTree
is not necessarily in correspondence with a dependency tree.
Sure, that's true. I'll be honest, it's kinda hard to respond to this without sounding very negative/argumentative. It seems to me that you're paying a pretty high price for the ability to do so. The examples need to construct patterns such as:
"$NF == \"" ++ show n ++ "\" && $(NF-1) == \"Chimera\""
..which would break if there's ever another test group called Chimera
anywhere in the tree, quite seriously contradicting the composability of TestTree
s. It has all the downsides mentioned in the README. Last but not least, for a casual reader it's pretty hard to even read the code - given that you need to do the same process Tasty does when finding a dependency: scan the whole tree.
So all in all, it seems somewhat "obvious" to me that you'd want something like afterTree
. And as far as the complexity goes.. well that's kinda imposed to us by the existence of after
itself. If we only had afterTree
, there would have been no need for Trie
or DependencyTree
(from the follow-up PR). All would naturally follow from the trees construction and traversal - the code would be even simple than what's now in main
.
So, in a way I feel that this remark:
but I feel that your approach covers only a fraction of cases
turns everything on its head. It seems to me that after
should try and justify its costs. Is it really worth having this pattern matching when it messes with composability, computational complexity, correct-by-constructionness, and code complexity?
I guess what I'm asking is: is there any chance of this getting merged? If not, do you see an alternative way of tackling the issues in the README, or will Tasty be forever stuck with them?
@martijnbastiaan to be very clear, I agree that the dependency design is inconvenient and I also agree that if we were doing it from the scratch and with our current experience, then AfterTree
would be a saner option. My problem is that we already have a mechanism (constructor After
) for this pretty niche feature, and adding another one (constructor AfterTree
) for even more niche use cases does not strike me as the most desirable outcome. I'm happy to be overruled by other maintainers, my cautiousness is because I'm a caretaker only.
So, answering your question directly, yes, there is a chance, but we must consider the design and long-term consequences. This can take more time than one might like, sorry for that. I do appreciate your work, your efforts and this particular contribution.
What is the prior art for dependencies between tests in other frameworks?
Could you define something like
data ClashTest = ClashTest
{ ghdlTest :: Assertion
, vivadoVdhlTest :: Assertion
, vivadoVerilogTest :: Maybe Assertion
, icarusVerilogTest :: [Assertion]
, ...
}
and provide instance IsTest ClashTest
to manage dependency chain? More generally, you can define newtype AssertionChain = AssertionChain [Assertion]
with instance IsTest AssertionChain
executing tests in order.
Understood. If necessary, I'd be willing to maintain it. My ideal would be to deprecate After
in the very long term, but I understand that takes a lot of time, effort, and commitment - hence not always worth it.
I'll have a look at your proposal and dig into the prior art, hopefully this weekend.
What is the prior art for dependencies between tests in other frameworks?
I've looked into this:
To me it looks like sydtest is very close to doing TheRightThing(tm).
Could you define something like
This can work, but it has quite a few drawbacks by virtue of the tests being opaque to Tasty:
So all in all this would be a pretty big step back IMO.
I could do two things to make this series of patches more acceptable:
AfterTree
almost trivial.-- | An algebra for folding a `TestTree`.
--
-- Instead of constructing fresh records, build upon `trivialFold`
-- instead. This way your code won't break when new nodes/fields are
-- indroduced.
data TreeFold b = TreeFold
{ foldSingle :: forall t . IsTest t => OptionSet -> TestName -> t -> b
, foldGroup :: OptionSet -> TestName -> b -> b
, foldResource :: forall a . OptionSet -> ResourceSpec a -> (IO a -> b) -> b
, foldAfter :: OptionSet -> DependencyType -> Expr -> b -> b
-- "Exact" versions of functions defined above. In 'trivialFold', these simply
-- call the non-exact versions. These functions are here for backwards compatiblity
-- reasons.
--
-- Note: instead of taking a `TreeFold`, these functions could take their non-exact
-- function directly, but this makes the type signatures quite long.
, exactFoldSingle :: forall t . IsTest t => TreeFold b -> ExactPath -> OptionSet -> TestName -> t -> b
, exactFoldGroup :: TreeFold b -> ExactPath -> OptionSet -> TestName -> b -> b
, exactFoldResource :: forall a . TreeFold b -> ExactPath -> OptionSet -> ResourceSpec a -> (IO a -> b) -> b
, exactFoldAfter :: TreeFold b -> ExactPath -> OptionSet -> DependencyType -> Expr -> b -> b
}
I personally don't think it's worth it, but I'm happy to do it if that makes things more acceptable/reviewable.
I do appreciate your work, your efforts and this particular contribution.
Thanks btw, happy to work with you guys.
To me it looks like sydtest is very close to doing TheRightThing(tm).
Would sequential
combinator on its own be sufficient to match your requirements?
For my specific use case it would be both a step forward and backward at the same time.
The positives:
The neutrals:
The negatives:
I'm somewhat biased as I already put in the work of course, but it strikes me as a band-aid instead of a proper solution. Do you think it would be easier to implement?
Well, adding TestTreeSequential
constructor or maybe extending existing TextTree
with a switch Parallel | Sequential
seems to me a better option. You do not lose ability to run certain parts in parallel, because TestTreeSequential
can affect only one level of tree.
Sure, I can amend the patch to do this.
@martijnbastiaan before you do more work, I'd like to hear from other maintainers.
@VictorCMiraldo @andreasabel Existing TestGroup
constructor groups subtrees for (potentially) parallel execution. The proposal is either to add TestGroupSequential
constructor, or to extend TestGroup
with a field data Parallelism = Parallel | Sequential
, which mandates parallel execution of subtrees. How do you feel about this? Currently a similar (but not quite the same) effect can be achieved with a sequence of After
constructors, which is very tedious to write and unbearably slow to execute.
Edited: TestTree
TestGroup
.
@Bodigrim and @martijnbastiaan, first of all, thanks for the great work both making the PR and the thoughtful comments! :)
Currently a similar (but not quite the same) effect can be achieved with a sequence of
After
constructors, which is very tedious to write and unbearably slow to execute.
Yeah, that doesn't sound like a very good solution and I can see how running tests sequentially is an interesting feature.
I do tend to agree with @Bodigrim, however. Adding a AfterTree
constructor feels weird given that we already have an After
. Considering the semantic interplay between them seems like a potential nightmare for me. Can we have weird deadlocks with things like (in pseudo-tasty) t = AfterTree (After "/a/" $ singleTest "b") (singleTest "a")
? IIUC, plain tasty
already has this type of problem with regular TestTree
s.
It does feel more natural to to add a TestTreeSequential
constructor instead of using the dependency mechanism to enforce sequential execution. Also, it doesn't introduce new quirks w.r.t. dependencies such as t
above:
a TestTree u
has a deadlock iff TestTreeSequential u
has a deadlock.
Would a TestTreeSequential
need this change? If not, then it would be fully backwards compatible, which is a huge plus IMO
Potentially we can leave TreeFold
as is, with foldGroup
processing both TestGroup
and TestGroupSequential
. Adding a new constructor to TestTree
is breaking any way, but potentially with less impact on clients.
Fundamentally, TestTreeSequential
will differ very little from AfterTree
. Instead of having two TestTree
fields, it will simply take a list of them. I do think it's the better option: I only ended up using sequentialTestGroup
(a function this PR introduces) in practice. However, it will still need the changes to TreeFold
, unless I find a better way to implement it.
The core of the problem is that this variable:
needs to take TestTreeSequential
(or AfterTree
) into account. While patterns need to scan the whole tree there, AfterTree
/TestTreeSequential
can define much more efficient lookup strategies.
Hmm.. An alternative implementation could implement a custom folder that propagates dependencies up while it is folding. I.e., it would not use foldTestTree
to run the tests. This would avoid the complexities of introducing the Trie
and would also make for a natural solution to the very first caveat mentioned in the README. I'm not sure how this would fit into the rest of the ecosystem though.
I would do this (please find a better name for TestGroup'
):
data ExecutionOrder = Concurrent | Sequential
data TestTree
= forall t . IsTest t => SingleTest TestName t
| TestGroup' ExecutionOrder TestName [TestTree]
| PlusTestOptions (OptionSet -> OptionSet) TestTree
| forall a . WithResource (ResourceSpec a) (IO a -> TestTree)
| AskOptions (OptionSet -> TestTree)
| After DependencyType Expr TestTree
pattern TestGroup = TestGroup' Concurrent
This is a breaking change, but a very minor one. And you can leave TreeFold
as is, ignoring ExecutionOrder
, so there is no breakage here.
Indeed, createTestActions
can fold TestTree
by its own means, not necessarily via foldTestTree
.
FWIW, I like to compromise suggested by @Bodigrim above. Seems like a good option to minimize breakage while enabling @martijnbastiaan to have the necessary feature that motivated the PR in the first place.
Great, thank you both. I've got a patch now that implements the suggestions (adding TestGroup field + custom traversal). While it doesn't try to fix any of the issues raised in the README for pattern dependencies, it does solve them for the new way of specifying deps in a fairly straightforward way.
I'm pretty happy with the patch, but I'm still working on writing a proper testsuite. I'm spending only a few hours on it every weekend so it'll probably take a while..
I've just pushed an alternative implementation based on the comments above. This one doesn't try to fix any issues with dependencies specified using patterns, but does solve them all for dependencies specified using sequentialTestGroup
. TestGroup
is now the primitive setting parallelization/sequentiality. Two backwards incompatible changes have been applied:
1. TreeFold
no longer has a field foldResource
. Its type signature had to change in order for the fold to account for dependencies while filtering. However, it doesn't seem likely downstream will be affected at all:
foldResource
on all of GitHub only yields tasty
using it.tasty
in their name on Hackage does not yield any relevant usages of TreeFold
either.If this change proves controversial, I suggest we move the second commit in this PR to a separate PR and discuss there.
2. TestGroup
has an additional field ExecutionMode
. @Bodigrim, you suggested making a pattern synonym like this:
pattern TestGroup = TestGroup' Concurrent
But, libraries using this pattern will silently fail:
case tree of
TestGroup _ -> ...
_ -> ...
Libraries using the following pattern will fail with just a GHC warning:
case tree of
TestGroup _ -> ..
[.. exhaustive list of constructors ..]
To get a feel for the breakage, I've cloned all the libraries published on Hackage related to tasty
and searched for TestGroup
. Of all these libraries, 7 use TestGroup
directly: tasty-bench
, tasty-expected-failure
, tasty-silver
, tasty-inspection-testing
, tasty-focus
, tasty-focus
, and tasty-bdd
. Of these, 6 would fail silently as their usage falls in one of the two patterns mentioned. Only tasty-inspection-testing
would continue to be correct.
I'm proposing to let static typing do its thing to prevent silent breakage.
@Bodigrim I've applied your suggestions in separate commits. I'll squash them into the first two if you're happy with them.
@Bodigrim Again thanks for all the comments, and sorry for the delay on this one. There are two unresolved comments now. I've been unable to come up with a strategy to join foldTestTree
and the fold in createTestActions
- I couldn't find an implementation/type for foldResource
that worked out. As discussed before, I don't think anyone but Tasty itself uses this field, so I don't think this is a major issue.
@VictorCMiraldo @Bodigrim Anything I could do to get this PR going again? :)
@martijnbastiaan I'm sorry, I've lost all the context since August 2022 and I don't have time to restore it at the moment. I'll try to review in upcoming weeks.
Yeah no worries, my fault for being a slowpoke. I'll brush up the cover letter with an overview of the history of this PR today.
@martijnbastiaan could you please rebase?
I managed to make it work without the custom traversal in Run.sh
and with only one very minor API change. I need some more time to brush it up for review. I hope to publish that this weekend.
Thanks @Bodigrim. Regarding the commit:
I found the stack of Traversal
/Reader
/Writer
/IO
+ monoidal combination to be very cumbersome to work with. Given that it also had some complexity issue I decided to refactor it into the current structure (see the commit message for more information). Overall I'd say the current approach is an improvement, but if there's disagreement I'll try my best to make it work in the original structure.
If it helps; we could also split this PR up into multiple (cleanup + foldGroup
API change + refactor createTestActions
+ this).
Let's pull out refactoring to its own PR please.
Okay, done: https://github.com/UnkindPartition/tasty/pull/364.
@Bodigrim All done :)
@martijnbastiaan I'm truly sorry that it takes so long, I'm very much overloaded recently.
No worries, I appreciate the feedback. If it would be appreciated: I could help out reviewing PRs for this project and doing some general maintenance.
Thanks for persistence, @martijnbastiaan :)
tasty
is a fairly stable package, so before we cut a new major release, I'd like to resolve as many other issues as possible. #341 and #311 are probably closest to completion, and #231 and #342 seem actionable too.
If it would be appreciated: I could help out reviewing PRs for this project and doing some general maintenance.
@martijnbastiaan much appreciated! Sorry, I don't have admin rights to grant you formal privileges (and I'm reluctant to distract @UnkindPartition by asking), but you are most welcome.
I read about his situation, best not to distract him yeah. I'll see what I can do for the listed issues/PRs.
Tasty 1.2 introduced a way for tests to specify dependencies. That is, what tests should run before themselves. These dependencies are specified using an AWK-like expression annotated to a
TestTree
. These expressions can match against any test in the fullTestTree
. This approach has a few rough edges:Any pattern has to be tested against any test in the tree. If your test tree mostly consists of tests specifying dependencies for each other, this means calculating your test tree is of quadratic complexity.
It's easy to introduce dependency cycles, or other mistakes introducing needless sequentiality. The latter being especially insidious as it easily goes unnoticed.
This commit introduces the ability to specify dependencies by using a combinator taking a list of
TestTree
s. This way of specifying dependencies removes the quadratic complexity in favor of a linear one, while eliminating the ability to accidentally introduce cycles or unintended sequentiality.A bit of history for this PR as it has undergone several rewrites thanks to a few great reviews I've gotten:
The PR originated from friction experienced over at
clash-lang/clash-compiler
where it was hard to define tests sequentially in a programmatic manner and startup of the testsuite now takes between 30-60 seconds. See this comment.The initial version added an
AfterTree TestTree TestTree
constructor, and would build a unique name for each test in the tree. Using aTrie
it would then be able to relatively cheaply find dependencies of test cases. As a bonus, it would solve 3 out of 4 issues documented in the README for dependencies defined using patterns too. However, the implementation complexity of this was deemed to high to be merged.The current implementation does not try to fix any problems with the existing way of defining dependencies, but instead focuses on just doing that for the new way. The new way ~~being an extra field added to
TestGroup
, which indicates whether tests should run in parallel or sequentially. To implement this properly, two backwards incompatible changes had to be made as documented over here.~~ marks test trees as sequential usingIsOption ExecutionMode
. In order to implement sequential executionfoldGroup
had its type signature changed fromfoldGroup :: OptionSet -> TestName -> b -> b
tofoldGroup :: OptionSet -> TestName -> [b] -> b
, implemented in https://github.com/UnkindPartition/tasty/pull/364.