Open Mikolaj opened 2 years ago
It's not the assumption I was using :) Remember this comment, in which I said that I think the solver is allowed to decide to pick a plan which does not include tests even if one exists. For example, if suppose the solver is both allowed to decide whether to enable tests and whether to enable the benchmarks. The solver might try four combinations:
If the solver tries them in that order and picks the first one which succeeds, you'll observe the current bug if 1 fails and 2 and 3 succeed, because 2 is picked even though 3 succeeds. And we can't fix this by checking 3 before 2, because then we'd just have the same bug but for benchmarks!
The reality is much worse, because there are many more dimensions of choice, such as the version of every recursive dependency. So many that it's not feasible to check every allowed plan in which the tests are enabled. Instead, a heuristic must be used, to examine more promising branches of the possibility tree before others. And if that heuristic makes it look like disabling the tests is a promising avenue, the solver can easily find a plan in the disabled-tests branch without having completely explored the enabled-tests branch, and thus choose to disable the tests even if a successful plan exists somewhere in that unexplored enable-tests branch.
@gelisam: you are right and I even responded to that comment, but then promptly repressed that, because it's too painful to believe. In fact, the cabal's blurb in my bug's description is correct and takes into account the possibility of solver not cooperating and that's exactly the blurb you changed in #7834. So, #7834 and related are not threatened.
So the problem in my case is that I have 1 library, 1 exe, 1 testsuite and the solver is not certain that the exe and the test are not excluding each other? And so it picks a plan where exe can be built and ignores the testsuite?
Could we tweak cabal to prioritize tests (respectively, benchmarks) over exes and libraries and everything, when cabal test
is invoked (respectively, cabal benchmark
)?
@grayjay: do you think it's easy to ask the solver to do that and cabal just fails to ask, or is solver not able to prioritize things, apart of the brute force mode where it tries exhaustively and fails, if it can't build something (e.g., --enable-tests)?
Could we tweak cabal to prioritize tests
There is a very clear description of the algorithm in this Well-Typed post on Qualified Goals in the Cabal Solver. There is a very clear phase where such preferences are applied: the "Heuristics and Preferences " section in which the tree branches are reordered. I think it would be instructive to create a variant of your repro case in which the search tree is small enough that we can make diagrams like the ones in the post, so we can see at which node the choice of enabling tests or not is made, and discuss at which node we would like to make that choice instead.
I think this issue is the same as #5079. The problem is that, by default, cabal runs the solver with only a preference to enable tests and benchmarks. The preference is only applied locally by sorting nodes in the search tree, which means that the solver may disable tests in order to satisfy a different preference, such as the preference to avoid deprecated versions. We could apply preferences globally by implementing a feature like #2860 and give the testing preference higher priority, but it would only reduce the frequency of the current issue, not prevent it. There would still be cases where the solver silently disables tests without --enable-tests, such as the conflict between tests and benchmarks in https://github.com/haskell/cabal/issues/7883#issuecomment-1002072947 or a conflict between two test suites.
Could we tweak cabal to prioritize tests (respectively, benchmarks) over exes and libraries and everything, when cabal test is invoked (respectively, cabal benchmark)?
I think we should continue using the same default solver options for both "build" and "test" in order to avoid unstable install plans. I think that it would be very confusing for users to get completely different versions of dependencies depending on the cabal command that was run.
Doh, thank you very much for the explanations. I apologise, I couldn't get around to generating that small test case and tried to do that before responding. But I feel this is an important usability point, so I don't want to just close as a duplicate.
I think we should continue using the same default solver options for both "build" and "test" in order to avoid unstable install plans. I think that it would be very confusing for users to get completely different versions of dependencies depending on the cabal command that was run.
@grayjay: do you mean that if "test" can't be solved using a superset of dependencies generated for "build", it always fails (without --enable-tests
)? That guarantee would indeed be worth something, but only if the user can depend on it. Or am I over-interpreting?
do you mean that if "test" can't be solved using a superset of dependencies generated for "build"
The question implies that cabal build
runs the solver to pick versions of the dependencies of lib and exe targets, then builds them, while cabal test
runs the solver to pick versions of the test targets, then builds and runs them. This is not how those commands work. If they did (I think they once did?), then running cabal build
followed by cabal test
would likely pick different plans (what@grayjay refers to as an "unstable install plan") and thus end up building dependencies twice. Or if like you suggest, subsequent invocations of the solver only pick plans which are compatible with previous invocations, then which plan you get would depend on whether you start by running cabal build
or cabal test
, which is a project-specific version of the so-called "cabal hell" situation we had before the v2 commands.
Since both of those behaviors are undesirable, what is actually happening is that all the commands (except cabal-install) use the same plan. So it doesn't matter if you call cabal build
or cabal test
first, the solver will always look for a plan which involves all the libraries, tests, exes, and benchmarks.
I was careful to say "involves", not "includes", because a plan does not just pick version numbers for the dependencies, it also makes decisions about flags, whether to enable tests, etc. That's how we nevertheless manage to get plans which don't include the tests.
OK, you convinced me we are doing the right thing. I take back my proposal to try harder to include tests when the solver notices it's run from cabal test
. That's a bad idea.
I guess, that leaves me with a request to try harder (ideally guarantee) that the plan we find is a local maximum in terms of how many components are included. E.g., in my (non-minimized) real life case above, it was possible to add a testsuite (without resorting to changing flags or picking deprecated deps), but the chosen plan included none. Local maximum in the sense, that if it's possible to add one more component (without removing any that are already included) by changing only dependencies (and not to deprecated versions) then it's added. Perhaps what that entails is that the preference to include even a single new component should be stronger than to keep optimal dependencies (newest possible, or already found), even if that means downgrading absolutely all dependencies to stone age versions (barring deprecated versions).
Perhaps that's impossible, though. I'm reading the blog post and it underlines that decisions are local, while I require global guarantees, which may imply an exhaustive search/backtracking. I think the solver never backtracks over preferences, e.g., a preference to not exclude a component? It only backtracks over constraint-like things, e.g., the tests: True
hard requirements.
Or how bad would it be to treat component inclusion as hard constraints? (The difference vs tests: True
would be that the latter causes tests to be actually built.) Or perhaps we can let the user decide, via tests: Plan
, that the tests have to be included in the plan, at the cost of as many backtracks, as needed? @gelisam: did our brainstormimg about a new value for the tests:
setting touch this (it's been a long time)? That would an acceptable setting for a user that runs cabal build
1000 times a day and cabal test
10 times a day. Without it, a user for whom the default plan excludes tests, is forced to either switch plans and rebuild every time the user runs tests or force tests: True
and slow down every error-checking build. In large projects, that's a significant cost.
Please tell me how I'm mixing up things again. ;)
@gelisam: did our brainstormimg about a new value for the tests: setting touch this
The new value does not change the current behavior in any way, it just better documents the fact that they are three options rather than just True
and False
.
Or how bad would it be to treat component inclusion as hard constraints?
If we limit our attention to plans which include all the components, then Haskellers will be happier because they are the ones who tend to be surprised by the current behaviour, but non-Haskellers, who need the most help, will be greated with an inscrutable "cannot find build plan" error more often, because the solver will not be allowed to work harder by looking for a build plan which can build the executable they care about but not the tests they don't know how to run anyway.
Instead of adding even more values to tests
, how about splitting the field into several smaller ones? One field to determine if the tests should be included in the plan, and one to specify which components are included in the default all:all
target. Speaking of which, you can run cabal build all:libraries
if you don't want to build the tests.
If we limit our attention to plans which include all the components, then Haskellers will be happier because they are the ones who tend to be surprised by the current behaviour, but non-Haskellers, who need the most help, will be greated with an inscrutable "cannot find build plan" error more often, because the solver will not be allowed to work harder by looking for a build plan which can build the executable they care about but not the tests they don't know how to run anyway.
But aren't newcomers mostly running cabal install package-name
or sometimes cabal install .
? Do I remember right cabal install
doesn't try to find a build plan with tests and benchmarks?
Instead of adding even more values to
tests
, how about splitting the field into several smaller ones? One field to determine if the tests should be included in the plan, and one to specify which components are included in the defaultall:all
target.
Agreed.
Speaking of which, you can run
cabal build all:libraries
if you don't want to build the tests.
Good to know, thank you. But in a week all I remember will be 'cabal build' again. ;) OTOH, if I stick something that corresponds to that in my cabal.project.local
, it will still work and for new projects I will remember to look it up in my old projects (I always keep a lot of cabal.project.local
variants, with different suffixes). So IMHO it's worth it to try to match commandline and config in this case, even partially.
Oh right, good point about cabal install
using a different plan! In that case, I now agree that tests: True
should be the default, and perhaps even the only option.
Instead of adding even more values to tests, how about splitting the field into several smaller ones? One field to determine if the tests should be included in the plan, and one to specify which components are included in the default all:all target.
I like this idea. If I understand correctly, the first field is exactly the same as the existing boolean tests
field, and the second field only chooses what is built, so it doesn't need to be an input to the solver.
Oh right, good point about cabal install using a different plan! In that case, I now agree that tests: True should be the default, and perhaps even the only option.
I also like this idea. In my opinion, we should remove the third value, where tests are only enabled with a preference. I think that the testing preference adds complexity without much benefit, because it doesn't make any useful guarantees (and can't make useful guarantees without behaving like a constraint). Setting the default to true will give cabal test
the expected behavior. We can also make cabal build
give an error that suggests --disable-tests
when dependency solving fails. I think that this all applies to enabling benchmarks, too.
It sounds like we're iterating to the idea that the default is True
and there's no third choice, for tests and benchmarks both. And then we need to add a warning furthermore to suggest disabling the flag if no plan can be found, and finally need to have the --enable-tests and --enable-benchmarks flag not force building tests or benchmarks, but only to include them in the plan, and have them only built when explicitly requested as components or when running cabal test
or cabal benchmark
.
@gbaz; I'm happy we all agree about the outline of the solution, even though it's not backward compatible, because we remove the current default (the non-binding promise to try and include stuff in the plan), so it can't be the defaults any more.
Regarding the details, @gelisam and me thought about an extra field in the project file that would mean "build it" so that the tests: True
field has the same meaning as the new meaning (only plan) of --enable-tests
and tests: False
as --disable-tests
, while the new field would force building the component (I fully expect I'm misrepresenting @gelisam's position again :). Is that a good idea? In particular, is making tests: Tests
and --enable-tests
the same a good idea and, independently, do we need another field in the project file for the building bit, e.g., to let users migrate easily from tests: True
to the new setup?
is making
tests: Tests
and--enable-tests
the same a good idea
Aren't they already the same?
I don't have an opinion as to whether it is better to add a flag changing the meaning of all:all
or to ask users to type cabal build all:all all:tests
.
Aren't they already the same?
They are, but @gbaz proposes to make --enable-tests
equivalent to the new default, so its semantics would change. Or should we keep --enable-tests
and --disable-tests
unchanged, that is, forcing both the planning and the building? Keeping things as they were is good, but do we end up with less power?
Let me think. Can we express
--disable-tests
.--enable-tests
.Seems to check out. The only thing missing is overriding "tests: True" ("plan, build") on command line to "plan, don't build". Is that important?
I like the simplicity of having --enable-tests
/--disable-tests
mean the same thing as tests: True
/tests:False
. I support @gbaz 's summary of what now needs to be implemented, and since I am now quite familiar with the tests
and --enable-tests
code, I volunteer to implement it (eventually).
With that approach, we can still express all the combinations:
cabal build
or cabal build --enable-tests
cabal build --disable-tests
cabal build all:tests
or cabal test
(the latter also runs the tests)The only thing missing is overriding "tests: True" ("plan, build")
Wait, what? I thought we all agreed that tests: True
should now mean "plan, don't build", not "plan, build". Could you restate what the various combinations of tests: True
/tests: False
, --enable-tests
/--disable-tests
, and cabal build
/cabal test
do in your version of the proposal in which --enable-tests
/--disable-tests
keep their current semantics but tests: True
/tests: False
don't?
I realize that I'm asking you for quite a number of combinations, so I should be willing to do that amount of work myself. Here is the behaviour I expect for the proposal I support:
All of the following combinations have the effect "plan libs, exes, tests, and benches; build libs and exes":
tests
entry, cabal build
tests
entry, cabal build --enable-tests
tests: True
, cabal build
tests: True
, cabal build --enable-tests
tests: False
, cabal build --enable-tests
All of the following combinations have the effect "plan libs, exes, tests, and benches; build and run tests":
tests
entry, cabal test
tests
entry, cabal test --enable-tests
tests: True
, cabal test
tests: True
, cabal test --enable-tests
tests: False
, cabal test --enable-tests
All of the following combinations have the effect "plan libs, exes, and benches; build libs and exes":
tests
entry, cabal build --disable-tests
tests: True
, cabal build --disable-tests
tests: False
, cabal build
tests: False
, cabal build --disable-tests
All of the following combinations have the effect of failing with an error message explaining that tests are disabled:
tests
entry, cabal test --disable-tests
tests: True
, cabal test --disable-tests
tests: False
, cabal test
tests: False
, cabal test --disable-tests
I thought we all agreed that
tests: True
should now mean "plan, don't build", not "plan, build".
I don't oppose that, but in that case the user can't request via a project file to have tests planned, built (type-checked) and not run via any subsequent cabal build
invocations. And that's useful and that was expressible previously, so there is no migration path if the user had "tests: True" to that effect in a project file (in CI, say). Is my motivation clear?
Is my motivation clear?
Yes, thanks for explaining. Given that motivation, I think it's important to list the current behaviour under all the combinations, to make sure all those behaviours can easily be obtained under the new proposal.
All of the following combinations have the effect "plan libs and exes, and try to include tests if possible; build libs and exes":
tests
entry, cabal build
All of the following combinations have the effect "plan libs and exes; build libs and exes":
tests
entry, cabal build --disable-tests
tests: True
, cabal build --disable-tests
tests: False
, cabal build
tests: False
, cabal build --disable-tests
All of the following combinations have the effect "plan libs, exes, and tests; build libs, exes, and tests":
tests
entry, cabal build --enable-tests
tests: True
, cabal build
tests: True
, cabal build --enable-tests
tests: False
, cabal build --enable-tests
All of the following combinations have the effect "plan libs and exes, and try to include tests if possible; build and run tests, failing if the tests were not included in the plan":
tests
entry, cabal test
All of the following combinations have the effect "plan libs, exes, and tests; build and run tests":
tests
entry, cabal test --enable-tests
tests: True
, cabal test
tests: True
, cabal test --enable-tests
tests: False
, cabal test --enable-tests
All of the following combinations have the effect "plan libs and exes; fail because tests are disabled":
tests
entry, cabal test --disable-tests
tests: True
, cabal test --disable-tests
tests: False
, cabal test
tests: False
, cabal test --disable-tests
Here's the subset of those behaviours we can accomplish with the proposal I support. All the commands assume that tests
is either unspecified or set to its new default, True
.
cabal build --disable-tests
cabal build all all:tests
cabal test
cabal test --disable-tests
I think your only complaint about the above is that cabal build all all:tests
is hard to remember, and that you'd rather have a ~/.cabal/config
setting you can set and forget?
Actually, I didn't even remember one can set it in ~/.cabal/config
, but that's a good point. I meant cabal.project
and especiallycabal.project.local
(that people set via cabal configure
).
You are right, this is my only complaint (if you generalize to project files and ignoring benchmarks, which behave analogously).
To reword: the biggest complaint is that if somebody had tests: True
in cabal.project
and various reasonable commands in CI, that person can't migrate (that is, obtain the same behaviour as before) by only changing something in cabal.project
. Adding arguments to (possibly many) invocations of cabal in the CI is required. Compare with the alternative, where that person doesn't need to change anything anywhere for that use case. But regardless of the alternative, the new setup has a relatively less expressive cabal.project
--- you can't permanently force building tests in cabal.project
any more (and so you can't force it via cabal configure
either, etc.)
The minor and last complaint is that a common idiom "typecheck everything" that previously required only (tests: True
and cabal build
) or only (cabal build --enable-tests
) now requires the scary cabal build all all:tests
.
I wanted to mention that there is one advantage to continuing to use three values for the "tests" configuration option over splitting it into two independent options (other than backwards compatibility). The three value design avoids the possibility of setting the "plan tests" option to false and the "build tests" option to true, which is a combination that doesn't make sense. In the three value design I am imagining, tests: True/--enable-tests
and tests: False/--disable-tests
would remain unchanged and the third (default) value would change to mean "plan tests, don't build tests". We would also name the third value as in #7829.
Ooh, making invalid states unrepresentable, I like that!
Let me again exhaustively list the behaviours for this proposal.
All of the following combinations have the effect "plan libs and exes; build libs and exes":
tests
entry, cabal build --disable-tests
tests: False
, cabal build
tests: False
, cabal build --disable-tests
tests: True
, cabal build --disable-tests
tests: OnlyPlan
, cabal build --disable-tests
tests: OnlyPlan
, cabal test --disable-tests
All of the following combinations have the effect "plan libs, exes, and tests; build libs and exes":
tests
entry, cabal build
tests
entry, cabal build --only-plan-tests
tests: False
, cabal build --only-plan-tests
tests: True
, cabal build --only-plan-tests
tests: OnlyPlan
, cabal build
tests: OnlyPlan
, cabal build --only-plan-tests
All of the following combinations have the effect "plan libs, exes, and tests; build libs, exes, and tests":
tests
entry, cabal build --enable-tests
tests: False
, cabal build --enable-tests
tests: True
, cabal build
tests: True
, cabal build --enable-tests
tests: OnlyPlan
, cabal build --enable-tests
All of the following combinations have the effect "plan libs, exes, and tests; build and run tests":
tests
entry, cabal test
tests
entry, cabal test --enable-tests
tests
entry, cabal test --only-plan-tests
tests: False
, cabal test --enable-tests
tests: False
, cabal test --only-plan-tests
tests: True
, cabal test
tests: True
, cabal test --enable-tests
tests: True
, cabal test --only-plan-tests
tests: OnlyPlan
, cabal test
tests: OnlyPlan
, cabal test --enable-tests
tests: OnlyPlan
, cabal test --only-plan-tests
All of the following combinations have the effect "plan libs and exes; fail because tests are disabled":
tests
entry, cabal test --disable-tests
tests: False
, cabal test
tests: False
, cabal test --disable-tests
tests: True
, cabal test --disable-tests
This is brilliant. Let's do it.
See #5079 for previous discussion of this issue.
@ulysses4ever's exec summary:
this issue (and its variations) has been discussed on multiple occasions (#5079, #3923)
a desirable converging point in the design space seems to be, as of now (as put by grayjay):
the main counterpoint to the idea above (as put by fgaz):
it feels like we get a better UX if we follow the idea still, and the answer to the counterpoint (as put by Mikolaj):
anecdotally, many power users already put
tests: True
in their global config and deal with the above mentioned problem on the case by case basis; we just enshrine this practice in the default;our best hope for implementing something like that is, currently, #7829 but it looks like it needs a new champion...
Describe the bug
I'm dog-fooding current dev cabal from master branch.
Solver is not cooperating. Cabal says "solver picked a plan that does not include the test suites" despite there being such a plan, which can be seen when forcing with 'tests: True'.
To Reproduce
Clone
https://github.com/Mikolaj/mostly-harmless/commit/3feb586519b6fbe87aa1867700206b1af0bd600f
Run the following (it seems, a fresh package store is not required after all)
It shows
but
cabal build -w ghc-9.2.1 --enable-tests
(as well ascabal test -w ghc-9.2.1
with 'tests: True' incabal.project.local
) goes through fine and tests run. However, after removingcabal.project.local
,cabal test -w ghc-9.2.1
fails again the same.The same attempts with ghc-8.10.7 work fine.
Expected behavior
The solver should pick the right plan without forcing. Forcing would ideally be used only to show the error that prevents the solver from finding a plan. That's the assumption under which ~@gelisam and~ me operated in #7834 and related tickets. Now it's all put into question.