grayjay commented 8 years ago

I don't think there is a clear design for component-based solving yet, so I wanted to create an issue where we can discuss it. #1575 addresses circular dependencies in test suites but is moving towards a different solution.

Related issues:

non-existent components

1575 - cabal can't handle cycles between packages that are not cycles between components

2725, #5413 - cabal requires dependencies of components that aren't being built

3662 - more per-component support

3263 - fine-grained dependencies

grayjay commented 8 years ago

Here are my thoughts, so far. At a minimum, the solver should keep track of dependencies between packages' components. (I think that it currently tracks dependencies from components to packages.) Then it can stop requiring dependencies of components that aren't needed, and ensure that required components are buildable. The solver should still ensure that all components within a package are consistent (have the same version and flag assignments), unless they have different qualifiers. Maybe we can add a flag or use private dependencies to relax that restriction later.

ezyang commented 8 years ago

@grayjay, thanks for making this ticket. Your plan seems eminently reasonable. Go for it!

grayjay commented 7 years ago

Here is a first pass at a design:

Summary

This design isn't 100% component-based: We keep package variables and add one boolean variable for each component. However, the nodes of the dependency graph are components.
We transform packages' dependency trees to handle all combinations of setup Cabal version, Buildable field value, component enabled/disabled, and --enable-tests/benchmarks. This design would also fix #3881.
We add several search tree transformations to enforce new component-related constraints.

Tree structure

Per-component solving requires some changes to the variables used by the solver, which means that we also need to change the goals, as well as the choice nodes created for those goals.
Variables
Types of variables (including those that already exist):
- Package:
- The instance chosen for a specific package, as before.
- Flag:
- The boolean value for a specific flag, as before.
- Flag-like (I'm not sure whether these should be implemented as flag variables, stanza variables, or something new):
- Each has two possible values.
- They are used in conditionals to control dependencies, just like flags, except that they aren't in the .cabal file. They need to be added when the solver converts the package indexes to the solver-specific index format.
- Three types of flag-like variables per package:
  - Tests/Benchmarks enabled:
  - Two variables per package, just like current test and benchmark stanza variables.
  - The default value is disabled.
  - Component on/off:
  - There is one variable per component in the package.
  - The two choices are ~~Required and Not Required~~ On and Off.
  - ~~Required~~ On:
    - It means that this component must be built. Either it is a top-level target, or it is a dependency of another component.
    - Buildable: False is treated as a failure.
  - ~~Not Required~~ Off:
    - The component does not need to be built. It may be built anyway, because of an old Cabal setup version or --enable-tests/benchmarks.
  - The default value is off.
  - Setup script features:
  - There is only one variable per package. The two choices are New and Old.
  - New:
    - It means that the package has a simple build type or uses Cabal >= 1.24 for the Custom build type.
    - The solver can choose to build a subset of components.
    - The solver can ignore dependencies of unbuildable components.
  - Old:
    - It means that the package can use Cabal < 1.24 for the Custom build type.
    - The solver must choose dependencies for all lib/exe components, though the components are allowed to have Buildable: False.
    - The solver must also choose dependencies for all test (or benchmark) components when tests (or benchmarks) are enabled, though they may also be unbuildable.
    - Dependencies are required for unbuildable components (#1725).
  - The default value is "new".
  - TODO: How do we know when a setup script supports per-component builds? Do we only need the setup Cabal version and build type?
Why do we still have package variables?
- Per-component solving needs to work with Custom setup scripts that don't support per-component builds.
- Constraints from one component should still apply to others. Otherwise, we might break packages that only constrain dependencies for one component (usually the library).
- --enable-tests/benchmarks needs to find all tests/benchmarks in the package.
- We only need to choose the package instance once for all components. If we ever need to build components with conflicting dependencies, we can use qualified goals.
Each time we choose a package instance while building the tree, we immediately add all flag-like goals for that instance to the set of remaining goals. TODO: How do we handle component goals for installed packages? Do we know which executables are available? Can we find all components by package name and version in the installed package index?
We add package goals as we discover new dependencies while building the tree, as before.

Index conversion

This phase adds flags to the dependency tree for each package, in order to implement the behavior of the flag-like goals described above.
This isn't a big change in design. The solver already transforms the dependency tree in order to ignore dependencies of unbuildable components (#3039).
Exe/lib component transformation

Say x represents the contents of an executable stanza (The same applies to libraries). x would be transformed into:

executable my-exe
  if flag(my-exe-component-on)
    x, with "Buildable: False" fields converted to failures
  else
    if flag(old-setup-script-features)
      x
    else
      x, transformed as in #3039 (in order to ignore dependencies when
      the component is unbuildable) and all dependencies converted to
      qualified constraints

Test/benchmark component transformation:

x represents the contents of a test suite stanza. The transformation is like the lib/exe transformation, except that it also uses the tests-enabled flag.

test-suite my-test
  if flag(my-test-component-on)
    x, with "Buildable: False" fields converted to failures
  else
    if flag(tests-enabled)
      if flag(old-setup-script-features)
        x
      else
        x, transformed as in #3039 (in order to ignore dependencies when
        the component is unbuildable) and all dependencies converted to
        qualified constraints
    else
      empty test-suite stanza

Dependency graph

Nodes are components, not packages.
We can choose whether to check for cycles between packages or components.
TODO: Do we want to prevent cycles between components but allow cycles between packages? It would fix #1575, but it may be a worse fix than #3422. (It wouldn't allow a test framework to depend on an installed version of the package under test.)

Custom setup validation steps

We add two search tree transformations to ensure that the value of the "setup script features" variable is consistent with the actual features of the setup script. We could combine these two transformations if it turns out to be simpler.
One transformation disables the "New" "setup script features" choice for each package that has Custom build type and a setup Cabal version that is less than 1.24.
The other transformation enforces Cabal >= 1.24 for the setup script for each package that chose "New" for the "setup script features" variable.
(There are two transformations because the solver can assign the variables in either order.)

Component dependency validation steps

We add two more tree transformations. These could be combined with the existing validation phase transformations.
Note that we need to perform two checks for each dependency on a component: The package instance must contain the right component, and the component must be enabled.
First transformation:
- Create a set of required components while descending the search tree, by looking at build-depends, tool-depends, and build targets.
- For each package instance choice, prune all instances that don't contain all of the components that are required so far.
- For each component choice, if the component is required, prune the "Off" choice.
Second transformation:
- Create a set of visited packages and a set of available components while descending the search tree:
- For each package instance choice, add the package to the package set and add all components provided by that instance to the component set.
- For each component choice that turns off a component, remove that component from the component set.
- Every time a dependency is introduced by a build-depends or tool-depends, check that if the package was visited then the component is available. If the component is not available, prune the choice that introduced the dependency.

Enabling tests/benchmarks

--enable-tests/benchmarks doesn't need to change much. The current stanza variables have the same behavior as the flag-like variables I described above.

Internal dependencies

The solver also tracks dependencies between a single package's components. A dependency between components is handled as a dependency on an exact package version, so that both components come from the same package instance.

More questions

What if a user specifies an unbuildable component as a build target? The solver could try to build the component by flipping flags, but then the install plan would depend on the list of targets. This problem is very similar to #3923, except that it deals with enabling components instead of tests.
What if a user specifies a test from a Custom package that uses Cabal < 1.24? Should we just translate it to --enable-tests? This problem also relates to
3923.

/cc @kosmikus @ezyang @Ericson2314 @dcoutts @edsko @hvr @23Skidoo

ezyang commented 7 years ago

Thanks, this looks great.

How do we know when a setup script supports per-component builds? Do we only need the setup Cabal version and build type?

Per-component occurs if and only if:

We are new-build
build-type is not Custom
cabal-version >= 1.8
--disable-per-component is not specified

The canonical source of truth is why_not_per_component in ProjectPlanning. Probably best to just cross-reference two sites for now.

Constraints from one component should still apply to others. Otherwise, we might break packages that only constrain dependencies for one component (usually the library).

But note that to fix things like #3732 we'll have to relax how we do the constraints here. I guess this is covered by what you say later: "If we ever need to build components with conflicting dependencies, we can use qualified goals."

How do we handle component goals for installed packages? Do we know which executables are available? Can we find all components by package name and version in the installed package index?

At the moment, we won't know anything about executables, but we don't have to, because new-build will never have those as "installed" packages.

Internal libraries pose a special problem, because they show up in the installed package index (I recently committed a hack to handle this case). With our current featureset, we are allowed to assume that any internal library in the installed package index can only be reached by a corresponding "public" library from the package (so the "correct" way to handle internal libraries is to bundle them all up together with the public library into one 'package node'). The hacked code doesn't actually do this but that's what should be done morally.

What if a user specifies an unbuildable component as a build target? The solver could try to build the component by flipping flags, but then the install plan would depend on the list of targets.

Perhaps there should be a distinction made between "build targets" (what am I going to build this round) and "solver targets" (please work hard to dep solve so that X is buildable.) The latter is stable and shouldn't change even if you decide to build different things. This should solve the next bullet point too.

grayjay commented 7 years ago

Thanks for the feedback!

Per-component occurs if and only if:

We are new-build build-type is not Custom cabal-version >= 1.8 --disable-per-component is not specified

I see that I had this part backwards. These rules simplify things, because the solver doesn't need to choose whether to use per-component build for each package; there is only one choice. In that case, I think that we don't need to add the "setup script features" variables, for now.

I had also incorrectly assumed that every package that ignores dependencies of unbuildable components also supports per-component build. Since the two attributes are not linked, I think we should handle #3881 separately.

But note that to fix things like #3732 we'll have to relax how we do the constraints here. I guess this is > covered by what you say later: "If we ever need to build components with conflicting dependencies, we can use qualified goals."

I'm not sure that we should make cabal find a solution for the example at the top of #3732, which only constrains base in one component. It looks the same as a package where the author meant for the constraint on base to apply to all components but wanted to avoid duplication. I haven't searched Hackage to see how common that pattern is, though.

Per-component solving would still allow more flexibility in less extreme cases, even if it always applied all build-depends constraints. Take this example:

executable exe1
  build-depends: x > 2

executable exe2
  build-depends: y < 3

Say that x > 2 and y < 3 have conflicting constraints on z. Then per-component solving would allow exe1 and exe2 to be installed separately while the current solver wouldn't find a solution.

Qualified goals would only help by allowing components that were independently solvable to be solvable together with different qualifiers (possibly using different versions of the package containing the components).

At the moment, we won't know anything about executables, but we don't have to, because new-build will never have those as "installed" packages.

Okay, so cabal will behave as if those package instances provided libraries but not executables.

Internal libraries pose a special problem, because they show up in the installed package index (I recently committed a hack to handle this case). With our current featureset, we are allowed to assume that any internal library in the installed package index can only be reached by a corresponding "public" library from the package (so the "correct" way to handle internal libraries is to bundle them all up together with the public library into one 'package node'). The hacked code doesn't actually do this but that's what should be done morally.

I'm not sure I understand this part. Can we always reach all of the internal libraries if we have the main library from the installed package index? I think that is all the solver needs to do.

Perhaps there should be a distinction made between "build targets" (what am I going to build this round) and "solver targets" (please work hard to dep solve so that X is buildable.) The latter is stable and shouldn't change even if you decide to build different things. This should solve the next bullet point too.

That distinction makes sense. I still don't know how we should choose those two sets of targets, though.

For "solver targets", maybe we should just give the solver a list of packages to solve for and apply a preference to build all components within each of those packages. Is there any situation where we would want to give the solver a subset of a package's components as targets?

hvr commented 7 years ago

re

I'm not sure that we should make cabal find a solution for the example at the top of #3732, which only constrains base in one component. It looks the same as a package where the author meant for the constraint on base to apply to all components but wanted to avoid duplication.

IMO a better way to avoid duplication is to make it more explicit that multiple components ought to share common dependencies/constraints via the likes of #2832, and then we could unlock/enable per-component solving also in the aforementioned case when a high enough cabal-version: value is set (c.f. to how the semantics for build-depends & multiple components changed between cabal-version 1.6 and 1.8)

ezyang commented 7 years ago

I had also incorrectly assumed that every package that ignores dependencies of unbuildable components also supports per-component build.

Yeah, but I think the converse is true (per-component = correct handling of non-buildable components.)

I'm not sure I understand this part. Can we always reach all of the internal libraries if we have the main library from the installed package index? I think that is all the solver needs to do.

Not necessarily. For example, we may have registered an internal library which was only linked by the test suite, but not by the actual public library of a package. @dcoutts would probably say that this just means we shouldn't have registered it but it seems better to me to not assume any relationship without tracing the dependency relations.

For "solver targets", maybe we should just give the solver a list of packages to solve for and apply a preference to build all components within each of those packages. Is there any situation where we would want to give the solver a subset of a package's components as targets?

Yeah I'm not sure what the default should be. What we have right now is best effort which seems to work decently well, so maybe every component would have a designation like, "never", "best-effort", "always", with everything defaulting to best-effort.

grayjay commented 6 years ago

After #4884 and #5304, the solver stores the two components that are involved in each dependency. The dependent component is a D.Solver.Types.ComponentDeps.Component, and the depended upon component is a library or executable. The two PRs also implement the part of "Component dependency validation steps" from https://github.com/haskell/cabal/issues/4087#issuecomment-287665861 that checks for the existence of components, though the solver doesn't track whether components are enabled yet.

phadej commented 5 years ago

Lack of this feature required to split of test-suites and benchmarks from containers. And probably will need to do the same exercise for time. Also I cannot use QuickCheck in the tests of splitmix anymore, as QuickCheck depends on splitmix. This list grows as v2-build gets adapted.

IMO, not having this in 3.0 makes v2-build unconvinient for packages deep down in the dependency tree.

Is there anything one can do to make this happen sooner, than later.

Ericson2314 commented 5 years ago

I would be very happy to teach Cabal cross immediately if only this were done, too.

grayjay commented 5 years ago

I'm less familiar with the change that would be required to allow cycles between packages, but I think it could be implemented separately from the other parts of this feature, like the changes to flag variables. Unless I'm missing something, it would mainly require changes to cycle detection and the conversion from the solver's dependency graph to the one used outside of the solver:

Change the solver's dependency graph format from a graph with packages as nodes and components as edges to a graph with components as nodes: https://github.com/haskell/cabal/blob/9c31d9c760468719c881d0f137c66e62b02a1164/cabal-install/Distribution/Solver/Modular/Dependency.hs#L256
Add edges between components in the Build phase. The components on both sides of the dependency are available in the FlaggedDep, and setup dependencies are already handled as a special case: https://github.com/haskell/cabal/blob/9c31d9c760468719c881d0f137c66e62b02a1164/cabal-install/Distribution/Solver/Modular/Builder.hs#L76
Change the cycle detection function to check for cycles involving any component in the current package: https://github.com/haskell/cabal/blob/9c31d9c760468719c881d0f137c66e62b02a1164/cabal-install/Distribution/Solver/Modular/Cycles.hs#L52 Should we only relax cycle detection for tests and benchmarks for now?
Keep the dependencies between components when converting from the solver's dependency graph format: https://github.com/haskell/cabal/blob/9c31d9c760468719c881d0f137c66e62b02a1164/cabal-install/Distribution/Solver/Modular.hs#L77-L78

I'm not sure how much code outside of the solver will also need to be changed to maintain the dependencies between components and ensure that components are built in the correct order. I'm also not sure if there will be other issues with having cycles between packages in the solver, such as infinite loops.

I don't think I'll have time to work on this feature in the near future, though I could try to answer questions or review code if anyone else wants to implement it.

Ericson2314 commented 5 years ago

@grayjay

Should we only relax cycle detection for tests and benchmarks for now?

I think the idea is that the cycles really and actually go away once we track on the level of components. No special-cases are needed.

I'm not sure what @phadej was thinking but I believe we need private dependencies for changes containers to not cause a rebuilding of the testing libraries using containers. Solving that without private dependencies would, however, be a special case for benchmarks and test suites.

grayjay commented 5 years ago

Should we only relax cycle detection for tests and benchmarks for now?

I think the idea is that the cycles really and actually go away once we track on the level of components. No special-cases are needed.

I meant that I was wondering whether there is a use case for other types of cycles now and whether we should disallow them in order to simplify install plans. We can always remove the restriction later, but it would be very hard to move in the opposite direction.

I'm not sure what @phadej was thinking but I believe we need private dependencies for changes containers to not cause a rebuilding of the testing libraries using containers. Solving that without private dependencies would, however, be a special case for benchmarks and test suites.

I'm not sure I understand. I think that private dependencies is a separate feature that doesn't require per-component dependency solving. The approach taken in #3422 is like a special case of private dependencies. I think that per-component solving would only help solve the test suite cycle issue if we used it to change the solver's cycle detection to use components instead of packages. It would cause rebuilds of the test framework, though.

Ericson2314 commented 5 years ago

I meant that I was wondering whether there is a use case for other types of cycles now and whether we should disallow them in order to simplify install plans. We can always remove the restriction later, but it would be very hard to move in the opposite direction.

I agree let's not overcommit to allowing things we will want to band later. I'm not fan of "Postel's law"'s over-application with these matters :).

The approach taken in #3422 is like a special case of private dependencies.

OK glad we agree on that.

I'm not sure I understand. I think that private dependencies is a separate feature that doesn't require per-component dependency solving.

Right if #3422 is a special case of cycles and private dependencies, I rather grow it into "private dependencies in general" rather than "cycles in general".

I think that per-component solving would only help solve the test suite cycle issue if we used it to change the solver's cycle detection to use components instead of packages. It would cause rebuilds of the test framework, though.

Totally agreed. @phadej's changes to containers (make a package under a different name in other cabal project) both avoid the cycles and avoid the extra test rebuilds. I wanted to point out that per-component therefore is not enough on its own to obviate the renamed package---need the private dependencies part to avoid the rebuilds too or there's a usability regression.

isovector commented 5 years ago

I just ran into this while trying to make polysemy depend on polysemy-plugin, but polysemy-plugin:test depend on polysemy. It's a frustrating limitation, and one that totally breaks the abstraction that these things form a graph. How much work is left to be done here? Could you use an engineering hand?

grayjay commented 5 years ago

@isovector None of the changes to cycle detection have been implemented yet, so we could definitely use a hand! I gave a summary of what I think needs to be done to only update the cycle detection in https://github.com/haskell/cabal/issues/4087#issuecomment-497244606. I'm not sure if it would require more changes outside of the solver, though.

haskell / cabal

Per-component dependency solving #4087

779, #3978, https://github.com/haskell/cabal/issues/3492#issuecomment-225391004 - solver doesn't reject configurations that require unbuildable/non-existent components

1575 - cabal can't handle cycles between packages that are not cycles between components

2725, #5413 - cabal requires dependencies of components that aren't being built

3662 - more per-component support

3263 - fine-grained dependencies

Summary

Tree structure

Variables

Index conversion

Exe/lib component transformation

Test/benchmark component transformation:

Dependency graph

Custom setup validation steps

Component dependency validation steps

Enabling tests/benchmarks

Internal dependencies

More questions

3923.