cps-org / cps

Common Package Specification — A cross-tool mechanism for locating software dependencies
https://cps-org.github.io/cps/

`component::requires` and `package::requires` differences #38

Closed dcbaker closed 4 months ago

dcbaker commented 4 months ago

Currently component::requires is speced as string[], but package::requires is a mapping with a hint, version, and components.

This means that at the package level we can place version requirements (which seem extremely important), but at the component level we cannot. It also means that at the component level we can rely on the default_components of an external package, including appending to them by listing two entries (['foo', 'foo:bar']), but at the package level we must specify exactly which components are required.
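To make the difference in shape concrete, here is a minimal Python sketch of the two forms, using plain dicts and lists in place of a parsed CPS file. The key spellings follow the JSON example later in this thread; the helper `split_component_ref` and the component name "bar" are illustrative assumptions, not part of the specification.

```python
# Package-level requirement: a mapping keyed by package name, so each
# entry has room for a version and a list of required components.
package_requires = {
    "foo": {"Version": "1.3", "Components": ["bar"]},
}

# Component-level requirement: a flat list of strings, with no place
# to attach a version. A bare "foo" stands for foo's default components.
component_requires = ["foo", "foo:bar"]


def split_component_ref(ref):
    """Split 'pkg:comp' into (pkg, comp); a bare 'pkg' gives (pkg, None).

    Hypothetical helper for illustration only.
    """
    pkg, sep, comp = ref.partition(":")
    return pkg, (comp if sep else None)


parsed = [split_component_ref(ref) for ref in component_requires]
```

Here `parsed` holds one `(package, component)` pair per entry, with `None` marking the "default components" case.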

mwoehlke commented 4 months ago

...yes? I don't see the problem here.

It is by design that you can't use package Foo's component A from e.g. Foo 1.2 and Foo's component B from e.g. Foo 2.7¹. You can use one and only one Foo, so naturally you specify package-level requirements (e.g. version) with the package. Similarly, yes, when you specify required components of a package, that is meant to be "complete"... although it's understood that at the package level you only need to specify components that a package only provides optionally. (That is, for many packages, you won't specify components as part of the package requirement at all.)

To be clear, the components attribute of a package requirement is not an exhaustive list of what components you will ultimately consume. Rather, it is similar to version in that it tells the package search "if you find a candidate, but it doesn't meet these requirements, keep trying". Note also that it's intended for "components" in this context to not necessarily correspond to actual components that are consumed; they can also refer to optional features of the package. For example, I might want to specify that I need libcurl with SSL support, even though I never directly link to anything SSL-related (n.b. "symbolic" components).
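As a rough illustration of "keep trying", the following Python sketch treats the components list purely as a filter over candidates. Everything here is an assumption made for the example: the function names, the dict layout standing in for parsed .cps files, the "ssl" component name, and the version numbers.

```python
def provides_required_components(candidate, required):
    """True if the candidate package offers every required component."""
    return set(required) <= set(candidate.get("Components", {}))


def find_package(candidates, required_components):
    """Return the first acceptable candidate, or None if none qualifies."""
    for candidate in candidates:
        if provides_required_components(candidate, required_components):
            return candidate
        # Candidate rejected: keep trying the next one.
    return None


# Two hypothetical builds of the same package; only the second has "ssl".
candidates = [
    {"Name": "libcurl", "Version": "8.0", "Components": {"curl": {}}},
    {"Name": "libcurl", "Version": "8.0",
     "Components": {"curl": {}, "ssl": {}}},
]

chosen = find_package(candidates, ["ssl"])
```

The consumer never has to link against "ssl" for this to be useful; the requirement only steers which candidate is selected.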

(¹ Strictly speaking, you can if you use e.g. Foo2 as the canonical name of the "Foo 2.x" package. In fact there is quite some precedent for this sort of thing. But in general, the whole point of packages is that they consist of associated components that cannot be mixed with different versions/builds of the same package.)

dcbaker commented 4 months ago

> To be clear, the components attribute of a package requirement is not an exhaustive list of what components you will ultimately consume. Rather, it is similar to version in that it tells the package search "if you find a candidate, but it doesn't meet these requirements, keep trying". Note also that it's intended for "components" in this context to not necessarily correspond to actual components that are consumed; they can also refer to optional features of the package. For example, I might want to specify that I need libcurl with SSL support, even though I never directly link to anything SSL-related (n.b. "symbolic" components).

This completely unclarifies what's going on. The description in the schema leads me to believe the following:

package::requires == "requirements which apply to all components in this CPS file"
component::requires == "requirements which apply only to this component"

What I'm reading in what you're saying is: package::requires == "restrictions which will later be placed on components"

Am I understanding what you're saying correctly?

mwoehlke commented 4 months ago

> package::requires == "requirements which apply to all components in this CPS file"

Well, that... doesn't sound right? A package-level requires says that "some parts of this package depend on this other package, with these additional stipulations". (There are even ways — using multiple .cps files, specifically — to allow a consumer to use parts of a package that don't have such requirement without needing to drag in the dependencies of those parts that do. "Normally" though you wouldn't bother with that granularity.)

It might help to think of this in passes. The first pass resolves all package-level dependencies. (In CMake parlance, this is "configure time" and directly relates to find_package.) This is your only chance to specify what incarnations of e.g. Bar are acceptable to Foo. (As a design, and sanity, decision, packages are not allowed to use multiple versions of a dependency.) If the tool can't find a suitable Bar, it can't use that Foo (but is welcome to keep looking for a different, usable Foo).

On completion of that stage, you've decided what (e.g.) Foo and Bar to use and you have a "library" of all components that the consuming project (and its dependencies) might use. Thus, when (in the second stage) the consumer tries to use e.g. foo:foo which has a component-requirement on bar:bar, you either already have that component or your build is broken. There is no reason at this point, however, to specify requirements on which Bar is acceptable; that happened in the first stage. The second stage is solely responsible for integrity checking ("do all requested components exist?") and translating components into build flags/commands. (In CMake parlance, this is "generate time".)

It's also not necessary for the first pass to exhaustively know all components, as the (optional!) component specification in the first pass just means (e.g.) "don't accept any Foo that doesn't provide foo:toast". Most users won't use this field. The consumer will still get whatever components the version of Foo that is found happens to provide. In most cases, consumers will only use default components, or components from a relatively stable set that all versions (subject to version requirements, anyway) of Foo are expected to provide. The intent for package-level component requirements is to model packages that are often distributed with optional functionality: curl with SSL, Qt with Widgets, etc.
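The two passes described above can be sketched in Python roughly as follows. This is a simplified model, not how any particular tool does it. The assumptions: packages are plain dicts, exactly one candidate exists per package name (so no searching or backtracking), versions are dot-separated integers, a requirement's "Version" is treated as a minimum, and a bare "pkg" reference (default components) is not handled. All function names are made up for the example.

```python
def parse_version(text):
    """Turn '1.5' into (1, 5) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def first_pass(root, available):
    """Stage 1: settle on one package per name, collect its components."""
    library, seen, pending = {}, set(), [root]
    while pending:
        pkg = pending.pop()
        if pkg["Name"] in seen:
            continue
        seen.add(pkg["Name"])
        for comp_name, comp in pkg.get("Components", {}).items():
            library[(pkg["Name"], comp_name)] = comp
        for dep_name, req in pkg.get("Requires", {}).items():
            dep = available.get(dep_name)
            if dep is None:
                raise LookupError(f"no package named {dep_name!r}")
            wanted = parse_version(req.get("Version", "0"))
            if parse_version(dep["Version"]) < wanted:
                raise LookupError(f"{dep_name!r} is too old")
            pending.append(dep)
    return library


def second_pass(start, library):
    """Stage 2: follow component requirements; only existence is checked."""
    used, stack = [], [start]
    while stack:
        ref = stack.pop()
        if ref in used:
            continue
        if ref not in library:
            raise LookupError(f"broken build graph: {ref} not found")
        used.append(ref)
        for req in library[ref].get("Requires", []):
            pkg, _, comp = req.partition(":")
            # An empty package part (":comp") is taken to mean "same package".
            stack.append((pkg or ref[0], comp))
    return used


bar = {"Name": "bar", "Version": "2.0", "Components": {"bar": {}}}
foo = {
    "Name": "foo",
    "Version": "1.0",
    "Requires": {"bar": {"Version": "1.5"}},
    "Components": {"foo": {"Requires": ["bar:bar"]}},
}

library = first_pass(foo, {"bar": bar})
used = second_pass(("foo", "foo"), library)
```

Note that `second_pass` never looks at versions: by the time it runs, the choice of Bar has already been made, and it only walks components reachable from what the consumer actually uses.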

Note also that nothing stops you from writing a component-level requirement for an optional component. It's just bad practice / bad QoL for your users, because you'll have the build tool telling you your build graph is broken rather than telling you it couldn't satisfy the dependency requirements. (Or, worse, rather than having a working build because it was able to keep looking.) OTOH, since transitive-dependency lists are almost always going to be machine-generated in practice, it may be that they are always present and exhaustive. (But that's a matter for the producer, not the consumer.)


To attempt to simplify: a component-requirement means "when consumed, also consume this". A package-requirement means "in order to use X, you must also have Y available, which must meet additional requirements A and B". Any component-requirement of X on Y must be satisfied by the same Y that the package-requirement supplied.

dcbaker commented 4 months ago

Okay, I think I understand how this is supposed to work. Let me give an example and work through how I understand this is supposed to work:

{
  "Name": "root",
  "Requires": {
    "foo": { "Version": "1.3", "Components": ["1", "2"] }
  },
  "Components": {
    "main": { "Requires": ["foo:1"] },
    "other": { "Requires": ["foo:2"] }
  },
  "Default-Components": ["main"]
}

In this case, when building the DAG, the root package is loaded and its Requires section is parsed. The tool finds a CPS file fulfilling "foo", which must provide version "1.3" and components "1" and "2". Then the "main" component of "root" is added to the graph (assuming "other" is not used), and we take the valid "foo" and depend on its component "1".

Does that sound roughly right?

mwoehlke commented 4 months ago

> Does that sound roughly right?

Roughly. I think some of the confusion may be because you're talking about a specific implementation that is different than how CMake does things. That's arguably a good thing! But it also means we're getting into details that the specification doesn't specify. That is, the specification says that certain conditions must result in certain outcomes, i.e. "from A, you must arrive at B", but we're at the level of asking how we get from A to B.

I would say, at a slightly higher level, someone asked you to find "root" (perhaps with version and/or component requirements). Your tool found the root.cps that you show. In order to know if it should use that root.cps, the tool must first also process any package-level requirements, so now the tool tries to find "foo" (fulfilling ...). If it succeeds, it can add both "root" and "foo" to the set of known components. (In CMake, "known components" aren't the build graph, and not all targets will necessarily appear in the build graph. Additionally, I would encourage not trying to resolve component-level requirements yet, because a broken component that's never used by the build can be ignored. However, I'm not saying immediate resolution is wrong. This would be a case of "you have an error, but because it doesn't directly impact what you want to do, I can look the other way".)

One important point is that, if (a suitable) "foo" isn't found, the tool must back out of whatever parsing it had done for "root" and keep looking for "root".

Consider a more complicated example where "root" depends on "foo" and "bar". I think you've already grasped one key point, which is that you don't try to add "root"'s components to the DAG until after you've found "foo". Where things might get tricky, though, is if you've found "foo", but then are unable to find "bar". The intent at this point is that you forget about the "foo" you found; you can't load this "root", but another might point you at a different "foo", and you might be able to load that one if you can use that "foo" and not the one you found earlier. Thus, you either need to process the full tree of package requirements first without actually processing any components until you're done, or you need to be able to roll back the DAG if chasing the requirement tree falls apart.
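One way to get the "forget about the foo you found" behavior is to never mutate shared state while chasing requirements. The Python sketch below resolves into a tentative copy and only returns it if the whole requirement tree succeeded. It is an illustration under assumptions, not a prescribed algorithm: candidates are plain dicts, `satisfies` only checks components, and the `_label` key is a made-up marker for telling candidates apart, not a CPS attribute.

```python
def satisfies(candidate, requirement):
    """Hypothetical check: the candidate offers all required components."""
    required = requirement.get("Components", [])
    return set(required) <= set(candidate.get("Components", {}))


def resolve(name, requirement, candidates, chosen):
    """Return a copy of `chosen` extended with `name`, or None on failure."""
    if name in chosen:
        return chosen if satisfies(chosen[name], requirement) else None
    for candidate in candidates.get(name, []):
        if not satisfies(candidate, requirement):
            continue
        # Work on a copy so a failure below leaves `chosen` untouched.
        tentative = {**chosen, name: candidate}
        for dep_name, dep_req in candidate.get("Requires", {}).items():
            tentative = resolve(dep_name, dep_req, candidates, tentative)
            if tentative is None:
                break  # drop everything found for this candidate
        else:
            return tentative
    return None


candidates = {
    "root": [
        {"Name": "root", "_label": "root-a",
         "Requires": {"foo": {"Components": ["1"]}, "bar": {}}},
        {"Name": "root", "_label": "root-b",
         "Requires": {"foo": {"Components": ["2"]}}},
    ],
    "foo": [
        {"Name": "foo", "_label": "foo-a", "Components": {"1": {}}},
        {"Name": "foo", "_label": "foo-b", "Components": {"2": {}}},
    ],
    # No "bar" is available, so "root-a" can never be satisfied.
}

result = resolve("root", {}, candidates, {})
```

With this data, the first "root" finds an acceptable "foo" but then fails on "bar", so both are discarded; the second "root" is then tried from a clean slate and ends up paired with a different "foo". Components are not touched at all until `resolve` has returned, which corresponds to the "process the full tree first" option described above.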

You are correct that, by the time you're resolving the component-level requirements of "root" (or e.g. "root:main" more specifically), the components of "foo" are already present wherever you're keeping components.

dcbaker commented 4 months ago

Okay. I think we're on the same page then.

mwoehlke commented 4 months ago

Great! Are any changes needed, then, or can this issue be closed?

dcbaker commented 4 months ago

I’ll go ahead and close it. I’m working on this for cps-config atm, so let’s see if I really got it right once those patches go out :)