cps-org / cps

Common Package Specification — A cross-tool mechanism for locating software dependencies
https://cps-org.github.io/cps/
Other
91 stars 8 forks source link

Supporting multiple `Location`s for a single configuration #29

Closed brauliovaldivielso closed 3 months ago

brauliovaldivielso commented 6 months ago

I have been looking into transforming pkg-config files into CPS files. One mismatch I've found is that the Libs field in a .pc file may contain several different libs, whereas the Location field in a CPS component may point to just one path.

You can find one example of such a .pc file in https://github.com/apache/activemq-cpp/blob/master/activemq-cpp/activemq-cpp.pc.in#L27 , where (besides pthread and libdl), the pc file is linking both the activemq library and libuuid. You could perhaps argue that activemq should use their Requires to specify a dependency on libuuid but it doesn't.

I can think of other cases where one could want to have several libraries in your .pc Libs without creating separate .pc files and using Requires to define the dependencies between them.

Would it be possible to make Location an array? Or perhaps add a Extra-Locations field? I know IMPORTED_LOCATION in CMake (which corresponds very closely to Location in CPS) also only supports one path, but maybe a similar new property can be added to CMake... Otherwise, what I had to do to model these multilibrary .pc files as CPS files was to pass the paths to the extra locations as Link-Flags, but that attribute is discouraged (and reasonably so).

brauliovaldivielso commented 6 months ago

Another open-source library I can find that puts multiple libraries in their Libs is icu. I see the following in the /usr/lib/x86_64-linux-gnu/pkgconfig/icu-uc.pc in https://packages.ubuntu.com/focal/amd64/libicu-dev/filelist

Libs: -L${libdir} -licuuc -licudata
autoantwort commented 6 months ago

I can think of other cases where one could want to have several libraries in your .pc Libs without creating separate .pc files and using Requires to define the dependencies between them.

You don't have to create separate cps files, you can have just one file with multiple components (each component represents 0..1 artifacts). Is there an downside of using two components for the two libraries?

brauliovaldivielso commented 6 months ago

Yeah I guess you're right. I don't have any hard argument for why multiple locations are needed, but I have the following soft argument:

You could consider a component in a CPS file as part of the "public interface" of the CPS file. In that sense, the maintainer of a library might not want to create a separate component (that consumers of the CPS can name) to represent an archive that they want their main component to link against. For instance, maybe the maintainers of icu-uc don't want to create a separate icudata component, because that would let their users specify icu:icudata in their Requires, whereas the maintainers may just want icudata to be an implementation detail of icu:icu-uc. (I barely know anything about the icu library so this probably doesn't reflect the maintainer's actual intent, it's just an example).

I guess I'm making this request because it'd make my life a bit easier when transforming .pc files into CPS files. I can workaround the issue by creating the extra components that you mention. It will involve more work on my side but may be worth it anyway.

What are the main advantages of having a single Location in your view? Because if there are none, making the spec a bit more flexible (supporting multiple Locations) might make adoption easier for people in situations that are similar to mine.

bretbrownjr commented 6 months ago

I'm flexible on how to sort out the use cases here. I think specific examples like icu are helpful, so let's make sure we call those out when we see them.

Another use case I've seen is for multiple libraries in a Libs field in *.pc files is to force links on system libraries (including special *.o files) that don't have pkg-config metadata for whatever reason. I expect use cases for "what about my dependencies that don't have cps data yet?" will be required to be supported.

Is that messy and possibly error prone? Yep. We should develop some docs and, if we can, tools around what works, what doesn't, and what's a hack to buy time for packaging cleanup (like providing CPS data for system-installed zlib or whatever).

Anyway, I think the "visibility" of a CPS entity could be interesting to explore. I could see there being use cases for "private" and "package protected" (in the Java sense) entities. I don't want to overcomplicate design and educational materials now if we can avoid it though. My instinct would be to keep it as simple as possible for now and add complexity in the face of compelling adoption hurdles that can't be otherwise worked around. Possibly the icu use case would qualify, as that's an important library in practice.

autoantwort commented 6 months ago

Anyway, I think the "visibility" of a CPS entity could be interesting to explore.

+1. I also wanted to propose something similar. You should be able to mark components as internal so that they are not directly consumable. This could be helpful for common settings or libs that only contains data. Afaik this is also done often in cmake with targets.

mwoehlke commented 5 months ago

What are the main advantages of having a single Location in your view?

...it doesn't require redesigning CMake? :slightly_smiling_face: Note that this largely stems from components being equivalent to targets in the build system, so I'm not convinced it's a non-sensible distinction.

What's the disadvantage to having one component per library? Would allowing "private" components mitigate the concern of users linking to something not intended to be visible to consumers? Offhand, component visibility seems like a reasonable addition.

You could perhaps argue that activemq should use their Requires to specify a dependency on libuuid but it doesn't.

There's no "perhaps" about it; libuuid is an external dependency and should be expressed as such.

mwoehlke commented 5 months ago

I'm going to tag this "need feedback". If we agree that component visibility is the right / satisfactory fix, we should either "rebrand" this issue or close it in favor of a new one (I'd lean toward the latter). Otherwise, let's keep discussing and figure out what we do want :slightly_smiling_face:.

brauliovaldivielso commented 5 months ago

...it doesn't require redesigning CMake? 🙂 Note that this largely stems from components being equivalent to targets in the build system, so I'm not convinced it's a non-sensible distinction.

Are you saying that this would require a redesign of CMake because IMPORTED_LOCATION only supports one path?

You of course know more CMake than I ever will, but changing CMake so that IMPORTED_LOCATION can support a list of paths doesn't superficially sound like it would require a redesign. But maybe there's something I'm missing so do correct me if I'm wrong.

What's the disadvantage to having one component per library?

(Note that I'm not coming to this problem as a library author that wants to have more than one location for my library. I agree with you that ideally libraries are architected such that they only need one Location. I'm coming at this problem from the perspective of someone who wants to mass-transform existing .pc files to CPS files, and noting the mismatches I'm finding to hopefully make future adoption simpler.).

It feels a bit boilerplatey to add a new component just for adding another one archive to your link line. Supporting multiple locations felt to me like the lean way to cover this use-case. Conan's package representation also supports multiple "locations" (the libs field). I could look into what packages in conancenter use more than one such lib, as a way to gauge how common this need will be.

Would you consider supporting multiple locations if it was widespread for packages in conancenter to specify multiple libs?

If I have a component with 3 libs : ["libA.a", "libB.a", "libC.a"], and I want to create a single-location CPS, I'll have to create two "anonymous" (or "private") components and have them depend "sequentially" on each other to ensure the right link line

{ 
  "A": { "Location": "libA.a", "Requires": ["B-private"] },
  "B-private": { "Location": "libB.a", "Requires": ["C-private"] },
  "C-private": {"Location": "libC.a" }
}

Component visibility fees like a more complicated addition (conceptually) to the specification. I acknoweldge it would solve my problem (albeit in a more verbose way). Of course, not opposed to it if it solves other problems that library authors/packagers want to solve wrt how to structure their packages.

bretbrownjr commented 5 months ago

I'm currently leaning towards keeping the design simpler for now and expecting a unique location per library. It is possible to model libraries with no source interfaces (headers, modules, etc.) and with link requirements.

One reason for this is that I have seen projects adding, for instance, -lfoobar to their pkg-config metadata for adding a "private" dependency. However, that complicates the model for the build graph. Build systems and other consumers of this metadata now have a missing edge in the build graph, so they can't accurately derive link lines, detect circular header search paths, etc.

If foobar were instead a declared dependency modeled as a library with no source code interface, build systems could incorporate that properly in the build graph, including tracking timestamps and/or checksums of these files as needed.

If it helps, I can draw some diagrams somehow. Someone let me know if that would be helpful.

If we take the approach I'm suggesting, we should add some docs about this reasoning and some examples of what is preferred.

I'm willing to concede that there may be a major adoption hurdle resulting from a simpler model. If that happens or we feel like that's already the case, let's be clear about that. Adoptability is important to me. Some things can be messy but emit warnings or something.

brauliovaldivielso commented 5 months ago

I generally agree with keeping the specification simple, and supporting single Locations is simpler than not. I'm happy leaving this issue parked, and only acting on it if/when other people from the community speak up. If people don't speak up, then it must be that adoption is not that much of a hurdle, and so keeping the "purity" of single-Location CPS files will be worth it. But we'll see - kind of suspicious that conan would support multiple libs (their analogous of Locations) if it's not needed.

If foobar were instead a declared dependency modeled as a library with no source code interface, build systems could incorporate that properly in the build graph, including tracking timestamps and/or checksums of these files as needed.

In a world where multiple Locations are supported, build systems would be able to model the build graph, track timestamps and/or checksums just fine? I agree that it could be trickier when people smuggle -lfoobar flags in their pkg-config for declaring private dependencies, but I don't see the connection between that and multiple CPS locations.

mwoehlke commented 3 months ago

In the most recent Ecosystem Evolution meeting, we decided we want component visibility, regardless. We should open an issue to that effect and revisit whether that is an acceptable resolution to this.

mwoehlke commented 3 months ago

Based on the most recent comments, I believe this can be closed (at least for now). We retain #53 as a relevant action-item; if that isn't sufficient, we can revisit.