Closed: ezyang closed this issue 5 years ago.
I think that this is a good idea, and amazonka presents a convincing use case. I agree that we shouldn't postpone 2.0 for this. Can you please list the differences between this proposal and #2176?
Another noteworthy example besides amazonka is https://github.com/brendanhay/gogol
$ cabal list --simple-output | grep gogol | grep -F 0.1.1 | wc -l
105
My opinion on this hasn't changed since last time. I'm still opposed to this change. This will massively impact tooling across the board for minimal gains. I've seen no reason why the situation is different now.
@23Skidoo I updated the bottom of the ticket with the two major changes from the previous proposal.
@snoyberg Let me take some of your comments from the previous time around and say how things have changed with this version of the proposal.
GHC won't accept the separator without tweaking it, for instance
Our compatibility with new and old GHCs works due to the existing implementation of convenience libraries. The basic premise is that the library foo from package bar is munged into the package name z-bar-z-foo in the package database. From GHC's perspective, we simply identify a list of IPIDs whose modules should be brought into scope; beyond that, GHC doesn't care whether an entry in the package database is an actual package or an internal library.
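A minimal sketch of the munging scheme described above (this is an illustration of the idea, not Cabal's actual implementation; the real z-encoding escapes more characters):

```haskell
-- Munge an internal library name into a package-database entry name,
-- as described above: "z-" ++ package ++ "-z-" ++ library, with any
-- literal 'z' in either name doubled before joining.
mungeLibName :: String -> String -> String
mungeLibName pkg lib = "z-" ++ escape pkg ++ "-z-" ++ escape lib
  where
    escape = concatMap (\c -> if c == 'z' then "zz" else [c])

-- mungeLibName "bar" "foo" == "z-bar-z-foo"
```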
How is Hackage going to display these?
This proposal will require code from Hackage/Stackage to understand how to display multiple libraries from a single package. But this can come later; since many package maintainers want to maintain compatibility with old versions of Cabal, they will not want to migrate to this new syntax immediately. Once the window of supported Cabal versions is large enough and people start using it, Hackage and Stackage can gain the knowledge needed to support this.
How is permissions management for uploading going to work there?
Permissions management is, as before, per-package. No change here, no change necessary.
It's not some "convention," we've now completely broken invariants. Where is snap-server, or yesod-core, or pipes-bytestring, going to be located? Probably a dozen tools have hard-coded into them that they'll be located in snap-server.cabal, yesod-core.cabal, etc, located in the 00-index.tar file at a specific location. That invariant's gone.
The new proposal does not use the backwards-compatible hyphens syntax, so this is not an issue anymore.
cabal-install/Stack and dependency solving
Dependency solving is actually unchanged, because you can always interpret build-depends: amazonka:app <= 2.0 as build-depends: amazonka <= 2.0 and go from there.
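That is, for solving purposes the component qualifier can simply be erased (bounds here are placeholders):

```cabal
-- what the user writes:
build-depends: amazonka:app <= 2.0
-- what the solver may conservatively treat it as:
build-depends: amazonka <= 2.0
```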
Yes, this does mean that if you have a per-package build system, naively, you'll end up building every library inside amazonka. But you had this problem for executables as well, and it was solved by figuring out which executables you actually wanted to build and passing them as arguments to the build command; this same codepath can be reused for libraries. Or if you have per-component builds, you can just prune out libraries that you don't actually need: with per-component build in new-build, we don't really have to do any more work.
In any case, whatever we do, I'm not suggesting we start having people use it, I just want to make sure Setup.hs understands this so that, if later the ecosystem catches up, we can more easily flip the switch... or we can just as easily remove it again and call the experiment dead.
As the author of amazonka and gogol mentioned here: after reading through the proposal I'm not convinced I have a horse in this race, despite both projects adding 2+ new libraries every month or so.
I currently don't have an issue developing or building multiple packages thanks to cabal new-build or stack, and releasing is also trivially handled by tooling such as ye olde recursive make.
@ezyang made the following statement about package usage:
A package is the mechanism for distribution, something that is ascribed a version, author, etc. amazonka is a tightly coupled series of libraries with a common author, and so it makes sense that they want to be distributed together.
While the above statement is true most of the time, regarding version ascription there are often small updates, bug-fixes, or release corrections that get pushed out as a minor version bump of the form A.B.C.* for a subset of libraries, the top-level library, or the underlying single shared library. So I do require the possibility to release libraries under differing versions occasionally.
Since there's quite a lot of information floating around here what I'd like distilled is exactly how a user of my packages benefits, and how this would affect presentation of the library in Haddock, on Hackage, etc.
@brendanhay Definitely agreed about the need for individual version numbers.
While the above statement is true most of the time, regarding version ascription there are often small updates, bug-fixes, or release corrections that get pushed out as a minor version bump of the form A.B.C.* for a subset of libraries, the top-level library, or the underlying single shared library. So I do require the possibility to release libraries under differing versions occasionally.
I think the expected mode of use for a multi-library package is that if you need to release a small bugfix for one library, you just do a new release of the entire package (with all the libraries). The biggest downside is that when a user takes the update, a change to an edge library could force the user to rebuild a core library; but you'd do the same amount of rebuilding if you had updated the core package itself, and these patch-level releases aren't supposed to be bounded against via the PVP, so dependency bounds aren't losing expressivity.
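Concretely, with made-up version numbers: PVP bounds stop at the A.B.C level, so a lockstep patch release of the whole package stays in range for every consumer:

```cabal
-- a user's PVP-conforming bound on one of the package's libraries:
build-depends: amazonka:app >= 1.6 && < 1.7
-- releasing 1.6.0.1 (all libraries re-released together, only one of
-- them actually patched) still satisfies this bound
```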
how a user of my packages benefits, and how this would affect presentation of the library in Haddock, on Hackage, etc.
There's no patch here yet, but my thought is that https://hackage.haskell.org/package/amazonka will eventually have a table of contents for each supported library, containing the module listings for each library under a heading. Haddock shouldn't be much different than it is today.
I think multi-library packages are primarily an improvement for maintainers. Probably the biggest improvement from the user end is that all of the available libraries are in one place (as opposed to having to go to the category page), and that when doing dependency provenance auditing, amazonka can be treated as a single dep, rather than 90 packages that need auditing: it's easier to tell that one package comes from a single source, than a hundred packages that come from a single source.
It occurred to me today that #2832 is more or less essential for the wider adoption of this feature.
... and per component solving, TBH.
2. We add a new field to library stanzas, public, which indicates whether or not the library is available to be depended upon. By default, sub-libraries are NOT public.
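A sketch of what such stanzas could look like (hypothetical package; note that later in this thread the field is settled as visibility: public | private rather than a bare public flag):

```cabal
name: foo
version: 1.0

-- sub-library that downstream packages may depend on
library extras
  visibility: public
  exposed-modules: Foo.Extras

-- sub-library that stays private to this package (the default)
library internals
  exposed-modules: Foo.Internals
```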
Is public / private enough? Or is a test scope needed as well? See e.g. https://github.com/haskell/cabal/issues/4297
As a counterpoint to @snoyberg's views, I'd like to say that I'm heavily in favor of this change. Why?
I've been using backpack heavily in the design of some of my recent projects.
However, this currently necessitates, at least in one case, splitting one library up into two dozen or more separate libraries:
All of which have their own versions and tedious metadata.
All of which will spam the package list separately on hackage.
All of which will have to be managed separately on hackage.
All of which will have to be deprecated individually when I make small updates that turn around and churn the overall hierarchy of packages. This means that every intermediate state will be visible.
In cabal 2.2 I'm getting common blocks, but I won't be able to make any meaningful use of them if I want third parties to be able to instantiate some of those intermediate libraries, and I do.
With the ability to instantiate multiple externally visible libraries from a given package I can give you a single unpacked-containers package such that you can instantiate unpacked-containers:set monomorphically to your particular element type, rather than having our already ridiculously fine-grained package management (by the standards of most languages) collapse almost down to the single-module level, making it almost impossible to find code on hackage.
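A hedged sketch of what consuming such a package could look like downstream, assuming the mixins syntax extends to sub-libraries (all module, signature, and stanza names here are made up for illustration; the real unpacked-containers interface may differ):

```cabal
-- one internal library provides the concrete key type...
library key-int
  exposed-modules: Key.Int
  build-depends: base

-- ...and the main library mixes it into the signature-parametrised set
library
  build-depends: base, key-int, unpacked-containers
  mixins:
    unpacked-containers:set (Set as Set.Int)
      requires (Key as Key.Int)
```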
Even without backpack I'm often pressed to split something like lens up into several smaller packages. I'd be much more inclined to split it up into several smaller externally facing libraries in one package, with a final wrapper, as the separate components could be compiled with different, more minimal dependencies, without breaking up the workflow; and I can see a roadmap, using common block tricks, etc., to maintain the current API without undue pain once cabal ~2.2 falls far enough away into the rearview mirror that it isn't a support headache. More importantly, if the number of these fluctuates over time, I don't wind up with the huge package management headache that is the current norm, with old packages cluttering the hackage namespace forever. (In the split-up story, there is also a great deal of pain around where to place documentation, tests, package executable artifacts, dealing with a dozen copies of doctest suites cut-and-pasted across everything, etc.)
I already personally have a dozen or so old project names "tombstoning" up the hackage namespace from earlier package consolidations and refactorings. The fine-grained model backpack drives us to would push this ratio up a great deal over time.
Reading through this discussion, I'm not convinced. All in all, I don't see why this should be exposed on hackage or outside of the tooling that a developer has locally.
@ekmett I'm not convinced by some of your arguments.
All of which will spam the package list separately on hackage. All of which will have to be managed separately on hackage.
I don't know what specifically this is about, but most systems use tags and/or a query language for doing operations on sets of objects. Cabal has categories, but there could be categories not shown by hackage, or a separate tags field for this. Combining author, categories, and tags should be enough to create most sets of objects for automation purposes. Iterating through this kind of stuff should be done using client-side tooling.
With the ability to instantiate multiple externally visible libraries from a given package I can give you a single unpacked-containers package such that you can instantiate unpacked-containers:set monomorphically to your particular element type, rather than our already rather ridiculously fine-grained package management by the standards of most languages collapsing almost down to the single module level, making it almost impossible to find code on hackage.
The first part of this argument might or might not make sense - I don't know backpack well enough.
The last part of this argument is really about how hackage has very low discoverability. Discoverability should not primarily be in the hands of package developers, and it shouldn't even be based on manual markup. Cabal is not for annotating graphs, but nodes. This argument looks like an argument for annotating some mini-graph in your package specification, but what about the larger graph in hackage? How can people find the alternative lens libraries not authored by you? Should discovering an alternative lens use a mechanism different from the one that groups related lens libraries you author? If so, why?
The larger graph annotation issue should be fixed before optimizing a local issue which might be subsumed by a solution to the larger problem.
The annotation layer should be on top of the packages. My point of view is that hackage should be calling out to a host of external services for this - a microservices based approach. Alternatively a standard graph representation, such as DOT-ish files could be loaded into hackage as layers.
It is possible that some graph annotation should be collapsed into the cabal file, but at a minimum that means being able to talk about external packages in this graph language.
@ezyang gives this argument:
I think multi-library packages are primarily an improvement for maintainers. Probably the biggest improvement from the user end is that all of the available libraries are in one place (as opposed to having to go to the category page), and that when doing dependency provenance auditing, amazonka can be treated as a single dep, rather than 90 packages that need auditing: it's easier to tell that one package comes from a single source, than a hundred packages that come from a single source.
I agree with this, but I think that just proves that the problem is elsewhere, namely that hackage has no annotation layer. Where is the [see other packages] link next to Author, Homepage, repo, etc.? Those are really low-hanging fruit. Shoehorning this into the cabal file and solving 10% of the problem only hides the real problem IMO. I don't think anyone is confused about the nature of amazonka, but even if they were, it's hackage's job to display the packages as a group (which is easy given that they have the same home page, repository, and author!). No need to change the data model because of a rendering issue in hackage.
I don't see the provenance auditing argument. You can trust the author, the homepage, or the repository, all of which represent different views on the source. What does this grouping mechanism bring to the table? Maybe I'm misunderstanding what provenance auditing entails?
@alexanderkjeldaas: None of your points there address the central "churn" issue I raised above. Adding and deleting library components is a thing I can and frankly must do version by version because of how much backpack I'm using. Adding and deleting packages pollutes hackage forever.
To make this concrete
https://github.com/ekmett/coda/tree/66480f3e3ee4e6cd19bcb3a44b9c8cd698314901/lib
provides a pile of packages, almost 80% of which no user will or should ever see, and 20% of which they should.
https://github.com/ekmett/coda/blob/62d0ca91778a7c0fa7a6cdce3a0ecc6f3ba30bfd/coda.cabal
on the other hand manages to package all of these things in the original pre-backpack API by using multiple libraries internal to the package. This drops a couple hundred lines of configuration duplication. Once we get common blocks I can consolidate even further. I give it one version, I'd maintain it as one package. If I add or remove components, which I'm doing daily, and will likely have to do long into the future, they won't separately churn the hackage package namespace after the package is released.
Unfortunately, it is useless, as I want users to be able to instantiate packages like coda:dyck, coda:group, or coda:set. All of the rest of these packages exist to support these 3-4 "publicly instantiable" backpack modules and the final coda executable.
I don't want 20 packages. I want ~4 visible libraries, but to get them I have to spam hackage with 20 packages.
The current situation leaves me hoist upon the horns of a dilemma:
Do I do the thing that makes me a better hackage citizen and ship one package with 20 libraries in it that nobody else can use except for the target executable at the end? Without multiple external facing libraries this is the most maintainable option on my end, but it ensures nobody else can build upon my work repurposing the internals of my language server support for another language. And, ultimately, if nobody can build upon my work why upload it?
Or do I break the package up into 20 separate packages all with their own test suites, etc. and go all in on automation, and manually deprecate fragments of the resulting package as those sub-packages get created and destroyed, which the incredibly fine-grained nature of using backpack more or less ensures will happen?
At the moment I am opting out of the problem by not uploading anything, which serves nobody.
@ekmett's arguments seem convincing to me, now we just need to find someone willing to take this on. Perhaps a potential HSoC project?
@alexanderkjeldaas Are these theoretical concerns or do you have "war stories" to share? If not, @ekmett "wins" just by virtue of being the maintainer of more Hackage packages than any other single Haskeller alive. (I think that's about right.)
On a much smaller scale I've felt a similar pain to ekmett's just maintaining a "generic" library across "in-memory-database", "postgres", etc.
It took me a minute to try to extract the key argument from @ekmett's comment, so I'm going to try to restate it as I understand it:
Backpack gives us modules but it does so by organizing things at the library level. Modules can be useful for parametrizing packages by other packages. But they can also be useful as a tool to organize abstractions in code, as an alternative to typeclasses with different tradeoffs and benefits. However, to take advantage of this second use-case, we need multiple libraries, since modules live at the library level.
So posit a package that uses modules to organize code. Now, it can use internal libraries that are not visible to anyone else, and the final package is useful. However, those internal libraries then are not importable or usable by others.
So fine -- you can now choose to expose those internal libraries so others can use them, by putting them each in their own package. But now, you've got a ton of packages, each taking up top level namespace. So the ease of refactoring you had when you only had one package with multiple internal libraries drops significantly. Adding or removing a typeclass between versions of a package shouldn't be a big deal. Adding or removing an internal library won't be a big deal. But adding or removing a top level package is a big deal.
So a rough analogy (only rough, because I think it captures the wrong parts of what are shared and parameterized over, etc.) would be a world where we could define whatever typeclasses we wanted, but everytime we wanted to share a typeclass with others, we needed to put it in a separate package.
So the argument as I see it is that packages should be able to provide multiple modules. And because modules are handled at the library level, then packages need to provide multiple libraries.
@gbaz is it a new package for each typeclass or a new package for each instance?
@ivan-m One package for the "class", one for each "instance", along with another to actually use the instance (as you can't define the modules it depends on in the same package that mixes in the backpack package), and a test suite to check that it matches the signature. This yields a footprint like 2n+1 libraries + n test suites for n "instances".
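A hedged sketch of that 2n+1 footprint for n = 1, with entirely hypothetical package names:

```cabal
-- package "shape" (the "class"): just a signature
library
  signatures: Shape

-- package "shape-circle" (one per "instance"): the implementing module,
-- which cannot live in the package that performs the mix-in
library
  exposed-modules: Shape.Circle

-- package "shape-circle-mixed" (one more per instance): mixes the two
library
  build-depends: shape, shape-circle
  mixins: shape requires (Shape as Shape.Circle)
```

For n instances that is 1 + n + n = 2n + 1 packages, plus a test suite per instance to check each implementation against the signature.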
@BardurArantsson I'm fine with @ekmett winning by default, but the rationale for the decision will be this issue and if the arguments aren't easy to understand, that's too bad.
@ekmett thanks, and with the help of @gbaz I now understand the churn issue better.
I think @snoyberg has a point that this is quite a change for benefits that probably shouldn't require incompatible changes.
So, repeating the technical requirements in my words, it would be something like this:
IMO, if sub-libraries are allowed to have different versions, then the given proposal doesn't really handle the tombstone issue, as I cannot know whether foo:bar exists or not by simply looking at the latest uploaded foo object in hackage. So with independent versioning, it seems like the proposal is "porcelain" that should be directly translatable to existing concepts, like an x-visibility: [private/public] field to handle visibility in hackage, i.e. all packages are still available, but some are hidden in the UI. @ekmett does this cover the churn issue?
@alexanderkjeldaas I think @ekmett wants to have same version number for all public sub-libraries.
I'm not seeking to "win" by fiat, but rather by strength of argumentation. ;)
@alexanderkjeldaas Your proposal leaves me uploading 20 separate packages, implementing doctest independently 20 times, unable to use common blocks to reduce boilerplate between them, and still doing a ton of manual maintenance.
Tombstoning is implicitly handled by hackage not showing private packages.
Except now it's even worse, as the names are still being taken, but aren't shown.
My personal estimation is that continuing the current policy will result in backpack being considered an all-but-unusable curiosity, when it can directly address a bunch of practical performance and cut-and-paste coding issues in Haskell if we just make some changes to the ecosystem to better accommodate it.
IMO, if sub-libraries are allowed to have different versions [...]
Cabal sub-libraries are inside the same package; there is only one version line at the top of a package description. foo:bar and foo:baz would have the same version number, just like how multiple executables and test-suites in a package share a version with the libraries they ship with today.
We already have a similar dependency story for build tools, where you can depend on foo:exe:whatever. This simply extends that policy to build-depends.
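Side by side (versions and names are placeholders):

```cabal
-- today: depending on an executable component of another package
build-tool-depends: happy:happy >= 1.19
-- the proposal: depending on a library component the same way
build-depends: foo:bar >= 1.0
```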
@ekmett has pretty well summarized most of my thinking on the issue. I don't think it's practical to expect Backpack users to split and upload 20 packages, and I don't want this to happen. I am less bearish, in the sense that I think there are useful use-cases for Backpack even without multiple public libraries, but these are all "big scale" Backpack, and not the fine grained use that are easy for early adopters to play around with. Maybe this fine-grained use truly is the most useful thing; in that case I'd really like to see this proposal take wings :)
I'll add one more bit of information; by far the part of implementing this which is most opaque to me is how to adjust Hackage and Stackage to accommodate this new model, since it's entirely new UI that has to be designed and iterated on. There are some structural problems with making this happen, but I am not on a tenure clock and I can take the time necessary to make it happen.
As a nascent user of Backpack, I've already had to split one of my projects into ~6 different internal libraries and common stanzas. Admittedly, my project is very small, less than 100 LOC... but I think that sort of proves just what kind of granularity backpack will force on us (for the better, ultimately).
Any non-trivial project will almost certainly see its library count explode by a few factors. This is unavoidable, as far as I can tell. So I am strongly in favor of this, even if migration time is slow. As Ed K mentioned, this is effectively much like extending build-tool-depends to build-depends, which I think is a solid and exciting idea.
And while the proposal explicitly states it stands on its own -- at this rate, no matter what way you slice it, I think this feature is going to be crucial for larger Backpack adoption, which is my primary motivation, anyway. Discussions about hackage UX or whatnot aren't as relevant to me at all, admittedly, because I don't use hackage for "discoverability", and even if I did -- even if the discoverability can be improved in 2 dozen ways -- it doesn't take away from the core complaint "I now have to upload and maintain ~15x as many .cabal files" when I use Backpack. I think discoverability should be better, I guess, and this could impact it, but it doesn't solve the same problems. You can ultimately pile a lot of stuff into Hackage to make this appear nice, but it'll be tough to convince me that's a better solution than handling it natively in the build tool.
As it stands? I honestly can't see myself taking the time to break up my experiments, all my signatures and whatnot, into mini-.cabal files, each exactly in sync, and play the upload-dance 6 times in a row every time I make minor changes.
I support this change. Making it is crucial for using Backpack to its full potential (as a partial replacement for type classes). My experience mirrors that of @thoughtpolice — I have a small library (about 300 LOC) and I had to split it into 4 packages.
I rest my (likely faulty) arguments.
@ezyang Hey I was planning to write a proposal to submit to GSoC for this topic. Is this the best place to discuss my proposal's ideas?
@sasma Sure, go ahead.
What's the status on this? Was a GSoC or similar ever started? It would be a shame to let this languish, since I think it's pretty important.
@fgaz is currently working on this as a GSoC project.
I would also like to see this proposal implemented. Although not as drastic as the amazonka package, I had to split the package raaz into multiple directories and tie it together with a cabal.project file. Some of the annoyances of this are:
Unable to use a default section in the cabal file
I had to create a public package raaz-core-indef just to expose the public signatures. I could not add it to raaz-core, as instantiating them would lead to recursion, and backpack does not support it (backpack, if I understand correctly, checks for recursion at the package level).
These are real annoyances when using backpack in anger.
acid-state/acid-state#99 is another use case for this (without Backpack, at least for now).
@fgaz I need one clarification on this. When a single package exposes multiple public libraries, it should be possible to control which ones are exposed, right?
@piyush-kurur There will be a visibility: public | private field in the library stanza.
Btw I settled on the amazonka:{appstream,elb} >= 2.0 syntax. The main unnamed library can be added by using the same name as the package (pkg:{pkg,sublib1,sublib2}).
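For example (bounds are placeholders):

```cabal
build-depends:
  -- two named sub-libraries of amazonka:
  amazonka:{appstream,elb} >= 2.0,
  -- the main library, referred to by the package's own name:
  amazonka:{amazonka} >= 2.0
```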
So

name: pkg

library
  ...

library pkg
  ...

is invalid. Is it invalid already?
@phadej Sorry, what do you mean? The stanza naming remains the same (ie empty library name for the main library)
Oh, now I understand... I guess it is invalid already because of how internal deps shadow packages, but I'll have to check. The other option was to use a reserved word (lib) to represent the main library.
@phadej fwiw, naming the sub-library the same way as the package name has been invalid so far (i.e. cabal rejected it in the past already); and this is also documented in the user's guide accordingly
Can we extend build-tool-depends to be consistent?
@Ericson2314 yes, definitely; I already talked to @fgaz about this
So I'd expect build-tool-depends (& a future run-tool-depends) and build-depends to support the very same syntax, with the only difference that build-tool-depends refers to components in the exe:* namespace, whereas build-depends refers to components in the lib:* namespace.
Moreover, I expect that foo:{a,b} == 1.2.* is equivalent to foo:a == 1.2.*, foo:b == 1.2.* (and I've been told that multiple occurrences of the same package name are merged into a single one as you'd expect via conjunction, consistent with how the rest works; so e.g. foo:a >= 1.2, foo:b < 1.3 would be equivalent to foo:{a,b} == 1.2.*).

Consequently, the {}s are considered optional when specifying a single sub-lib, i.e. foo:{a} vs. foo:a.
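The conjunction rule above can be sketched with a toy version-range type (an illustration of the semantics only, not Cabal's actual VersionRange machinery):

```haskell
-- A version is a list of numeric components; a range is an optional
-- inclusive lower bound and an optional exclusive upper bound.
type Version = [Int]

data Range = Range
  { lowerIncl :: Maybe Version
  , upperExcl :: Maybe Version
  } deriving (Eq, Show)

-- Merging two constraints on (sub-libraries of) the same package is
-- conjunction: keep the tighter bound on each side.
conj :: Range -> Range -> Range
conj (Range l1 u1) (Range l2 u2) =
  Range (tighter max l1 l2) (tighter min u1 u2)
  where
    tighter _ Nothing  b        = b
    tighter _ a        Nothing  = a
    tighter f (Just a) (Just b) = Just (f a b)

-- foo:a >= 1.2 together with foo:b < 1.3 behaves like
-- foo:{a,b} >= 1.2 && < 1.3, i.e. foo:{a,b} == 1.2.*
merged :: Range
merged = conj (Range (Just [1,2]) Nothing) (Range Nothing (Just [1,3]))
```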
Ok. Perfect!
@hvr What is the proposal to handle the combination foo:a == 1.2.*, foo:b == 1.3.*? Since these should be coming from the same package, I guess this would lead to no install plan.
@piyush-kurur yes, I'd expect that to be a contradiction and hence unsatisfiable; cabal should be able to point out such cases as suspicious warnings. I don't think we need the added complexity of supporting mixing sub-libs from different releases of the same package they belong to, do we?
@hvr I too would say that cabal should not install the two. Things can get tricky if the version bounds have a non-trivial but inexact intersection. This can happen as packages evolve (I started by using only foo:a and then realised I wanted foo:b). But cabal-install should give a solid warning here.
@piyush-kurur @hvr
Currently foo:a == 1.2.*, foo:b == 1.3.* is unsatisfiable (as it should be, imo).
@Ericson2314 @hvr Executables are different, though, in that I can use two exes from different build plans without them conflicting (deps are not exposed and an exe's interface is weakly typed). So maybe we should only support one single tool dependency per line (so no {}), or at least disable build-tool-depends union.
I can use two exes from different build plans
...but should you? :-)
Is there a legitimate use-case which requires mixing executables provided by the same package from different releases? I'm suggesting we opt for the simpler design unless we can demonstrate a legitimate use-case which justifies incurring the added complexity, both technical (in the implementation) and cognitive (since you'd have an additional degree of freedom to keep in mind when specifying your package dependency spec).
Also note that if we go for the simpler scheme now, and in 1-2 years discover a relevant use-case, we can still change the semantics, as we have the cabal-version:x.y declaration which allows us to change the syntactic and semantic interpretation of package descriptions in a principled way without breaking compatibility.
Also, for example, when Setup.hs is finally made a legitimate component, it's imperative that the needed component and its custom setup come from the same package.
@fgaz and @ezyang raaz now has a complete backpackised design, so I am betting my money here. Is there something that I need to know to get the documentation to display properly? I have a candidate package for raaz http://hackage.haskell.org/package/raaz-0.3.0/candidate for which the documentation is almost non-existent, as almost all the haddock annotations have now moved to the signatures.
@piyush-kurur cabal new-haddock does generate correct docs for sublibraries, but we don't have an appropriate index page linking to them yet. Haddock and Hackage will have to be modified accordingly, with a subsection or table row for each sublibrary
@fgaz, it probably is slightly more complicated than that. Consider the case when you want to expose an indef package as a local component, as well as its instantiated version, from the top level. A concrete case for this is the Hash module from raaz:hash-indef and its mixed-in version Raaz.Hash that is exposed from the top-level raaz library (see issue raaz-crypto/raaz#379). For a user who does not care about the indef packages, the haddock documentation should be visible from Raaz.Hash, whereas as a developer I would want the api docs in the Hash module in the indef packages, because of two reasons
So I think there is some issue here that needs discussion. If this is not the right ticket for it I can open another one.
Motivation. A common pattern with large-scale Haskell projects is to have a large number of tightly coupled packages that are released in lockstep. One notable example is amazonka; as pointed out in https://github.com/haskell/cabal/issues/4155#issuecomment-270126748, every release involves the lockstep release of 89 packages. Here, the tension between the two uses of packages is clearly on display:
A package is a unit of code that can be built independently. amazonka is split into lots of small packages instead of one monolithic package so that end-users can pick and choose what code they actually depend on, rather than pulling in one gigantic mega-library as a dependency.
A package is the mechanism for distribution, something that is ascribed a version, author, etc. amazonka is a tightly coupled series of libraries with a common author, and so it makes sense that they want to be distributed together.
The concerns of (1) have overridden the concerns of (2): amazonka is split into small packages, which is nice for end-users, but means that the maintainer needs to upload 89 packages whenever they cut a new release.
The way to solve this problem is to split apart (1) and (2) into different units. The package should remain the mechanism for distribution, but a package itself should contain multiple libraries, which are independent units of code that can be built separately.
In the Cabal 1.25 cycle, we've added two features which have steadily moved in the direction of making multiple public libraries possible:
Convenience libraries (#269) mean that we already have Cabal-file-level syntax support for defining multiple libraries. Note that these libraries are only accessible inside their package, so to a certain extent, the only thing that would need to change is making it possible to refer to these libraries from outside.
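For reference, this is roughly what a convenience (internal) library looks like today; the package name `foo` and internal library name `utils` are made up for illustration:

```cabal
name:           foo
version:        1.0
cabal-version:  >=1.25
build-type:     Simple

-- An internal ("convenience") library: usable from other stanzas in
-- this package, but invisible to other packages.
library utils
  exposed-modules: Foo.Utils
  build-depends:   base
  hs-source-dirs:  src-utils

-- The public library refers to the internal one by its bare name.
library
  exposed-modules: Foo
  build-depends:   base, utils
  hs-source-dirs:  src
```

The only missing piece, under this proposal, is a way for *other* packages to name `utils`.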
Per-component build (#3064) makes it easy to build each internal library of a package separately, without having to reconfigure a package and then rebuild. This means that, from the perspective of new-build, building multiple libraries from a package is "just as separate" as building multiple packages, and I imagine Stack would be interested in taking advantage of this Setup.hs feature.
So the time is ripe for multiple public libraries.
Proposal. First off, I want to say that for the 2.0 release cycle, I do not think we should add support for the feature below. However, what I do want to do is make sure that the design for convenience libraries (which is new) is forwards compatible with this (see also #4155).
We propose the following syntactic extensions to a Cabal file:
- `build-depends` shall accept the form `pkgname:libname` wherever `pkgname` was previously accepted. Thus, the following syntax is now supported: `amazonka` refers to the "public" library of amazonka (the contents of the `library` stanza with no name), while `amazonka:appstream` refers to `library appstream` inside `amazonka`. A version range associated with a sub-library dependency is a version constraint on the package containing that dependency; e.g., `amazonka:appstream >= 2.0` will force us to pick a version of the `amazonka` package that is greater than or equal to 2.0.
- We add a new field to `library` stanzas, `public`, which indicates whether or not the library is available to be depended upon. By default, sub-libraries are NOT public.

Next, we need the following modifications to the `Setup.hs` interface:

- The `--dependency` flag previously took `pkgname=componentid`; we now augment this to accept strings of the form `pkgname:libname=componentid`, specifying what component should be used for the `libname` of `pkgname`.

Explanation. The primary problem is determining a syntax and semantics for dependencies on sub-libraries of a package. This is actually a bit of a tricky problem, because the `build-depends` field historically serves two purposes: (1) it specifies what libraries are brought into scope, and (2) it specifies version constraints on the packages that we want to bring in. The obvious syntax (using a colon separator between package name and library name) is something like `build-depends: amazonka:appstream, amazonka:elb`. But we now have to consider: what is the semantics of a version-range applied to one of these internal libraries, e.g. `build-depends: amazonka:appstream >= 2.0, amazonka:elb >= 3.0`?

Does having separate version ranges even make sense? Because the point of putting all libraries in the same package is to ensure that they are tightly coupled, it doesn't make sense to consider a version range on a library, only on a package. So the `build-depends` above should be considered as levying the combined constraint `>= 2.0 && >= 3.0` on the `amazonka` package as a whole.

This causes a "syntax" problem where, if you want to depend only on sub-libraries of a package, there is no obvious place to put the version bound on the entire package itself. One way to solve this problem is to add support for the syntax `amazonka:{appstream,elb} >= 2.0`; now there is an obvious place to put the version range.

Downsides. Prior to solving #3732, there will be some loss of expressivity if a number of packages are combined into a single package: cabal-install's dependency solver will solve for the dependencies of ALL the libraries (even if you're only actually interested in using some of them). This is because the solver always solves for all the components of a package, whereas it won't solve for the dependencies of a package that you don't depend on.
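To make the proposal concrete, here is a hypothetical pair of package descriptions under the proposed syntax. The `public` field and the module names are illustrative only; none of this is accepted by any released Cabal:

```cabal
-- amazonka.cabal: one package distributing several public libraries.
library
  exposed-modules: Network.AWS

library appstream
  public:          True   -- proposed field; sub-libraries are private by default
  exposed-modules: Network.AWS.AppStream

library elb
  public:          True
  exposed-modules: Network.AWS.ELB
```

```cabal
-- A downstream package depends on the sub-libraries; the version
-- range constrains the amazonka package as a whole.
library
  build-depends: base, amazonka:{appstream,elb} >= 2.0
```

One upload, one version number, but consumers still pick only the libraries they need.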
Prior art. This is a redux of https://github.com/haskell/cabal/issues/2716. Here is what has changed since then:
The motivation has been substantially improved; last time the motivation involved some Backpack/internal code readability hand-waving; now we specifically identify some existing, lockstep packages which would benefit from this. The proposal has nothing to do with Backpack and stands alone.
The previous proposal suggested use of dashes for namespace separation; this proposal uses colons, and the fact that the package must be explicitly specified in `build-depends` means that it is easy to translate a new-style `build-depends` into an old-style list of `Dependency`, which means tooling keeps working.
The previous proposal attempted to be backwards compatible. This proposal is not: you'll need a sufficiently recent version of the Cabal library to work with it.
CC @mgsloan, @snoyberg, @hvr, @Ericson2314, @23Skidoo, @dcoutts, @edsko