Closed joneshf closed 4 years ago
@joneshf I think it depends on the definition of monorepo that we have: in my experience I have only encountered monorepos as "final consumers" of other packages. You might have the case in which some of the monorepo is also useful as a separate package, so to publish that you have two choices:
It sounds like you're looking for (2), which at first sight makes sense to me. Thinking more about it though, it looks like the rest of the ecosystem is not compatible with this setup, e.g.:
- bower doesn't have support for packages not "on the root" of a repo
- pulp will want to create tags in your git repo, so this means that you'd need to nest git repos, at which point you're using submodules
I would be wary of supporting another way of fetching packages without coordination with other tooling (because of the risk of introducing splits, etc.), so I guess the most usable way to do this would be to use submodules, at the price of not being able to share the top-level packages.dhall (i.e. the package becomes effectively standalone)
Yep, that's exactly what I'm thinking. I appreciate your position and I have a few thoughts:
I'm willing to put all my eggs in the spago basket.
I think this is a common sentiment among people in the community. The UX behind spago is one of the better experiences in package management–even beyond PS package management. Getting started is easy, figuring out the next steps is easy, asking for help is easy, asking for new features is easy. This is one of the few projects in the PS ecosystem that I would actually point beginners to and not feel bad about it.
I think many people would be happy if spago became the de facto package management solution for PS. Which is to say, I don't think people are still using bower and pulp by choice. I think we're still using them because of inertia. The problems bower and pulp solve could be solved by spago in ways that spago wants to solve them and I think everyone would be happier for it.
bower supports pluggable resolvers.
There could be a monorepo resolver that downloaded a git repo, and picked out the code from a specific directory.
What's the motivation behind the concern for pulp?
If it's about being able to publish documentation on pursuit, you don't have to use pulp at all. I.e. the extra requirements pulp imposes are just that, extra requirements.
If it's about following SemVer, v4.0.0-prelude is a valid SemVer tag.
I appreciate that you're thinking about these things, and want to make sure things work out okay. But, I'd like to live in the spago-only world. As a user of spago, I'm not really interested in being able to consume packages in a monorepo from bower or pulp. If I need that functionality, then I'll do like you suggested and split out another repo to allow that.
I'd like to be able to build out a monorepo using spago and then allow others to consume it using spago. The way I structure the source code of a PS project should be transparent to those consuming that PS project. If it means I have to go do work to make psc-package/purs/pursuit/etc. not explicitly tied to bower in order for spago to want to support this workflow, I'll do that.
On the subject of consuming monorepos, it is a thing that happens in many other ecosystems: babel, go, nixpkgs, rails, symfony, wai. Some of these ship end products (like gofmt), but they all allow you to consume the components of the monorepo. How that happens depends on the ecosystem. Most of them work by distributing an artifact that's separate from the file structure of the source code. E.g. wai's monorepo works because you can distribute a tarball that encapsulates only the important parts of warp.
In the PS ecosystem, we've so far coupled what a distributed package looks like with what the source code of that package looks like. It's a sensible model, and it's gotten us far, but it's an arbitrary coupling. If spago allows referencing a tarball, zip file, or any other archive on the internet (rather than a git repo), I could work with that as well. Bundling up a bunch of files and deploying them somewhere is worlds easier than maintaining a ton of git repos. Is that a viable approach?
@joneshf thanks for the kind words! 😊
If spago allows referencing a tarball, zip file, or any other archive on the internet (rather than a git repo), I could work with that as well. Bundling up a bunch of files and deploying them somewhere is worlds easier than maintaining a ton of git repos. Is that a viable approach?
In principle this would not be too hard (the "arbitrary coupling" is there for ease of implementation), some observations:
git accepts as remote" in the repo : Text field of a Package, but if we'd have to support anything different than git, then we'd need to think of a way to distinguish all the different formats (a Dhall union is about perfect, but will require changing the type of the upstream package set), support their transport/unpacking, etc
bower supports pluggable resolvers
Oh, TIL. If someone puts together such a resolver then I'd be fine investigating this further
What's the motivation behind the concern for pulp?
It's more like a general feeling about our package distribution being extremely tied to package == git repo. I'm concerned about pulp because it's the "reference implementation" on "how to officially publish stuff in PureScript", so anything that deviates from that has me concerned about breaking the workflow and making a Bower → spago migration harder.
Though with the tag you proposed it should be fine. So the only thing left is to find a way to communicate to spago the "source paths" inside the package right? Do you have any proposals about that?
- a Dhall union is about perfect, but will require changing the type of the upstream package set
Sorry, I'm not too familiar with the interplay/architecture. Why is that?
- you'd have to give up on adding your package to the upstream package set, since the current requirement is that "the package is available on $officialPackagesRegistry" (Bower at the moment) and follows
I'm totally fine with that. None of my packages are in the package set anymore anyway. I'm fine living in a spago-only world.
- ...unless we get a registry of our own for PureScript in which we can do all the things, as discussed in this thread on Discourse, on which I'd love your input
I'm not too interested in being part of that discussion. A package registry is basically something you throw money at: pay for servers/storage, pay someone to build an API around it with authentication/authorization (or do it yourself), and step away. I recognize that it's not that cut and dry, but it's also not orders of magnitude more complex.
I've tried many times in the past to dump money into an official PS account/foundation/whatever in hopes that we could solve problems like this. Each time the response was that money wasn't going to be accepted in any official regard. I think we're wasting useful energy trying to figure out another way around it and I don't think it's fruitful for me to say much more. If someone comes up with a solution, I'll try to support it. But, I don't want to spend a bunch of time on it.
I'm concerned about pulp because it's the "reference implementation" on "how to officially publish stuff in PureScript", so anything that deviates from that has me concerned about breaking the workflow and making a Bower → spago migration harder.
Would you feel differently if the official way to work with PS was moved from pulp to spago? I'm down to be a hype man.
So the only thing left is to find a way to communicate to spago the "source paths" inside the package right? Do you have any proposals about that?
Can we add a source key to packages–akin to #173?
- a Dhall union is about perfect, but will require changing the type of the upstream package set
Sorry, I'm not too familiar with the interplay/architecture. Why is that?
Quoting from here, this is the handwavy Dhall type of the spago configuration:
-- The basic building block is a Package:
let Package =
  { dependencies : List Text  -- the list of dependencies of the Package
  , repo : Text               -- the address of the git repo the Package is at
  , version : Text            -- git tag
  }

-- The type of `packages.dhall` is a Record from a PackageName to a Package
-- We're kind of stretching Dhall syntax here when defining this, but let's
-- say that its type is something like this:
let PackageSet =
  { console : Package
  , effect : Package
  ...                  -- and so on, for all the packages in the package-set
  }

-- The type of the `spago.dhall` configuration is then the following:
let Config =
  { name : Text               -- the name of our project
  , dependencies : List Text  -- the list of dependencies of our app
  , sources : List Text       -- the list of globs for the paths to always include in the build
  , packages : PackageSet     -- this is the type we just defined above
  }
Right now repo is a Text containing a URL that we parse to decide if it's local or not. This will probably be changing soon but leaving that aside for a moment, if we were to decide to support other kinds of packages than "git repos on the root", we could change from Text to something like < GitRepo : Text | MonorepoView : Text | ... >
Now, as you can see above the PackageSet is a record that has values of type Package, so if we change the type here we'd need to change the upstream too. It's not a big deal and can be done in a backwards compatible way (though we'll take advantage of 1.0 to break this kind of stuff)
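For illustration, the union idea could look roughly like the following in Dhall. This is a sketch only: Location, its GitRepo and MonorepoView alternatives, the example package, and its URL and version are hypothetical names chosen here, not a schema that spago or package-sets actually ships.

```dhall
-- Hypothetical sketch: `Location`, `GitRepo` and `MonorepoView` are
-- illustrative names, not part of any released spago schema.
let Location = < GitRepo : Text | MonorepoView : Text >

-- Same shape as the Package type above, with `repo` widened to the union.
let Package = { dependencies : List Text, repo : Location, version : Text }

-- A made-up package that lives at the root of an ordinary git repo.
let examplePackage
    : Package
    = { dependencies = [] : List Text
      , repo = Location.GitRepo "https://example.com/some-package.git"
      , version = "v1.0.0"
      }

in  { example-package = examplePackage }
```

Because every value in the package set has to share one Package type, widening repo this way is what forces the matching change in the upstream package set.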
A package registry is basically something you throw money at: pay for servers/storage, pay someone to build an API around it with authentication/authorization (or do it yourself), and step away
At some point in the thread I propose to use the nixpkgs model: using a GitHub repo (possibly mirrored) for metadata, and something like S3 (also mirrored) for package uploads storage. This means that:
Would you feel differently if the official way to work with PS was moved from pulp to spago?
We'd stop creating new bower users, but we'd still need to worry about the bower → spago migration for the existing ones 🙂
Can we add a source key to packages–akin to #173?
I think we'd still need some kind of manifest in the root of the repo or in some other default location right? (so that spago would be able to locate a file to read the sources key)
Oh. Because spago uses psc-package's types? Gotcha.
I think we'd still need some kind of manifest in the root of the repo or in some other default location right? (so that spago would be able to locate a file to read the sources key)
Sorry, I'm not sure I understand what we're talking about. Lemme explain what I'm suggesting more explicitly and you can tell me where I'm going wrong 🙂.
Let's say I want to move https://github.com/joneshf/purescript-httpure-middleware into https://github.com/joneshf/open-source. There's really only one file in that package. I'd like it to sit at:
.
└── packages
    └── purescript-httpure-middleware
        └── src
            └── HTTPure
                └── Middleware.purs
Now, let's say someone else wants to consume purescript-httpure-middleware. It would be nice if they could add to their packages.dhall:
let additions =
  { httpure-middleware =
      { dependencies =
          [ "purescript-httpure" ]
      , repo =
          "https://github.com/joneshf/open-source.git"
      , sources =
          [ "packages/purescript-httpure-middleware/src/**/*.purs" ]
      , version =
          "d03884217eed3f2d41205ac0a56573b2a1443107"
      }
  }
...
Instead of spago assuming source files would be in src/**/*.purs, it would use what the package defines–in this case packages/purescript-httpure-middleware/src/**/*.purs. From my understanding of how the pieces fit together, it seems like this change would be as unobtrusive as possible while also providing a solution to consuming monorepos.
spago could continue to work with git repos, so the core logic wouldn't change. If you didn't want diverging configurations (one package has sources, another does not), migrations could happen with something like:
λ(package : Package) → { sources = [ "src/**/*.purs" ] } ⫽ package
while also not requiring (but still allowing) a change in the upstream.
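To make that migration concrete, here is a small self-contained sketch of the function applied to one package. The Package type follows the handwavy one quoted earlier; the example package, its URL and version, and the sources field itself are assumptions for illustration, since sources on a package is only the proposal being discussed.

```dhall
-- Sketch only: a per-package `sources` field is the proposal under
-- discussion, not something spago reads today.
let Package = { dependencies : List Text, repo : Text, version : Text }

-- Add the conventional glob to a package that does not declare its own.
let withDefaultSources =
      λ(package : Package) → { sources = [ "src/**/*.purs" ] } ⫽ package

-- Hypothetical package used purely as input to the function.
in  withDefaultSources
      { dependencies = [ "prelude" ]
      , repo = "https://example.com/some-package.git"
      , version = "v1.0.0"
      }
```

The result is the same record with sources set to the conventional glob. Since ⫽ prefers fields from its right-hand side, a variant of this function whose Package type already included sources would leave a package's own value in place.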
I think having written it out, what I'm really asking for is some way to override https://github.com/spacchetti/spago/blob/40551e3765ffd07637e7afabd2a169238626ace0/src/Spago/Packages.hs#L87
This isn't even specific to monorepos anymore. We've always assumed people would put PS source code at the top-level in a src directory. It seems like if spago can relax that constraint, it could not only make packaging more robust (because we're no longer making an assumption about where source code lives), but it would also allow monorepos to be consumed.
Does this re-phrasing change any of the stuff we've discussed so far? In other words, if we only think about improving robustness of packaging–and don't think about the fact that it extends spago to support consuming monorepos–would the change above be acceptable?
Oh. Because spago uses psc-package's types?
Not really, but kind of: because it uses types from package-sets, when importing the packages.dhall from there
Instead of spago assuming source files would be in src/**/*.purs, it would use what the package defines–in this case packages/purescript-httpure-middleware/src/**/*.purs. From my understanding of how the pieces fit together, it seems like this change would be as unobtrusive as possible while also providing a solution to consuming monorepos.
Oh right sorry, now it makes sense, thanks for detailing 💯
This would be a neat solution, but I think it'd still require changing the type of the upstream and the migration you proposed to happen there (again, this is not a huge deal, will just require more care) because there's no nicer place to perform that migration:
sources) but I think it's less work to migrate the upstream (we also have to practice evolving it anyways)
However I'm still unsure about this, as I have the feeling that it would incentivize a "split" in the ecosystem by changing the relationship "one repo == one package" - even if we can make Bower compatible, psc-package is not and AFAIK will not be patched.
So I'd probably feel more comfortable talking about this after Spago matures a bit and gets more usage - I guess "after 1.0" would be a good time to consider this again?
@joneshf a temporary, low-effort solution for now would be to move your packages to the monorepo, but keep the repo there for publishing them. Then you can add a script to the repo that just pulls the monorepo, focuses on the package you need, and copies that to the root of the repo
Well that's unfortunate, but understandable. I'm not upset about it or anything, but I need to find a solution to my hundreds of repos. I'm mostly lazy, don't want to write a bunch of git-based stuff, but still want people to be able to consume the packages I make. I want to live in a spago-only world, but I don't want to live there by myself. I'm either going to trick spago into working somehow, or use something else.
psc-package is not and AFAIK will not be patched.
Wait, what does this mean? Does this mean the decision for spago relies on psc-package supporting source files existing in directories other than src as well?
If so, the change to psc-package would be adding a similar sources :: [Text] to PackageInfo, and using that here, here, and here. That seems feasible to implement, and I'm more than willing to submit the PR if it means spago gains the ability to support files not in src.
The more I think about it, the more I realize that I came in here and asked the wrong question. This was never about monorepos, but always about supporting dependencies where the source files weren't in src. This has been an area of accidental complexity that PureScript had for years that I think has made things harder in every packaging solution we've come up with for no real reason. As mentioned above, it's understandable if you don't want to address this right now (or at all), but this issue is bigger than support for monorepos and I'm going to update the title of this issue to reflect that.
Wait, what does this mean? Does this mean the decision for spago relies on psc-package supporting source files existing in directories other than src as well?
Yes, because I want the ecosystem to reasonably move in sync and avoid breakages/splits/etc (the 0.12 transition was not fun so it's ok if we take extra care in this regard)
So until psc-package is officially supported I'll consider it "part of the ecosystem"
Just to make it clear I'd agree with this change, but I'd like agreement from the wider community too so we can sync up if we need to make changes (because it's about changing one of the basic assumptions of packaging in PureScript)
Ping @justinwoo @hdgarrood
This has been an area of accidental complexity that PureScript had for years that I think has made things harder in every packaging solution we've come up with for no real reason
Can you give some examples (outside of the feature being requested in this issue)? I very much disagree with this, actually; I think the more things you have which are configurable, the more things other tools need to worry about when they are consuming packages, so we should only allow configurability if we are certain that it is absolutely necessary. In this case, I think requiring source files to be in src/ drastically simplifies the task of compiling your project together with its dependencies, because if you were allowed to put source files anywhere you wanted, you'd have to read and parse a package manifest file for every single package you depend on to find out what that location was.
My preferred way forward here would be to stop having packages necessarily tied to git repositories, so that you could have one repo containing a bunch of packages which you can publish individually. That probably entails using a real package registry and distributing packages as tarballs or something.
Thinking a bit more about the underlying problem, there's a parallel decision we made and changed: assuming a package will be in a purescript- repo.
It's another arbitrary decision we made in the past that was also neither free nor intuitive. It made things easier at the time because we had a handful of packages, and we were prefixing all of them with purescript-. So, we continued to write tooling around that decision. When the psc-package manifest was made, there was a decision to drop this convention and make the package name configurable and explicit. It was an easy change, it was a non-breaking change, it didn't disrupt the PS ecosystem. But, it meant that you no longer had to create a repo prefixed with the name purescript- in order to use psc-package–and now spago. psc-package could have kept with the past decision that it would only find files within a purescript- prefixed repo, but it didn't.
That change hasn't stopped anything in the ecosystem from growing, nor has it created a bifurcation. In fact, the vast majority of repos still start with purescript-, and they have to if they want to work with pulp simultaneously. I've used this feature of psc-package in the past, and format-nix uses it now. BTW, format-nix is available on both bower and pursuit.
I understand and respect wanting to be sure that this change doesn't adversely affect the community. I think psc-package proved that allowing a manifest with more data, then subsequently using that data to make a more robust tool doesn't propagate back into the ecosystem in a negative way. The vast majority of people default to following the convention, and the one or two people that need something different can do that as well.
@f-f Given that we have an example of changing a similar convention of the past into an explicit configuration in effect for years without any negative consequences on the ecosystem, does that change your thoughts on consuming packages with files not in src?
Can you give some examples (outside of the feature being requested in this issue)?
I won't give any more examples because the way this request has played out in the past is that I attempt to justify my thoughts, and they're dismissed. I know you can be convinced of other ideas, but I don't want to spend the time convincing you of this one. I'm fine with us having different opinions.
My preferred way forward here would be to stop having packages necessarily tied to git repositories
I'm fine with that as well. Having fleshed out what the actual problem is though, allowing to specify where source files exist sounds way easier than breaking the git convention.
I won't give any more examples because the way this request has played out in the past is that I attempt to justify my thoughts, and they're dismissed. I know you can be convinced of other ideas, but I don't want to spend the time convincing you of this one. I'm fine with us having different opinions.
Having different opinions is of course fine, but I think that adding a feature which would enable the possibility of packages which can only be consumed by spago should ideally come with proper justification.
I'm fine with that as well. Having fleshed out what the actual problem is though, allowing to specify where source files exist sounds way easier than breaking the git convention.
Moving away from git will indeed be a lot of effort, but a proper package registry is very desirable for various reasons, which are discussed in https://discourse.purescript.org/t/blogged-thoughts-on-purescript-package-management/809. I think we need to do it at some point anyway.
format-nix uses it now. BTW, format-nix is available on both bower and pursuit.
Just to clarify, I did make sure format-nix can be installed and used via bower. It's registered on bower as purescript-format-nix, and installs into that directory structure and can be built with pulp accordingly:
$ fd format bower_components
bower_components/purescript-format-nix
bower_components/purescript-format-nix/src/FormatNix.js
bower_components/purescript-format-nix/src/FormatNix.purs
bower_components/purescript-prelude/src/Data/NaturalTransformation.purs
I think that adding a feature which would enable the possibility of packages which can only be consumed by spago should ideally come with proper justification.
Can we not use the same justification as what was used for psc-package's current format? As mentioned above, the psc-package format allows anyone to create a package that only lives in the psc-package/spago world; i.e. bower and pulp cannot consume it. And that possibility has existed for years.
I don't find this example particularly compelling, because for a package like that, all you would need to do is to put the purescript- prefix in the name field in your bower.json file (as @justinwoo mentioned above). The name field in bower.json isn't required to match the github repo name or the name you give to a package in a package set, so it's fine to put the purescript- prefix in bower.json but not anywhere else.
I'm sorry you're not compelled by it, but it's showing that the ecosystem can be trusted to not break everything. You just outlined how the ecosystem can still keep it together even though psc-package has provided the ability for the ecosystem to bifurcate for years. Nobody using psc-package or spago has to support bower and pulp if they don't want to. But the people that want to take the extra freedom afforded by psc-package and put some glue into their workflow can still support consumption by bower and pulp.
The community can make similar glue if their source files are allowed to be in a different place. One example is to publish from a new repo, as @f-f outlined above. Another example is to tag a commit where the files are moved to where bower and pulp expect them. I'm sure there are even more ways. If people still want to support bower and pulp, they can do that and they can do that with their source files in a different location than src for day to day work. bower doesn't have to change, pulp doesn't have to change, pulp doesn't have to grow a manifest file, pursuit doesn't have to do anything special. None of the rest of the ecosystem needs to be affected by psc-package/spago being able to consume a package with files not in src. If you don't find that compelling, I don't think anything else I can say will compel you.
Ultimately, the decision comes down to @f-f. I've made my case, discussed how the ecosystem has stayed in sync even with the ability to bifurcate. You've weighed in with your thoughts about pulp. If @f-f wants to do this, cool. I'll help in whatever capacity I can. If not, I'll figure something else out.
@f-f Given that bower changed the registry to no longer support creating new packages, and confirmed it in a follow-up issue, where do we sit on this issue? The bower/pulp workflow for new packages doesn't work anymore. Does this change in bower's usefulness in the PS ecosystem allow this issue to start moving forward?
@joneshf yes. Since we'll have control over the new registry we can (and I think we should) make this configurable, since we'll be packaging sources in a tarball anyways. We can leave this issue open or close it and move the discussion over to that repo, since the implementation details will have to be ironed out there first
Sweet! Excited to see what comes of the registry.
Maybe keep this issue open until there's a concrete way forward, if that's alright?
@joneshf there is now an issue tracking this discussion in the registry repo: https://github.com/purescript/registry/issues/16
I would say we should close this issue, as that one addresses the upstream concern and this one will come for free once that's in place - i.e. since the Registry and Spago will use the same schema, once we can publish packages with sources in a different place, then Spago will be able to use them. Makes sense?
I'm okay with that.
The current title "How to consume a package with source files not in src" was confusing to me; the original "How to consume a monorepo" makes more sense (in my view).
First, because I think the location of sources inside a package is better kept restricted, to keep things standard, and packages in general should be as simple as possible (which in many cases they currently are not).
Second, a package should definitely be decoupled from a git repo, but since the ecosystem currently relies on git repo version tags in most cases, it would be more difficult to keep things in sync with a monorepo, even leaving "bower compatibility" aside. At the same time, I think that eventually being able to pull package contents from git repo subdirectories could still be valuable, even alongside the existing registry (as not everything that could be consumed should be published in the registry).
So currently, if I had a number of separate packages that I chose to keep in a monorepo, I would make a GitHub org and push each package's code there with appropriate version tags (as @f-f already proposed here). This would definitely be a non-standard (and even kinda hacky) solution and would require some additional setup, but it could be quite OK (esp. for someone who would like to publish hundreds of packages, @joneshf ;-))
In the readme, it mentions that spago is about supporting monorepos. It seems to support producing a monorepo fairly well. However, it's not readily clear how to consume a monorepo. It seems like everything is hardcoded to expect files to exist in the src directory at the root of a git repo. Is there any plan to support files existing at a different directory? Say someone had a repo with the structure laid out in the README:
Is there any way to consume lib/src/Main.purs from outside of the repository? If the files are local to the machine, it seems like it can work. If the files are remote to the machine (like hosted on GitHub), it doesn't seem like we can currently consume it.
Any thoughts?