JuliaLang / Pkg.jl

Pkg - Package manager for the Julia programming language
https://pkgdocs.julialang.org

Feature request: Store multiple registered packages in a single Git repository #1251

Closed DilumAluthge closed 4 years ago

DilumAluthge commented 5 years ago

Currently, a single Git repository can only contain one registered package, which must live in the top folder of the Git repository.

I would like to request the ability to store more than one registered package in the same Git repository.

For example, I might have two packages Foo and Bar that I store in a single GitHub repository DilumAluthge/FooBar.git. In this case, the contents of https://github.com/DilumAluthge/FooBar.git might look something like this:

FooBar
├── Bar
│   ├── Project.toml
│   ├── src
│   │   └── Bar.jl
│   └── test
│       └── runtests.jl
└── Foo
    ├── Project.toml
    ├── src
    │   └── Foo.jl
    └── test
        └── runtests.jl

6 directories, 6 files

Alternatively, the contents of https://github.com/DilumAluthge/FooBar.git might look something like this:

FooBar
├── myfolder
│   └── Foo
│       ├── Project.toml
│       ├── src
│       │   └── Foo.jl
│       └── test
│           └── runtests.jl
└── otherfolder
    └── subfolder
        └── Bar
            ├── Project.toml
            ├── src
            │   └── Bar.jl
            └── test
                └── runtests.jl

9 directories, 6 files

Possibly related issues/pull requests:

ChrisRackauckas commented 4 years ago

Why would an AD library have to be tagged in lock step with Flux? And why would stuff start to fail if only one of these is tagged? This just looks like bad developer practice and has nothing to do with Pkg. Define a proper API and expose it and don't break stuff all the time?

🤦. It's following all of the good developer practices that were laid out for the auto-merge era. There was an update to Adapt 2.0 and an update to Zygote that wants it. When Pkg decides to serve that, everything else falls back to a version that was either before it used Adapt or before they used good versions, so you get a version of Tracker that brings the whole world down until all of the Adapt 2.0 updates go through. And any time you add a package that bounds Adapt 2.0, Pkg finds this as a valid solution.

So why does everything need to move in lock step? Because our good developer practices easily lead to these situations 🤷. I can know that this nonsense leads to 2% of DiffEq failing tests, but instead, good developer practice means that 100% of DiffEq/Turing/Pumas/etc. is unusable with the current standard registry. I think the most salient issue of the current situation is that good developer practices are unable to decouple the issues of Flux from the rest of SciML because of the hard bounds. A "meh" solution isn't even possible without coordinating 50 other packages outside SciML to join in, so instead every time Flux makes a move we just have to sit back and wait for the whole ecosystem to sort itself out again.

timholy commented 4 years ago

How do the CI triggers work?

Right now I run all tests as part of the "outer" package (SnoopCompile is not huge). There, my intention is to have everything advance in lock-step. With JuliaImages (which is bigger, though not as big as your ecosystem) I might want to have the ability to run a subset of tests, not sure. We've often thought about running tests for "meta" packages when updating focused packages and vice versa, e.g., https://github.com/JuliaImages/Images.jl/pull/468. I seem to remember others but can't find them now, maybe @johnnychen94 knows. Even though a CI run of everything would take quite a long time, it would have advantages.

And are the dependencies per sub-package?

Yes, in that SnoopCompile repo you can see each subpackage (just a subdirectory of the top-level repo) has its own Project.toml file. Most dependencies of the sub-packages are not declared as direct dependencies of the wrapper package.
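
To make the "one Project.toml per subdirectory" layout concrete, here is a minimal sketch that lists the packages found in a repository tree. It assumes Julia 1.6 or later (for the TOML standard library); `find_packages`, `SubPkgA`, `SubPkgB`, and the `subpkgs` folder are hypothetical names used only for illustration, and this is not how Pkg itself discovers packages.

```julia
using TOML  # standard library since Julia 1.6

# Map package name => subdirectory (relative to `root`) for every directory
# below `root` that holds a Project.toml with a `name` entry.
function find_packages(root::AbstractString)
    found = Dict{String,String}()
    for (dir, _, files) in walkdir(root)
        "Project.toml" in files || continue
        project = TOML.parsefile(joinpath(dir, "Project.toml"))
        haskey(project, "name") || continue
        found[project["name"]] = relpath(dir, root)
    end
    return found
end

# Build a throwaway repository layout with two hypothetical sub-packages.
root = mktempdir()
for name in ("SubPkgA", "SubPkgB")
    dir = joinpath(root, "subpkgs", name)
    mkpath(joinpath(dir, "src"))
    write(joinpath(dir, "Project.toml"), "name = \"$name\"\nversion = \"0.1.0\"\n")
end

packages = find_packages(root)
for (name, subdir) in sort(collect(packages))
    println(name, " => ", subdir)
end
```

Each reported subdirectory is the value one would pass as `subdir` when adding that package from the repository.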

johnnychen94 commented 4 years ago

@timholy Do you mean https://github.com/JuliaImages/Images.jl/issues/805 (and probably the docs https://github.com/JuliaImages/Images.jl/pull/843)

clarkevans commented 4 years ago

I attempted to explain a package-dependency use case for DataKnots -- basically, we have glue packages we'd like in a monorepo. We would like top-level documentation and CI tests inclusive of all optional modules (JSON, XML, SQL, etc.); this top-level need isn't a Julia package, or at least not one you'd want to install. It would be nice to be able to commit the entire ecosystem with a single version, even though they are modeled as separate Julia packages. Tim, is this similar to your situation?

timholy commented 4 years ago

Here's another example, one which you're familiar with @KristofferC. There's a big rewrite of Revise brewing in https://github.com/timholy/Revise.jl/pull/497. Because Revise is used by lots of people who seem to have very different workflows, I'd really like to get some beta testers before inflicting it on the world. Revise depends on CodeTracking, JuliaInterpreter, and LoweredCodeUtils, and to make the rewrite possible all of them need changes. Because we have upper bounds---and because I know the difficult process of getting testers will be even harder if I have to tell people to dev particular branches from several different repos---I say, "I know, I can use the fact that we have upper bounds and release v1.0 of CodeTracking and LoweredCodeUtils, and Pkg will handle everything automatically."

Except it doesn't. In addition to https://github.com/timholy/Revise.jl/issues/508, there's also the issue that Pkg's resolver notices the fact that it can increase the version on two packages, CodeTracking and LoweredCodeUtils, if it lowers the version of Revise to something that depends on neither. So if you happen to add CodeTracking LoweredCodeUtils Revise in a clean environment, here's what you get:

   Updating `/tmp/pkgs/environments/v1.4/Project.toml`
  [da1fd8a2] + CodeTracking v1.0.0
  [6f1432cf] + LoweredCodeUtils v1.1.2
  [295af30f] + Revise v1.0.3

Revise 1.0.3 is really old. Now, if you only add Revise this doesn't happen, but if you're a developer and you're switching between devved and freed versions of packages constantly as you ~~develop new features~~ break stuff and need to go back to a working version temporarily, you run into this all the time.

That's why git branches feel like the right way to develop new features; if Revise, JuliaInterpreter, Debugger, CodeTracking, and LoweredCodeUtils were all part of one giant git repo, I could just say "check out this one branch" and everything is done. Then when it's time to release I can release them all, but until then I don't need to trouble the rest of the world with compatibility issues.

So I'm really excited about this new feature Pkg devs have developed! I am just pointing out that I think we're still missing some tooling to make this work properly.

timholy commented 4 years ago

Tim, is this similar to your situation?

Yes, one of them, particularly our JuliaImages ecosystem. Images is basically a meta-package. There's also the issue of our documentation, which is a separate repo and deploys docs for the whole ecosystem. We've wrestled a lot with questions like "should focused packages host their own docs, or should the whole ecosystem be documented as a whole?" Again, one of the really huge things about this new development is we won't have to make that choice any more: if all the focused packages are in the same github repo as the meta package, then there's really only one documentation site.

fredrikekre commented 4 years ago

I still have not understood how this discussion relates to the possibility of having multiple packages in the same repo. It seems like all the same constraints apply to having the packages in multiple repos.

timholy commented 4 years ago

There are two separate points:

Based partly on the confusion of folks who understand Pkg better than I do, I've reconsidered how I'm thinking now, and I think the answer to the first one is "no." But, it's not how I thought about it when starting out to split SnoopCompile. The "right" way, as I think you're espousing, to do it is to (1) duplicate the code you need to create SubPkgA (important: do not delete the copy that lives in the old monolithic BigPkg), get it through CI, and register it; (2) after 3 days, duplicate the code you need to create SubPkgB, get it through CI, and register it; ... (n) after 3 days, delete the duplicated code from BigPkg, make it depend on all the subpackages, and register the new version. This works. But for me that was counterintuitive because my first thought was to aim for the final state I wanted, not craft the strategy in terms of how to get it past CI & Pkg registration. And of course it's very slow (but this is no different from doing it in separate repos).

The answer to the second question is "yes, we're definitely failing to take advantage of a huge opportunity to do better." To explain this point clearly, I've submitted #1874.

ma-laforge commented 4 years ago

It sounds like people have different views as to why sub-packages are good to have. The following describes my own use case:

Why do I want all sub-packages in a git repo to have a single "repo-wide" version number?

Because every "project" in that repo should be somewhat related, and would often build on its neighboring packages. Consequently, it is just easier to publish a single, monolithic package that will release all sub-packages simultaneously, with the same version number.

So, it appears I have much in common with @timholy in this respect.

Side-benefit of registering all subpackages simultaneously

If everything got registered simultaneously in one PR to the General registry, could I not trigger testing of all packages from the root package by adding the sub-packages to the "test" entry under "[targets]" in the root Project.toml file? That should install their dependencies correctly, and make the CI test environment ready for the sub-packages as well.

So why not just make a single package?

(Question from @fredrikekre https://github.com/JuliaLang/Pkg.jl/issues/1251#issuecomment-647103487)

Well, since the package system pulls in (downloads, builds, etc) ALL dependencies of my monolithic package, any user of my monolithic package would be forced to suffer long download, build, and compile times. I'm also pretty sure the memory footprint and time-to-first-execution of the resulting application will blow up, as well.

So I really just need these sub-packages to limit the dependency graph without having to manage a huge number of independent Git repos, commits, and PRs to the General registry. I intend to achieve this goal by separating out code that "requires" external packages into said sub-packages.

MY QUESTION:

Could this somehow be done from the "targets" section of Project.toml?????

I mean, the "test" target dependencies only just kick in when you run test, right?

Maybe we could do something similar so that using BigPkg.PkgA would trigger the build process, and import the dependencies associated with these "sub-targets" (instead of calling them subpackages)?
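
For reference, this is roughly how the existing "test" target is declared, sketched by parsing a hypothetical root Project.toml in Julia. `BigPkg` and its UUID are placeholders; the `[extras]` and `[targets]` sections are the ones Pkg already supports for test-only dependencies. The sketch assumes Julia 1.6 or later for the TOML standard library, and it does not show any "sub-target" mechanism, which does not exist.

```julia
using TOML  # standard library since Julia 1.6

# Hypothetical root Project.toml; "BigPkg" and its UUID are placeholders.
project = TOML.parse("""
name = "BigPkg"
uuid = "00000000-0000-0000-0000-000000000001"
version = "0.1.0"

[deps]
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"

[extras]
Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"

[targets]
test = ["Test"]
""")

# Installed for every user of the package.
runtime_deps = sort(collect(keys(project["deps"])))
# Only added to the environment that `Pkg.test` builds.
test_only_deps = project["targets"]["test"]

println("runtime: ", runtime_deps)
println("test only: ", test_only_deps)
```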

KristofferC commented 4 years ago

Registering multiple packages in a repo with a single command just requires some additions to the registration infrastructure. I think a globbing syntax like JuliaRegistrator register subdir=* was suggested at some point.

Bumping all project versions could be done with some script.

There still doesn't seem to be a need to tweak anything in Pkg for this.
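
The version-bumping script mentioned above could look something like the following minimal sketch. `bump_versions` is a hypothetical helper, not part of Pkg or Registrator; it assumes each Project.toml has a top-level `version = "..."` line and rewrites only that line, leaving the rest of the file untouched.

```julia
# Set `version = "..."` in every Project.toml below `root`.
# Returns the paths of the files that were actually changed.
function bump_versions(root::AbstractString, newversion::VersionNumber)
    changed = String[]
    for (dir, _, files) in walkdir(root)
        "Project.toml" in files || continue
        path = joinpath(dir, "Project.toml")
        text = read(path, String)
        # Rewrite only the first line that starts with `version =`.
        updated = replace(text,
            r"^version\s*=\s*\"[^\"]*\""m => "version = \"$newversion\"";
            count = 1)
        if updated != text
            write(path, updated)
            push!(changed, path)
        end
    end
    return changed
end

# Try it on a throwaway layout with two hypothetical packages.
root = mktempdir()
for (name, version) in (("Foo", "0.1.0"), ("Bar", "0.3.2"))
    mkpath(joinpath(root, name))
    write(joinpath(root, name, "Project.toml"),
          "name = \"$name\"\nversion = \"$version\"\n")
end

changed = bump_versions(root, v"1.0.0")
foo_text = read(joinpath(root, "Foo", "Project.toml"), String)
bar_text = read(joinpath(root, "Bar", "Project.toml"), String)
println(length(changed), " files updated")
```

A plain text substitution is used here rather than parsing and re-printing the TOML, so that key order and comments in the files are preserved.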

ma-laforge commented 4 years ago

@KristofferC: You might be right.

I guess my complaint at the moment stems from the fact that I seem to be unable to develop my new multi-package module locally on my computer. That's why I described my use case above. I keep running into issues where the package manager appears to need my module to be registered in JuliaRegistries/General for things to run smoothly, for some reason.

(I am using Julia 1.5-rc1)

GunnarFarneback commented 4 years ago

The bug originates from the fact that the add() function reads LOCALPATH/RootPkg/Project.toml and sees the wrong project name.

If there is a bug in the subdir functionality we want to find and fix it before 1.5 is released. What are the exact steps to reproduce it?

KristofferC commented 4 years ago

I guess my complaint at the moment stems from the fact that I seem to be unable to develop my new multi-package module locally on my computer.

I don't get that. Just dev all the packages. Just like always.

Could you make a MWE that shows the problem?

ma-laforge commented 4 years ago

Yes, I saw the one-liner for dev-ing all packages that @timholy posted on the other issue. I haven't tried that yet.

Note that it does seem counter-intuitive to have to dev all packages when you technically shouldn't be aware that your repo hosts multiple packages in the currently supported workflow.

I'll try creating a MWE for you, though. I got the vague impression it was already known. Sorry.

ma-laforge commented 4 years ago

Ok. I built a sample multi-package repo on Github, to illustrate my MWE.

You can add PkgTestRoot, SubPkgA, and SubPkgB using the following:

Pkg.add(path="https://github.com/ma-laforge/PkgTestRoot.jl")
Pkg.add(path="https://github.com/ma-laforge/PkgTestRoot.jl", subdir="subpkgs/SubPkgA.jl")
Pkg.add(path="https://github.com/ma-laforge/PkgTestRoot.jl", subdir="subpkgs/SubPkgB.jl")

Now, if I want to modify one of the sub-packages, I would logically do the following:

]dev SubPkgA

But in Julia 1.5.0-rc1.0, I get the following error:

ERROR: name `PkgTestRoot` given by project file `/home/laforge/.julia/dev/SubPkgA/Project.toml` does not match given name `SubPkgA`

So it "dev"s the entire repo under the name SubPkgA, then assumes (incorrectly) that the root Project.toml file corresponds to that sub package. That's why it sees a name mismatch. It should have checked the other Project.toml file.
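
The mismatch can be illustrated with a small stand-alone sketch. This is not Pkg's actual code; it only recreates the repository layout in a temporary directory and compares the requested name against the `name` field of the root Project.toml versus the one in the subdirectory. `project_name` is a hypothetical helper, and the TOML standard library requires Julia 1.6 or later.

```julia
using TOML  # standard library since Julia 1.6

# Read the `name` field from the Project.toml in `dir`.
project_name(dir) = TOML.parsefile(joinpath(dir, "Project.toml"))["name"]

# Recreate the layout of the example repo in a temporary directory.
clone = mktempdir()
subdir = joinpath(clone, "subpkgs", "SubPkgA.jl")
mkpath(subdir)
write(joinpath(clone, "Project.toml"), "name = \"PkgTestRoot\"\n")
write(joinpath(subdir, "Project.toml"), "name = \"SubPkgA\"\n")

requested = "SubPkgA"
root_name = project_name(clone)   # the file that, per the diagnosis above, gets checked
sub_name = project_name(subdir)   # the file that should be checked instead

println("root Project.toml matches:   ", root_name == requested)
println("subdir Project.toml matches: ", sub_name == requested)
```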

GunnarFarneback commented 4 years ago

Thanks, I can reproduce this.

So the scenario here is developing an unregistered subdir package by name, which is known to Pkg after having been added by URL. Clearly the code is not doing the right thing here and I've opened #1925 to track it. Meanwhile, the easiest workaround is probably to develop the subdir URL shown by Pkg.status:

(@v1.5) pkg> st
Status `/tmp/foo16/environments/v1.5/Project.toml`
  [f50ccdc4] PkgTestRoot v0.1.0 `https://github.com/ma-laforge/PkgTestRoot.jl#master`
  [2176828e] SubPkgA v0.1.0 `https://github.com/ma-laforge/PkgTestRoot.jl:subpkgs/SubPkgA.jl#master`
  [b3c70a09] SubPkgB v0.1.0 `https://github.com/ma-laforge/PkgTestRoot.jl:subpkgs/SubPkgB.jl#master`

(@v1.5) pkg> dev https://github.com/ma-laforge/PkgTestRoot.jl:subpkgs/SubPkgA.jl

ma-laforge commented 4 years ago

Thanks @GunnarFarneback. Yes that appears to work!