JuliaLang / Pkg.jl

Pkg - Package manager for the Julia programming language
https://pkgdocs.julialang.org

Feature request: Store multiple registered packages in a single Git repository #1251

Closed DilumAluthge closed 4 years ago

DilumAluthge commented 5 years ago

Currently, a single Git repository can only contain one registered package, which must live in the top folder of the Git repository.

I would like to request the ability to store more than one registered package in the same Git repository.

For example, I might have two packages Foo and Bar that I store in a single GitHub repository DilumAluthge/FooBar.git. In this case, the contents of https://github.com/DilumAluthge/FooBar.git might look something like this:

FooBar
├── Bar
│   ├── Project.toml
│   ├── src
│   │   └── Bar.jl
│   └── test
│       └── runtests.jl
└── Foo
    ├── Project.toml
    ├── src
    │   └── Foo.jl
    └── test
        └── runtests.jl

6 directories, 6 files

Alternatively, the packages might be nested at different depths, so the contents of https://github.com/DilumAluthge/FooBar.git might instead look something like this:

FooBar
├── myfolder
│   └── Foo
│       ├── Project.toml
│       ├── src
│       │   └── Foo.jl
│       └── test
│           └── runtests.jl
└── otherfolder
    └── subfolder
        └── Bar
            ├── Project.toml
            ├── src
            │   └── Bar.jl
            └── test
                └── runtests.jl

9 directories, 6 files

Possibly related issues/pull requests:

StefanKarpinski commented 5 years ago

As far as installing packages goes, this already works. Where is the support lacking? Real question: what doesn't work when you try it?

DilumAluthge commented 5 years ago

I've set up a working example.


Minimum working example

The Git repository FooBar that contains both the Foo.jl and Bar.jl packages:

The registry TestRegistry:

Pkg.add works:

Each of the following code blocks runs successfully.

julia> using Pkg
julia> Pkg.Registry.add("General")
julia> Pkg.Registry.add(Pkg.RegistrySpec(url="https://github.com/DilumAluthge/TestRegistry.git"))
julia> Pkg.add("Foo")
julia> using Foo
julia> Foo.double(5)
julia> Pkg.test("Foo")

julia> using Pkg
julia> Pkg.Registry.add("General")
julia> Pkg.Registry.add(Pkg.RegistrySpec(url="https://github.com/DilumAluthge/TestRegistry.git"))
julia> Pkg.add("Bar")
julia> using Bar
julia> Bar.triple(5)
julia> Pkg.test("Bar")

julia> using Pkg
julia> Pkg.Registry.add("General")
julia> Pkg.Registry.add(Pkg.RegistrySpec(url="https://github.com/DilumAluthge/TestRegistry.git"))
julia> Pkg.add("Foo")
julia> Pkg.add("Bar")
julia> using Foo
julia> using Bar
julia> Foo.double(5)
julia> Bar.triple(5)
julia> Pkg.test("Foo")
julia> Pkg.test("Bar")

Pkg.develop does not work:

Pkg.develop("Foo") does not work:

julia> using Pkg

julia> Pkg.Registry.add("General")
   Cloning registry from "https://github.com/JuliaRegistries/General.git"
     Added registry `General` to `~/.julia/registries/General`

julia> Pkg.Registry.add(Pkg.RegistrySpec(url="https://github.com/DilumAluthge/TestRegistry.git"))
   Cloning registry from "https://github.com/DilumAluthge/TestRegistry.git"
     Added registry `TestRegistry` to `~/.julia/registries/TestRegistry`

julia> Pkg.develop("Foo")
  Updating registry at `~/.julia/registries/General`
  Updating git-repo `https://github.com/JuliaRegistries/General.git`
  Updating registry at `~/.julia/registries/TestRegistry`
  Updating git-repo `https://github.com/DilumAluthge/TestRegistry.git`
   Cloning git-repo `https://github.com/DilumAluthge/FooBar.git`
  Updating git-repo `https://github.com/DilumAluthge/FooBar.git`
 Resolving package versions...
  Updating `~/.julia/environments/v1.1/Project.toml`
  [ac29586c] + Foo v0.1.0+ [`~/.julia/dev/Foo`]
  Updating `~/.julia/environments/v1.1/Manifest.toml`
  [ac29586c] + Foo v0.1.0+ [`~/.julia/dev/Foo`]
  [2a0f44e3] + Base64
  [8ba89e20] + Distributed
  [b77e0a4c] + InteractiveUtils
  [56ddb016] + Logging
  [d6f4376e] + Markdown
  [9a3f8284] + Random
  [9e88b42a] + Serialization
  [6462fe0b] + Sockets
  [8dfed614] + Test

julia> using Foo
ERROR: ArgumentError: Package Foo [ac29586c-a683-11e9-2825-255c508e5462] is required but does not seem to be installed:
 - Run `Pkg.instantiate()` to install all recorded dependencies.

Stacktrace:
 [1] _require(::Base.PkgId) at ./loading.jl:929
 [2] require(::Base.PkgId) at ./loading.jl:858
 [3] require(::Module, ::Symbol) at ./loading.jl:853

julia> Pkg.test("Foo")
  Updating registry at `~/.julia/registries/General`
  Updating git-repo `https://github.com/JuliaRegistries/General.git`
  Updating registry at `~/.julia/registries/TestRegistry`
  Updating git-repo `https://github.com/DilumAluthge/TestRegistry.git`
ERROR: Package Foo did not provide a `test/runtests.jl` file
Stacktrace:
 [1] pkgerror(::String, ::Vararg{String,N} where N) at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.1/Pkg/src/Types.jl:120
 [2] #test#66(::Bool, ::Function, ::Pkg.Types.Context, ::Array{Pkg.Types.PackageSpec,1}) at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.1/Pkg/src/Operations.jl:1283
 [3] #test at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.1/Pkg/src/API.jl:0 [inlined]
 [4] #test#46(::Bool, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Pkg.Types.Context, ::Array{Pkg.Types.PackageSpec,1}) at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.1/Pkg/src/API.jl:198
 [5] test at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.1/Pkg/src/API.jl:183 [inlined]
 [6] #test#45 at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.1/Pkg/src/API.jl:180 [inlined]
 [7] test at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.1/Pkg/src/API.jl:180 [inlined]
 [8] #test#44 at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.1/Pkg/src/API.jl:179 [inlined]
 [9] test at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.1/Pkg/src/API.jl:179 [inlined]
 [10] #test#43(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::String) at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.1/Pkg/src/API.jl:178
 [11] test(::String) at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.1/Pkg/src/API.jl:178
 [12] top-level scope at none:0

Pkg.develop("Bar") does not work:

julia> using Pkg

julia> Pkg.Registry.add("General")
   Cloning registry from "https://github.com/JuliaRegistries/General.git"
     Added registry `General` to `~/.julia/registries/General`

julia> Pkg.Registry.add(Pkg.RegistrySpec(url="https://github.com/DilumAluthge/TestRegistry.git"))
   Cloning registry from "https://github.com/DilumAluthge/TestRegistry.git"
     Added registry `TestRegistry` to `~/.julia/registries/TestRegistry`

julia> Pkg.develop("Bar")
  Updating registry at `~/.julia/registries/General`
  Updating git-repo `https://github.com/JuliaRegistries/General.git`
  Updating registry at `~/.julia/registries/TestRegistry`
  Updating git-repo `https://github.com/DilumAluthge/TestRegistry.git`
   Cloning git-repo `https://github.com/DilumAluthge/FooBar.git`
  Updating git-repo `https://github.com/DilumAluthge/FooBar.git`
 Resolving package versions...
  Updating `~/.julia/environments/v1.1/Project.toml`
  [f73876a8] + Bar v0.1.0+ [`~/.julia/dev/Bar`]
  Updating `~/.julia/environments/v1.1/Manifest.toml`
  [f73876a8] + Bar v0.1.0+ [`~/.julia/dev/Bar`]
  [2a0f44e3] + Base64
  [8ba89e20] + Distributed
  [b77e0a4c] + InteractiveUtils
  [56ddb016] + Logging
  [d6f4376e] + Markdown
  [9a3f8284] + Random
  [9e88b42a] + Serialization
  [6462fe0b] + Sockets
  [8dfed614] + Test

julia> using Bar
ERROR: ArgumentError: Package Bar [f73876a8-a683-11e9-3353-27bcedd920b0] is required but does not seem to be installed:
 - Run `Pkg.instantiate()` to install all recorded dependencies.

Stacktrace:
 [1] _require(::Base.PkgId) at ./loading.jl:929
 [2] require(::Base.PkgId) at ./loading.jl:858
 [3] require(::Module, ::Symbol) at ./loading.jl:853

julia> Pkg.test("Bar")
  Updating registry at `~/.julia/registries/General`
  Updating git-repo `https://github.com/JuliaRegistries/General.git`
  Updating registry at `~/.julia/registries/TestRegistry`
  Updating git-repo `https://github.com/DilumAluthge/TestRegistry.git`
ERROR: Package Bar did not provide a `test/runtests.jl` file
Stacktrace:
 [1] pkgerror(::String, ::Vararg{String,N} where N) at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.1/Pkg/src/Types.jl:120
 [2] #test#66(::Bool, ::Function, ::Pkg.Types.Context, ::Array{Pkg.Types.PackageSpec,1}) at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.1/Pkg/src/Operations.jl:1283
 [3] #test at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.1/Pkg/src/API.jl:0 [inlined]
 [4] #test#46(::Bool, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Pkg.Types.Context, ::Array{Pkg.Types.PackageSpec,1}) at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.1/Pkg/src/API.jl:198
 [5] test at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.1/Pkg/src/API.jl:183 [inlined]
 [6] #test#45 at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.1/Pkg/src/API.jl:180 [inlined]
 [7] test at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.1/Pkg/src/API.jl:180 [inlined]
 [8] #test#44 at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.1/Pkg/src/API.jl:179 [inlined]
 [9] test at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.1/Pkg/src/API.jl:179 [inlined]
 [10] #test#43(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::String) at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.1/Pkg/src/API.jl:178
 [11] test(::String) at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.1/Pkg/src/API.jl:178
 [12] top-level scope at none:0

@JuliaRegistrator register does not work:

Commenting @JuliaRegistrator register on a commit in FooBar does not work. It gives the following error: Error while trying to register: File Project.toml not found. See here: https://github.com/DilumAluthge/FooBar/commit/9c7e8cf8c2d3e3a022fe4292e64171892920b265

StefanKarpinski commented 5 years ago

Thanks @DilumAluthge, that helps push this forward considerably. So the question with Pkg.develop is: What do we want this to do? What should developing a package that is not at the root of its repo do? For Registrator, I think it's pretty straightforward: we should look for project files anywhere in the repo. However, if there's more than one, which one is supposed to be registered? I had originally envisioned that Registrator would work by just looking for commits changing the version number in any project file and then considering that commit to be a trigger for registering the new version of the package where the project file lives. @nkottaary, @KristofferC, any thoughts?

DilumAluthge commented 5 years ago

What should developing a package that is not at the root of its repo do?

For developing packages that are not at the root of the repo, we could require that the user provide the path (relative from the repo root) to the package. This could be a new field in Pkg.Types.PackageSpec named subdirectory:

Pkg.develop(Pkg.PackageSpec(name = "Foo", subdirectory = "myfolder"))
Pkg.develop(Pkg.PackageSpec(name = "Bar", subdirectory = "otherfolder/subfolder"))

Alternatively, it could be a keyword argument to Pkg.develop:

Pkg.develop("Foo"; subdirectory = "myfolder")
Pkg.develop("Bar"; subdirectory = "otherfolder/subfolder")

For Registrator, I think it's pretty straightforward: we should look for project files anywhere in the repo. However, if there's more than one, which one is supposed to be registered?

For registrator, if the project file is not located in the project root, we could similarly require the user to supply the location of the package within the repo. For example, this would register Foo.jl:

@JuliaRegistrator register branch=master subdirectory=myfolder

And this would register Bar.jl:

@JuliaRegistrator register branch=master subdirectory=otherfolder/subfolder

fredrikekre commented 5 years ago

For developing packages that are not at the root of the repo, we could require that the user provide the path (relative from the repo root) to the package. This could be a new field in Pkg.Types.PackageSpec named subdirectory:

It should be relatively easy to find the subdirectory by looking for the correct Project.toml file.

Edit: I guess just a URL is not enough, but url + name?

Pkg.develop(PackageSpec(name = "Foo", url = "https......"))

KristofferC commented 5 years ago

Thanks for the excellent summary @DilumAluthge. I personally like the explicitness of providing the relative path to the package in question. Like you said, this could be used both for Registrator and develop.

StefanKarpinski commented 5 years ago

Aside from the keyword subdirectory being a bit long, that seems like a good option. Maybe dir= or path= or root= or subdir=? We could also support passing name= to find the project file by name. These could be alternative ways to specify. (If someone passes both, error if the package at given root doesn't have the expected name?)

Alternatively, what if we used URL fragment identifiers here? I.e.:

https://github.com/user/repo#path/to/package
https://github.com/user/repo#name

We could interpret the fragment as a path first, look at that path for a project file; failing that, if the name doesn't have / in it, look through all the project files for one with the fragment as a name.
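A minimal sketch of that lookup order, assuming the repository is already checked out locally. The names `find_project_dirs` and `resolve_fragment` are hypothetical and not part of Pkg; `TOML` is the standard-library parser shipped with Julia 1.6 and later.

```julia
using TOML

# Collect every directory under `repo` that contains a Project.toml,
# as paths relative to `repo` ("." for the repo root).
function find_project_dirs(repo::AbstractString)
    dirs = String[]
    for (root, _, files) in walkdir(repo)
        if "Project.toml" in files
            push!(dirs, relpath(root, repo))
        end
    end
    return sort!(dirs)
end

# Hypothetical resolution: treat `fragment` as a path first; failing that,
# and only if it contains no '/', treat it as a package name.
function resolve_fragment(repo::AbstractString, fragment::AbstractString)
    if isfile(joinpath(repo, fragment, "Project.toml"))
        return String(fragment)
    end
    occursin('/', fragment) && return nothing
    for dir in find_project_dirs(repo)
        project = TOML.parsefile(joinpath(repo, dir, "Project.toml"))
        get(project, "name", nothing) == fragment && return dir
    end
    return nothing
end
```

With the second layout from the top of this issue, `resolve_fragment(repo, "Bar")` would fall through the path check and find `otherfolder/subfolder/Bar` by name. Returning `nothing` leaves it to the caller to decide how to report a fragment that matches neither form.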

KristofferC commented 5 years ago

We kind of already use # to mean a branch, as in add url#branch.

I think it is fine to start simple with only providing the subdir option and then if we feel the need we can add more automagic ways.

tkf commented 5 years ago

How about using the query string?

https://github.com/user/repo?subdir=path/to/package
https://github.com/user/repo?name=Foo

(I'm borrowing the idea from pip https://pip.pypa.io/en/stable/reference/pip_install/#vcs-support )
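A rough sketch of how such a URL could be split into the repository URL and its parameters. `split_query` is a hypothetical helper, not something Pkg provides, and it assumes the values need no percent-decoding.

```julia
# Split "https://host/user/repo?subdir=a/b&name=Foo" into the bare URL
# and a Dict of query parameters. No percent-decoding is attempted.
function split_query(url::AbstractString)
    parts = split(url, '?'; limit = 2)
    base = String(parts[1])
    params = Dict{String,String}()
    length(parts) == 2 || return base, params
    for pair in split(parts[2], '&'; keepempty = false)
        kv = split(pair, '='; limit = 2)
        params[String(kv[1])] = length(kv) == 2 ? String(kv[2]) : ""
    end
    return base, params
end
```

Calling `split_query("https://github.com/user/repo?subdir=path/to/package")` yields the plain repository URL plus a `subdir` entry, which keeps `#` free for the existing branch syntax.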

clarkevans commented 5 years ago

This is a feature we'd also like to have. For DataKnots we'll be growing an ecosystem of adapters of various sorts. Each of them has glue code, and it'd be rather silly to have a separate repository for each, since that complicates releases and makes it harder to find what you are after. Has there been any progress lately on making this work? This topic came up last year on the forums.

StefanKarpinski commented 5 years ago

Marking for triage to be discussed on the next Pkg call so that we can make a decision here. This shouldn't be too hard to do, we just need to decide how to express it.

clarkevans commented 5 years ago

Hurray... there is a WIP, https://github.com/JuliaLang/Pkg.jl/pull/1422 ;)

fingolfin commented 4 years ago

And hurray, https://github.com/JuliaLang/Pkg.jl/pull/1422 was merged! Progress! I am really hoping this can make it into the next LTS

Mentors4EDU commented 4 years ago

Also, referencing pull request #11169

KristofferC commented 4 years ago

Implemented

GunnarFarneback commented 4 years ago

Some further pointers:

racinmat commented 4 years ago

Hi, it's great to see this implemented. I have a few questions about it:

KristofferC commented 4 years ago

is registration of subdirectories supported by WebUI of Registrator.jl?

I don't think so.

is it supported by CompatHelper?

Looks like it https://github.com/bcbi/CompatHelper.jl/pull/195

how does TagBot handle this

Not sure, there is some discussion about it here https://github.com/JuliaRegistries/Registrator.jl/issues/230#issuecomment-612683267.

fingolfin commented 4 years ago

Regarding TagBot, the plan is to use tags of the form "PACKAGENAME-v1.2.3" for packages in subdirs; see https://github.com/JuliaRegistries/Registrator.jl/issues/230#issuecomment-612689810 and https://github.com/JuliaRegistries/Registrator.jl/pull/287#issuecomment-618758236
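To make the naming scheme concrete, here is a small sketch that builds and parses such tags. The helpers `tag_name` and `parse_tag` are hypothetical, and the plain "v1.2.3" form for packages at the repo root is an assumption, not something stated in this thread.

```julia
# Build the tag name for a release. Subdirectory packages get a "Name-"
# prefix; root packages are assumed to keep the plain "v1.2.3" form.
function tag_name(pkg::AbstractString, version::VersionNumber; in_subdir::Bool = true)
    return in_subdir ? "$(pkg)-v$(version)" : "v$(version)"
end

# Parse a tag back into (name, version); name is `nothing` for plain tags.
# Returns `nothing` if the tag does not look like a version tag at all.
function parse_tag(tag::AbstractString)
    m = match(r"^(?:(.+)-)?v(\d+\.\d+\.\d+.*)$", tag)
    m === nothing && return nothing
    name = m.captures[1] === nothing ? nothing : String(m.captures[1])
    return name, VersionNumber(m.captures[2])
end
```

Prefixing the package name keeps tags unique when several packages in one repository reach the same version number.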

I don't think WebUI support for subdirs is there (yet), based on https://github.com/JuliaRegistries/Registrator.jl/pull/287#issuecomment-619398401 but that could come later

I don't know about CompatHelper (it may already have been discussed somewhere, and I missed it). But given that e.g. @DilumAluthge has commented on this issue, I am hopeful it'll support this eventually :-) UPDATE: as @KristofferC points out (our comments crossed), this is apparently done in https://github.com/bcbi/CompatHelper.jl/pull/195 :-)

GunnarFarneback commented 4 years ago

This is included in Julia v1.5.0-beta1. Please try it out to see if it works as expected.

clarkevans commented 4 years ago

OK. I'd love to test it. How? So, we have two packages, DataKnots.jl and DataKnots4Postgres.jl; currently each is in its own git repository. I verified both work if installed separately. Then, I removed them. Then I created a pkg folder in DataKnots.jl and moved the DataKnots4Postgres code over, removing the .git sub-directory since the idea is to permit a mono-repository. Other than that, I made no changes (should I?). It seems develop groks sub-packages and testing works just fine. What do I do next?

$ ls DataKnots.jl/
CODE_OF_CONDUCT.md  LICENSE.md  pkg           README.md  test
doc                 NEWS.md     Project.toml  src
$ ls DataKnots.jl/pkg/DataKnots4Postgres/
doc  LICENSE.md  NEWS.md  Project.toml  README.md  src  test
(@v1.5) pkg> develop /home/cce/DataKnots.jl
Path `/home/cce/DataKnots.jl` exists and looks like the correct package. Using existing path.
  Resolving package versions...
Updating `~/.julia/environments/v1.5/Project.toml`
  [f3f2b2ad] ~ DataKnots v0.10.0 `/home/cce/DataKnots.jl#subpkg` ⇒ v0.10.0 `~/DataKnots.jl`
Updating `~/.julia/environments/v1.5/Manifest.toml`
  [f3f2b2ad] ~ DataKnots v0.10.0 `/home/cce/DataKnots.jl#subpkg` ⇒ v0.10.0 `~/DataKnots.jl`

(@v1.5) pkg> develop /home/cce/DataKnots.jl/pkg/DataKnots4Postgres/
Path `/home/cce/DataKnots.jl/pkg/DataKnots4Postgres/` exists and looks like the correct package. Using existing path.
  Resolving package versions...
Updating `~/.julia/environments/v1.5/Project.toml`
  [ee1fa34a] + DataKnots4Postgres v0.1.0 `~/DataKnots.jl/pkg/DataKnots4Postgres`
Updating `~/.julia/environments/v1.5/Manifest.toml`
  [ee1fa34a] + DataKnots4Postgres v0.1.0 `~/DataKnots.jl/pkg/DataKnots4Postgres`

(@v1.5) pkg> ^C
julia> using DataKnots
[ Info: Precompiling DataKnots [f3f2b2ad-91c8-5588-b964-d77e2d3bb090]
using Data
julia> using DataKnots4Postgres
[ Info: Precompiling DataKnots4Postgres [ee1fa34a-8a34-11e9-034b-7bfccbfac8bd]

$ PGHOST=/var/run/postgresql julia pkg/DataKnots4Postgres/test/runtests.jl 
Tests passed: 8
TESTING SUCCESSFUL!

I'll need to set up a local registry to test the next parts?

GunnarFarneback commented 4 years ago

Then I created a pkg folder in DataKnots.jl and moved the DataKnots4Postgres code over, removing the .git sub-directory since the idea is to permit a mono-repository. Other than that, I made no changes (should I?).

You should add the new folder to the top level git repository, if you haven't already.

I'll need to set up a local registry to test the next parts?

It certainly becomes more interesting when you have the packages in a registry, see https://github.com/GunnarFarneback/LocalRegistry.jl/blob/master/docs/subdir.md for how to do it with LocalRegistry.

You can also try to Pkg.add it from your local path with

pkg> add /home/cce/DataKnots.jl:pkg/DataKnots4Postgres/

(Notice the colon that tells Pkg what part is the git directory and what part is the subdirectory. With develop this distinction isn't really relevant and it should work with both colon and slash.)

clarkevans commented 4 years ago

OK. I'll read more about local registries. With regard to Pkg.add this didn't seem to work.

(@v1.5) pkg> add /home/cce/DataKnots.jl:pkg/DataKnots4Postgres
   Updating git-repo `/home/cce/DataKnots.jl`
ERROR: Did not find subdirectory `pkg/DataKnots4Postgres`

Update: after the sub-package was committed, Pkg.add worked.

(@v1.5) pkg> add /home/cce/DataKnots.jl:pkg/DataKnots4Postgres
   Updating git-repo `/home/cce/DataKnots.jl`
  Resolving package versions...
Updating `~/.julia/environments/v1.5/Project.toml`
  [ee1fa34a] + DataKnots4Postgres v0.1.0 `/home/cce/DataKnots.jl:pkg/DataKnots4Postgres#subpkg`
Updating `~/.julia/environments/v1.5/Manifest.toml`
  [ee1fa34a] + DataKnots4Postgres v0.1.0 `/home/cce/DataKnots.jl:pkg/DataKnots4Postgres#subpkg`
(@v1.5) pkg> test DataKnots4Postgres
    Testing DataKnots4Postgres
...
TESTING SUCCESSFUL!
    Testing DataKnots4Postgres tests passed 

Thank you for all the help. Having a monorepo will be a huge improvement.

KristofferC commented 4 years ago

You rarely want to use add for local packages. If you do, they act like normal git repositories so you need to have committed all your changes. Here you could just use dev /home/cce/DataKnots.jl/pkg/DataKnots4Postgres/

GunnarFarneback commented 4 years ago

Yes, the problem with the add from path is that pkg has not been added and committed to the top level git repository. This will be necessary in order to register the subdir package with LocalRegistry as well, so if nothing else it was a good way to detect a coming problem.

clarkevans commented 4 years ago

So, I'm not sure how to create a local registry. For starters, I'm pretty sure I'm not clear what create_registry actually does or why it requires two arguments. Please pardon my ignorance...

# create a brand new rbt-lang/TemporaryPackageRegistry on github... 
julia> using LocalRegistry
julia> create_registry("Testing", "git@github.com:rbt-lang/TemporaryPackageRegistry.git")
[ Info: Created registry in directory /home/cce/.julia/registries/Testing
"/home/cce/.julia/registries/Testing"

$ cd ~/.julia/registries/Testing
$ git add Registry.toml 
$ git commit -m "Empty registry" Registry.toml 
$ git push --set-upstream origin master
Enumerating objects: 3, done.
Counting objects: 100% (3/3), done.
Delta compression using up to 8 threads
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 344 bytes | 344.00 KiB/s, done.
Total 3 (delta 0), reused 0 (delta 0)
To github.com:rbt-lang/TemporaryPackageRegistry.git
 * [new branch]      master -> master
Branch 'master' set up to track remote branch 'master' from 'origin'.

julia> register("/home/cce/DataKnots.jl", "Testing")
┌ Info: Registering package
│   package_path = "/home/cce/DataKnots.jl"
│   registry_path = "/home/cce/.julia/registries/Testing"
│   package_repo = "git@github.com:rbt-lang/DataKnots.jl.git"
│   uuid = UUID("f3f2b2ad-91c8-5588-b964-d77e2d3bb090")
│   version = v"0.10.0"
│   tree_hash = "35ce27b408c87ba4d178b33233db834371912175"
└   subdir = ""

julia> register("/home/cce/DataKnots.jl/pkg/DataKnots4Postgres", "Testing")
┌ Info: Registering package
│   package_path = "/home/cce/DataKnots.jl/pkg/DataKnots4Postgres"
│   registry_path = "/home/cce/.julia/registries/Testing"
│   package_repo = "git@github.com:rbt-lang/DataKnots.jl.git"
│   uuid = UUID("ee1fa34a-8a34-11e9-034b-7bfccbfac8bd")
│   version = v"0.1.0"
│   tree_hash = "8e6b8c892fae103a783de5f82b904a3e39f82d48"
└   subdir = "pkg/DataKnots4Postgres"

$ cd /home/cce/.julia/registries/Testing && git push
Enumerating objects: 20, done.
Counting objects: 100% (20/20), done.
Delta compression using up to 8 threads
Compressing objects: 100% (17/17), done.
Writing objects: 100% (18/18), 2.16 KiB | 2.16 MiB/s, done.
Total 18 (delta 1), reused 0 (delta 0)
remote: Resolving deltas: 100% (1/1), done.
To github.com:rbt-lang/TemporaryPackageRegistry.git
   27e042d..7ab46e9  master -> master

(@v1.5) pkg> remove DataKnots
...
(@v1.5) pkg> remove DataKnots4Postgres
...

(@v1.5) pkg> add DataKnots
   Updating registry at `~/.julia/registries/General`
   Updating registry at `~/.julia/registries/Testing`
   Updating git-repo `git@github.com:rbt-lang/TemporaryPackageRegistry.git`
Private key location for 'git@github.com':           
┌ Warning: Some registries failed to update:
│     — /home/cce/.julia/registries/Testing — failed to fetch from repo
└ @ Pkg.Types /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Pkg/src/Types.jl:1194
  Resolving package versions...
ERROR: hash mismatch

Perhaps this has to do with me registering the same DataKnots version...

GunnarFarneback commented 4 years ago

So, I'm not sure how to create a local registry. For starters, I'm pretty sure I'm not clear what create_registry actually does or why it requires two arguments. Please pardon my ignorance...

It creates an empty registry. It needs a name so you can refer to it and it needs a repo url since the general idea of a registry is that it should be a shared resource. For throw-away experiments you can use a file:// url on your local machine (pointing to a directory initialized with git init --bare) as an upstream repository.
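A minimal sketch of the throw-away setup, using the LibGit2 standard library in place of the git init --bare command line. `make_bare_upstream` is a hypothetical helper, and the URL construction assumes a Unix-like absolute path.

```julia
using LibGit2

# Create a bare Git repository in a fresh temporary directory and return
# a file:// URL that can serve as a throw-away upstream.
function make_bare_upstream()
    path = mktempdir()
    repo = LibGit2.init(path, true)   # `true` requests a bare repository
    close(repo)
    return "file://" * path
end

url = make_bare_upstream()
println(url)
# With LocalRegistry loaded, the call shown earlier in this thread
# would then take this URL as its second argument:
# create_registry("Testing", url)
```

Since `mktempdir` directories are removed when the Julia process exits, this only suits experiments within a single session.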

Perhaps this has to do with me registering the same DataKnots version...

Yes, now you have DataKnots 0.10.0 registered both in your local registry and in General. That might be okay (not sure, haven't tried it and wouldn't recommend it) if it wasn't for the fact that they are registered with different tree hashes (i.e. contents, or in practical terms at different commits). For this experiment I would suspect that the easiest thing is to skip DataKnots from your local registry since it's already available in General and concentrate on the subdir package.

GunnarFarneback commented 4 years ago

By the way, for a monorepo approach it's likely better to have all packages in subdirectories, since otherwise the top package will include the code of all other packages.

clarkevans commented 4 years ago

Gunnar: Thanks. So, I think I'll call this test successful enough; I can do subrepositories and install them locally using "dev" or with "add". The hash conflict makes sense, since both packages are already registered... I'll also take your subdirectory suggestion under advisement. The issue is that our documentation kinda wants to be tests for the main package... so, it's inconvenient to move that down. I suppose it's only a copy of the various glue modules, that's not too much of a penalty.

timholy commented 4 years ago

Really looking forward to this! I've read the discussion here and in some of the links, but I'm still a bit confused about registering, compat bounds, and CI tests. Here's what I want to do: suppose I currently have BigPkg but I decide I need to provide access to some of the functionality without loading the entire package. So I split it into two sub-packages, PartA and PartB. BigPkg still exists, but now as a "meta-package" that loads and re-exports PartA and PartB, though either can alternatively be loaded individually. A concrete example is https://github.com/timholy/SnoopCompile.jl/pull/98.

So here are some questions/issues/observations:

aminya commented 4 years ago

To add some other points to @timholy's list:

fredrikekre commented 4 years ago

BigPkg no longer has any dependencies of its own.

If it doesn't depend on PkgA and PkgB I don't understand what the problem is? If you don't depend on DataFrames you don't have to care about DataFrames compat.

I don't quite understand the rest of the points; in particular, I don't understand how keeping them in the same repo is different from keeping them in different repos. The same thing applies, e.g. you need to add by URL to use unregistered packages etc.

timholy commented 4 years ago

If it doesn't depend on PkgA and PkgB

Well, now it does, but they are not (yet) registered. The "final" state that you want has BigPkg/Project.toml that looks like

[deps]
PkgA = "uuidPkgA"
PkgB = "uuidPkgB"

but PkgA and PkgB are also being defined for the first time in the same PR. When I submit the PR, the first thing that happens is that Pkg tries to install it, but it can't find those dependencies so it barfs.

In contrast, if they are 3 separate repos then I first submit PkgA and PkgB, get them registered, and then submit the change in BigPkg to have it depend on them. The difference is the simultaneity. Mind you, I want that simultaneity: currently there are lots of hassles with cross-package development, and I look forward to a day where I can have one giant repo for, say, most of the JuliaImages code.

The deps workaround uses, for example, an empty [deps] section in BigPkg/Project.toml. That allows Pkg to install the package. Then the build script runs, and by devving the sub-packages then things can work. The awkwardness is (1) this is not documented, and (2) it does force you to test something that you will never ship, and the thing that you will ship won't test.

I know this part for sure. My last point, "what happens if you're bumping [compat] bounds?" is more of an attempt to peer into the crystal ball and anticipate whether there might be a problem. I am not sure one way or another.

timholy commented 4 years ago

What I'm after is to get away from feeling like package development is sometimes like this:

[image]

fredrikekre commented 4 years ago

In contrast, if they are 3 separate repos then I first submit PkgA and PkgB, get them registered, and then submit the change in BigPkg to have it depend on them.

But you can do that if they are in the same repo too. I guess I just didn't understand

Mind you, I want that simultaneity

Doesn't that suggest they should just be one package if they need to be released at the same time?

Then the build script runs, and by devving the sub-packages then things can work. The awkwardness is (1) this is not documented, and (2) it does force you to test something that you will never ship, and the thing that you will ship won't test.

The build script should not run Pkg operations. If you need to add (unregistered) packages before tests run you should just do that in the script part before Pkg.test.

More generally, I would think that you would have to set up a more sophisticated CI for a multi-package repo. For example, pre-registration you would probably want to test BigPkg with PkgA#master and PkgB#master, and after registration you probably still want to test BigPkg with PkgA#master and PkgB#master, but also with PkgA@release and PkgB@release.
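One way the pre-registration case could look as a CI script step, sketched under assumptions: BigPkg sits at the repo root with the sub-packages in PkgA/ and PkgB/, and the helpers `local_package_paths` and `test_from_checkout` are hypothetical names. Whether a single `Pkg.develop` call resolves the unregistered dependencies on every Julia version has not been verified here.

```julia
using Pkg

# Return the local paths to develop, sub-packages first, then the root
# package. Errors early if a sub-package directory has no Project.toml.
function local_package_paths(root::AbstractString, subdirs::Vector{String})
    paths = [joinpath(root, d) for d in subdirs]
    missing_dirs = filter(p -> !isfile(joinpath(p, "Project.toml")), paths)
    isempty(missing_dirs) || error("no Project.toml in: ", join(missing_dirs, ", "))
    return push!(paths, String(root))
end

# Develop every package from the working tree into a scratch
# environment, then run the tests of the package called `name`.
function test_from_checkout(root, subdirs, name)
    Pkg.activate(mktempdir())
    specs = [Pkg.PackageSpec(path = p) for p in local_package_paths(root, subdirs)]
    Pkg.develop(specs)
    Pkg.test(name)
end

# In CI this might be invoked as:
# test_from_checkout(pwd(), ["PkgA", "PkgB"], "BigPkg")
```

Because everything is developed by path, the Project.toml under test is the one that would be shipped; only the CI step changes once the sub-packages are registered.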

timholy commented 4 years ago

But you can do that if they are in the same repo too. I guess I just didn't understand

Yes, but there's no way to also get CI to pass. When you split apart the package, CI depends on Pkg being able to do its job in order to get through testing. Imagine something like this:

Original package definition:

module MyPkg

print_status(io::IO) = print(io, "MyPkg ", status())
status() = "works"

end

and imagine we have tests for that.

Modified package definition:

module MyPkg
using MyPkgPrinting
using MyPkgPrinting: print_status
using MyPkgStatus
using MyPkgStatus: status
end

and those two functions end up in the two subpackages. Now we also have to change the Project.toml file for MyPkg to

[deps]
MyPkgPrinting = "uuid1"
MyPkgStatus = "uuid2"
...

Make a commit, submit a PR, and watch CI fail because Pkg doesn't know how to install MyPkgPrinting and MyPkgStatus. The only way to

you can do that if they are in the same repo too

is to do some weird heroics where you both define MyPkgPrinting and MyPkgStatus but also leave the original MyPkg definition in place, then get those two sub-packages registered, and come back later and rewrite MyPkg to make use of the two subpackages. Not exactly a pretty way to get the job done: after all, when I make the split I want to actually test that my split works as expected, before I register the sub-packages. I can do that locally by manually devving each sub-package but that requires ugly workarounds that force you to choose between CI and pushing a version you'll actually ship when you submit the PR. (Currently the best answer is to do both. That is, use the deps/build.jl approach to get it successfully through CI, and then fix up the Project.toml file and delete deps/build.jl before merging. This will cause CI to fail, but hit merge anyway.)

Doesn't that suggest they should just be one package if they need to be released at the same time?

There are often technical reasons for a split. See, for example, https://github.com/timholy/SnoopCompile.jl/issues/95, where it was found that the "all one big package" design used by SnoopCompile was introducing major perturbations into the running session. The only way it could do its job correctly was to have a SnoopCompileCore that you load first to collect the data, and then you load the rest of the package to analyze the data. This is a case where the two are so strongly coupled that it makes enormous sense to co-develop them in the same repo, but my concern is that Pkg still needs more customization to make the CI & registration experience as good as it should be.

For example, pre-registration you would probably want to test BigPkg with PkgA#master and PkgB#master

But how does Pkg even install them? I guess you're saying, modify your .travis.yml or actions script to do it? And then delete those changes once registered? You're right that would work, and does get the job done more safely because the Julia part is the same as what you'll ship. Good suggestion.
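A minimal sketch of what such a CI step might look like, assuming the sub-packages sit in directories directly under the repository root. The directory names and the `ci_test` helper are hypothetical; only `Pkg.activate`, `Pkg.develop`, `Pkg.test`, and `PackageSpec(path=...)` are real Pkg calls.

```julia
using Pkg

# Hypothetical sub-package directories, relative to the repository root.
const SUBPACKAGES = ["MyPkgPrinting", "MyPkgStatus"]

# Paths of the in-repo sub-packages for a given checkout.
subpackage_paths(root, names=SUBPACKAGES) = [joinpath(root, n) for n in names]

# Develop the sub-packages from the checkout, then test the top-level package.
function ci_test(root=pwd())
    Pkg.activate(root)
    # One develop call, so the resolver sees all local versions together.
    Pkg.develop([PackageSpec(path=p) for p in subpackage_paths(root)])
    Pkg.test()
end
```

The CI script would then run something like `julia -e 'include("ci.jl"); ci_test()'` in place of the usual test step, and that line would be reverted once the sub-packages are registered.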

But...this still kinda sucks. Many people rely on PkgTemplates precisely so they don't have to master the intricate details of CI jobs. This seems like it will eventually become such a common workflow that I don't understand why we don't do something to make repo-with-subdirs different in some way from multiple repos. For example, if I could write

name = "SnoopCompile"
uuid = "aa65fe97-06da-5843-b5b1-d5d13cad87d2"
author = ["Tim Holy <tim.holy@gmail.com>"]
version = "1.6.0"

[deps]
Pkg = "44cfe95a-1eb2-52ea-b672-e2afdf69b78f"
Serialization = "9e88b42a-f829-5b0c-bbe9-9e923198166b"

[subdir-deps]
SnoopCompileAnalysis = "9ea4277c-da97-4c3a-afb0-537c066769de"
SnoopCompileBot = "1d5e0e55-7d74-4714-b8d8-efa80e938cf7"
SnoopCompileCore = "e2b509da-e806-4183-be48-004708413034"

[subdir-version]
all = "1.6.0"
compat = "1.6" # declare bounds on all of them, so they are locked together
...

then I never have to worry about any of this.

Here's how that would change my life: when I've made big changes somewhere in the JuliaImages ecosystem, here's how it works:

The problem is that when the stack is 15 or so repositories (which JuliaImages is), and each in the best-case scenario takes half an hour (15 minutes for submission, CI, merging and 15 minutes for tagging), this process can easily eat up the entire weekend. Worse, there are CI limits on the number of jobs that run under my account, and each merge-to-master re-triggers CI, so as the weekend progresses CI gets increasingly clogged. (GitHub actions will likely make that better---worst is appveyor---but we aren't done migrating yet.) I've definitely had changes that take longer than a weekend to get through this process.

And it all feels like 100% busy-work. I have better things to do with my time. And my main point is: we finally almost have a way to fix this. We just need to add a little bit more to make the subdir support even better than it is now, and all this pain goes away: I put all of JuliaImages in one big repo, and bam the gap between me getting things running locally and me getting the whole thing through CI drops to under an hour.

DilumAluthge commented 4 years ago

Yeah we need some tooling:

  1. Tool that automatically devs all packages in repo, and tests each package.
  2. Extend Registrator to let you register all packages in a repo simultaneously. Or even fancier: register a (not necessarily proper) subset of all packages in a repo
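The first item could be sketched roughly as follows. This is an illustration under assumptions, not an existing tool: `find_packages` and `dev_and_test_all` are hypothetical names, and the `TOML` standard library assumes Julia 1.6 or later.

```julia
using Pkg, TOML

# Collect `name => directory` for every package found under `root`.
function find_packages(root::AbstractString)
    found = Pair{String,String}[]
    for (dir, _, files) in walkdir(root)
        "Project.toml" in files || continue
        project = TOML.parsefile(joinpath(dir, "Project.toml"))
        # Skip test/docs environments, which carry no name or uuid.
        haskey(project, "name") && haskey(project, "uuid") || continue
        push!(found, project["name"] => dir)
    end
    return sort(found; by=first)
end

# Develop every package in the repo into a scratch environment, then test each.
function dev_and_test_all(root::AbstractString=pwd())
    packages = find_packages(root)
    Pkg.activate(mktempdir())
    Pkg.develop([PackageSpec(path=dir) for (_, dir) in packages])
    for (name, _) in packages
        Pkg.test(name)
    end
end
```

Developing everything in a single call lets the resolver use the in-repo versions of all packages at once, rather than looking for registered versions that may not exist yet.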

KristofferC commented 4 years ago

Test and register the bottom part of the dependencies. Then register the "top package"?

There are often technical reasons for a split.

The snoop compile use case is extremely rare though? Almost no other package interacts with the method table like that. Also, I don't think splitting packages is a good idea in general, so there doesn't need to be much tooling around it. Seeing a bunch of "split package X" PRs to the registry sounds pretty bad. As made clear by jll packages, Julia scales badly with the number of dependencies.

timholy commented 4 years ago

The snoop compile use case is extremely rare.

Right, but as I emphasized above I'm seeing applications well beyond SnoopCompile. In the old days, a common complaint was "Images.jl was too big of a dependency, please split it into smaller pieces." So now there are nearly a dozen sub-packages. Given Julia's latency issues, everyone seems a lot happier this way, including me most of the time, but it's frankly annoying as a maintainer who sometimes makes fundamental changes that propagate throughout the whole ecosystem. (I wonder if @ChrisRackauckas has ever felt like https://github.com/JuliaLang/Pkg.jl/issues/1251#issuecomment-646982999.)

For the common case, I'm not envisioning growing the number of packages; it's more that I'd like to recombine most of JuliaImages into a single git repo. Rumor has it that except for Android, Google puts all of their code in a single git repo. That's because they've found it makes their developers more efficient. I want that goodness but as @DilumAluthge says I think we need a bit more tooling.

ChrisRackauckas commented 4 years ago

(@v1.4) pkg> add Tracker#master Flux#master Zygote#master NNlib#master
   Updating git-repo `https://github.com/FluxML/Tracker.jl.git`
   Updating git-repo `https://github.com/FluxML/Flux.jl.git`
   Updating git-repo `https://github.com/FluxML/Zygote.jl.git`
   Updating git-repo `https://github.com/FluxML/NNlib.jl.git`
  Resolving package versions...
ERROR: Unsatisfiable requirements detected for package NNlib [872c559c]:
 NNlib [872c559c] log:
 ├─possible versions are: 0.7.0 or uninstalled
 ├─restricted to versions 0.6-0.7 by Tracker [9f7883ad], leaving only versions 0.7.0
 │ └─Tracker [9f7883ad] log:
 │   ├─possible versions are: 0.2.7 or uninstalled
 │   └─Tracker [9f7883ad] is fixed to version 0.2.7
 ├─restricted to versions 0.6 by Flux [587475ba] — no versions left
 │ └─Flux [587475ba] log:
 │   ├─possible versions are: 0.11.0 or uninstalled
 │   └─Flux [587475ba] is fixed to version 0.11.0-DEV
 └─NNlib [872c559c] is fixed to version 0.7.0

The FluxML ecosystem is incompatible with itself today and that brings down DiffEq, Pumas, Turing, etc. Until it's figured out, everyone who does ]up will get an update that makes Tracker fail at using, so https://github.com/JuliaLang/Pkg.jl/issues/1251#issuecomment-646982999 is appropriate.

KristofferC commented 4 years ago

@ChrisRackauckas, meant to post that somewhere else? I don't see how it is related.

timholy commented 4 years ago

@ChrisRackauckas, question was more, have you ever had the experience where you have a zoo of pkgs that you've got working locally but then spend the next two days getting all the PRs submitted, through CI, merged, and registered? When really it should all be just one giant PR that goes through CI once?

KristofferC commented 4 years ago

it's more that I'd like to recombine most of JuliaImages into a single git repo

A script to do that doesn't seem too difficult to make. I'm not sure what the package manager needs to do about it though.

timholy commented 4 years ago

What exactly do you mean? A script to submit, auto-merge-if-passes-CI, & submit registration requests for a long queue of interlocked PRs? Or do you mean a script to combine packages into one repo? I wouldn't know how to do the former, and the latter doesn't need a script.

ChrisRackauckas commented 4 years ago

have you ever had the experience where you have a zoo of pkgs that you've got working locally but then spend the next two days getting all the PRs submitted, through CI, merged, and registered? When really it should all be just one giant PR that goes through CI once?

Yes, the current state of Flux+Adapt+Zygote+Tracker should've only been tagged together, and it's right now in a limbo state of going through PRs one by one and until then everything downstream is left with an installation that fails at using because of the random solution to dependencies that Tracker chose (it drops down to a version that isn't compatible with v1.3). So yes, there is a very current example where such a feature should've been used to stop multiple major Julia package ecosystems from shutting down for... 2 days? Your time estimate is about right, since the tags will go through to make it work probably by tomorrow, but it seems weird to have such an issue (that you know is going to happen, everyone knows this is going to happen) but due to time... put in things one by one by one by one and just tell people not to update until it's all done. 🤯

I want that goodness but as @DilumAluthge says I think we need a bit more tooling.

Indeed, there aren't enough features to put things in one repo. Some people have suggested it for DiffEq before, but there was a clear lack of understanding on their part of what that actually means. Right now, it's just infeasible. To do a single repo, we would need:

The main issue usually comes from small dependencies that everyone uses deciding it's a good idea to make a breaking change, for example Requires v1.0. I think we should informally agree that libraries like Requires never get a breaking update, and that a breaking update instead becomes a new package, Requires1. Otherwise (hypothetical example) having half of the world requiring Requires 1.0 and the other half requiring Requires <1.0 leaves everything in a delicate dance until 200 separate packages all decide it's a good idea to move. So if anything we need more repos...

The problem is that when the stack is 15 or so repositories (which JuliaImages is), and each in the best-case scenario takes half an hour (15 minutes for submission, CI, merging and 15 minutes for tagging), this process can easily eat up the entire weekend. Worse, there are CI limits on the number of jobs that run under my account, and each merge-to-master re-triggers CI, so as the weekend progresses CI gets increasingly clogged. (GitHub actions will likely make that better---worst is appveyor---but we aren't done migrating yet.) I've definitely had changes that take longer than a weekend to get through this process.

The correct thing to do is to throw away good developer practice in favor of a working one. For example, with Flux madness I know that DiffEqFlux and NeuralNetDiffEq won't be able to actually get any of the updates until DiffEqSensitivity does, so I already merged and tagged all of the compat updates just so that it's easily possible to force DiffEqSensitivity to give me the right versions, which fixes CI and gives people the easy ability to dev/add to a version for testing against Flux master. So usually you can bottleneck things down to the real issue pretty quickly, and merge the other 14 updates immediately, and now CI can work on the one package you need it to. That said, a system that forces you to follow clearly bad developer practice if you want CI, add Package#master, etc. to act sanely isn't a great solution.

What's sad is that all of these situations aren't theoretical. So yes, in conclusion, if we could just upgrade more things in sync by throwing them all into a single PR, that would at least help a lot, but there's still more to change.

KristofferC commented 4 years ago

A script that modifies the registry to update the urls and adds a subdir entry for them. I don't really know what you mean about CI in the case where you just want to merge existing packages into a single repo. What CI would need to be run? Do you mean the registry CI?

Also, changing the URL in the registry will likely break all existing versions on Julia versions that predate PkgServer (since the tree hash is not available in the new repo). I think we can give the repo as a vector, so that might work? Cc @fredrikekre

timholy commented 4 years ago

Reduced latency times. I generally don't use DifferentialEquations because it has too much latency, too many dependencies

To clarify, this issue was a tracker for various sub-issues and PRs. The Pkg community has done something really cool: decouple git repo from package. Proof-of-principle is in https://github.com/timholy/SnoopCompile.jl. Once https://github.com/JuliaRegistries/General/pull/16663 merges, you'll be able to say using SnoopCompileCore without loading SnoopCompile even though the SnoopCompileCore package is in the same git repo.

This solves the latency times issue. The question is, what else (if anything) needs to be done to leverage the ability to shove everything into one repo to make development easier.
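For an unregistered sub-package, installing straight from the repository might look like the sketch below. The `add_from_subdir` wrapper is a hypothetical helper; the `subdir` keyword of `PackageSpec` is assumed to be available (Julia 1.5 or later), and the subdirectory name in the example call is an assumption about the repository layout.

```julia
using Pkg

# Install one package that lives in a subdirectory of a git repository.
# `subdir` is the path from the repository root to the package's Project.toml.
function add_from_subdir(url::AbstractString, subdir::AbstractString)
    Pkg.add(PackageSpec(url=url, subdir=subdir))
end

# Example call (needs network access; the subdir value is assumed, not verified):
# add_from_subdir("https://github.com/timholy/SnoopCompile.jl", "SnoopCompileCore")
```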

KristofferC commented 4 years ago

Yes, the current state of Flux+Adapt+Zygote+Tracker should've only been tagged together, and it's right now in a limbo state of going through PRs one by one and until then everything downstream is left with an installation that fails at using because of the random solution to dependencies that Tracker chose (it drops down to a version that isn't compatible with v1.3).

Why would an AD library have to be tagged in lockstep with Flux? And why would stuff start to fail if only one of these is tagged? This just looks like bad developer practice and has nothing to do with Pkg. Define a proper API and expose it and don't break stuff all the time?

ChrisRackauckas commented 4 years ago

This solves the latency times issue. The question is, what else (if anything) needs to be done to leverage the ability to shove everything into one repo to make development easier.

How do the CI triggers work? And are the dependencies per sub-package?

timholy commented 4 years ago

A script that modifies the registry to update the urls and adds a subdir entry for them.

LocalRegistry made https://github.com/JuliaRegistries/General/pull/16663 easy. It might be ideal to give Registrator the ability to register multiple subdirs, but this is the least of my worries.

I don't really know what you mean about CI, in the case where you just want to merge existing packages into a single repo

While I'd certainly like to make the act of splitting or merging seamless, my biggest concern is whether we've really thought through the ongoing development cycle. As I pointed out in https://github.com/JuliaLang/Pkg.jl/issues/1251#issuecomment-646552688, suppose BigPkg and all of its sub-package dependencies are at v1.8.3. Now suppose I make a breaking change and move everyone to v2.0.0. OK, so the overarching BigPkg Project.toml file (the one that Pkg first looks at) shows this:

version = "2.0.0"

[deps]
SubPkgA = "uuidA"
SubPkgB = "uuidB"

[compat]
SubPkgA = "2"
SubPkgB = "2"

When I submit this for CI, I'm reasonably sure Pkg is going to barf, because it has no idea where it will get v2 of SubPkgA from (the most recently-registered version is 1.8.3). Of course it's being defined in this same PR (SubPkgA/Project.toml also moved to version = "2.0.0" in the same PR), but I think Pkg will throw an error before it gets to the point of realizing that.

EDIT: I don't think I even needed to go to 2.0.0, I think the same thing would happen if some whiz-bang new (but non-breaking) feature were added to SubPkgA and so everyone bumps to 1.9.0.
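The version arithmetic behind both cases can be sketched in plain Julia. This is a simplified model of caret-style compat entries, restricted to lower bounds with major version 1 or higher; `satisfies_compat` is a hypothetical helper, not how Pkg's resolver is implemented.

```julia
# Caret-style compat check for a lower bound with major version >= 1:
# an entry like "2" or "1.9" admits versions in [lower, next major).
function satisfies_compat(v::VersionNumber, lower::VersionNumber)
    lower.major >= 1 || throw(ArgumentError("this sketch only handles major >= 1"))
    upper = VersionNumber(lower.major + 1)
    return lower <= v < upper
end

registered = v"1.8.3"                   # latest registered SubPkgA in the example

satisfies_compat(registered, v"2")      # false: compat "2" excludes 1.8.3
satisfies_compat(v"2.0.0", v"2")        # true: only the unregistered version fits
satisfies_compat(registered, v"1.9")    # false: a 1.9 bump fails the same way
```

In both cases the only version that satisfies the new compat entry is the one defined in the PR itself, which the registry does not know about yet.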