Closed jcspencer closed 9 years ago
We already support this through the environment variables HEX_URL and HEX_CDN.
In the future I want more sophisticated support for multiple registries so that you can use both the main hex.pm and your internal company registry at the same time. But that requires some major changes internally and I haven't decided yet how it should work. It is still quite low on the priority list because no one has said yet that they will use this feature.
I'm sure that people will ask for it in the future so I definitely want to support it and I'm thinking about the best solution.
I think adding multiple registries so that you can use both the main hex.pm and your internal company registry at the same time would be fairly simple.
One way it could be tackled is by setting a master and sub repositories. This would work in a way that we have a tree where each repository is at a different layer in the tree. This would mean that if you had work higher in the tree than public, the client would check with the work repository for a package before public. So if your work repo had a private fork of, say, cowboy, it would pull your fork before the public version due to the layout of the tree. This would allow for virtually infinite tree levels.
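A minimal sketch of that lookup order, assuming each repository can simply be asked which package names it hosts. The RepoLookup module and the in-memory package sets are hypothetical and only illustrate the priority rule; this is not how the Hex client works.

```elixir
defmodule RepoLookup do
  # Hypothetical sketch: `repos` is a keyword list ordered from highest to
  # lowest priority; each value is the set of package names that repo hosts.
  def resolve(repos, package) do
    Enum.find_value(repos, :error, fn {name, packages} ->
      # Returns nil (falsy) when the repo lacks the package, so the search
      # continues with the next repo in the list.
      if MapSet.member?(packages, package), do: {:ok, name}
    end)
  end
end

repos = [
  work: MapSet.new(["cowboy", "internal_lib"]),
  public: MapSet.new(["cowboy", "ecto"])
]

IO.inspect(RepoLookup.resolve(repos, "cowboy")) # {:ok, :work}
IO.inspect(RepoLookup.resolve(repos, "ecto"))   # {:ok, :public}
```

Because work is listed first, its fork of cowboy wins, while ecto falls through to public.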
It would also be wise to add an option to explicitly set the repo to pull from using an atom in the Mix deps, possibly like so:
{:conform, "~> 0.10.0", repo: :work}
Any comments or thoughts?
Yep, that's a possible solution.
We would also need to write the registry URL to the mix.lock file to ensure repeatable builds. But what if the registry is private? Is there a chance that we leak private information if someone accidentally shares their mix.lock file? Maybe it is fine if we just add support for authentication to retrieve the private registry?
I think that using a package outside of the main hex.pm registry should always require repo: :whatever instead of having the tree of registries.
I think we should highly recommend locking servers run privately, but ultimately it shouldn't force locking, in the case of mirrors, etc.
Adding to the mix.lock is a great idea, I'm surprised I overlooked it.
Would we add the option to define repository {cdn, api} tuples in the mix.exs?
This could be a security issue if the Mix file is leaked in a private environment, but in the future, if someone is just using a mirror, this would be fine.
Also, I think it would be an idea to define the default repository in the Mix file. In this case,
{:conform, "~> 0.10.0", repo: :hex}
Although I can't personally see a use for it, being able to disable the use of the public :hex repository would be useful in organizations where only internal packages can be used.
If a repo is not defined on the system, we simply throw an error and prematurely exit Mix.
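A rough sketch of that failure mode, assuming repo aliases live in a simple map from atom to URL. The RepoAlias module, the map shape, and the URL are all made up for illustration; nothing here is an existing Hex or Mix API.

```elixir
defmodule RepoAlias do
  # Hypothetical sketch: `aliases` maps a repo name (atom) to its URL.
  # An unknown name aborts immediately instead of silently falling back.
  def fetch!(aliases, name) do
    case Map.fetch(aliases, name) do
      {:ok, url} ->
        url

      :error ->
        raise ArgumentError,
              "repo #{inspect(name)} is not defined on this system"
    end
  end
end

# Assumed example alias table.
aliases = %{work: "https://hex.company.com"}

IO.inspect(RepoAlias.fetch!(aliases, :work)) # "https://hex.company.com"
```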
And now that I think about it, if we eventually add the ability to lock the repo, we could use several methods to do this authentication. Be it stored in the Mix file, the ~/.hex/endpoints.config file, or simply prompted for on installation.
There might be an issue when you have defined a dependency as {:ecto, "~> 0.1.0", repo: :other} and someone else is using your project and has another repo named :other in their hex config. Maybe we should just go with a straight URL in mix.exs.
We need to look at how other package managers, such as npm and rubygems, support multiple repositories.
The only support NPM has for multiple repositories is to run npm --registry http://registry.npmjs.eu/ install express or to run npm set registry http://abc.xyz where http://abc.xyz is a CouchDB replicated from NPM.
Maven uses the following format in its pom.xml:
<project>
  ...
  <repositories>
    <repository>
      <id>my-repo1</id>
      <name>your custom repo</name>
      <url>http://jarsm2.dyndns.dk</url>
    </repository>
    <repository>
      <id>my-repo2</id>
      <name>your custom repo</name>
      <url>http://jarsm2.dyndns.dk</url>
    </repository>
  </repositories>
  ...
</project>
PyPI uses this format:
Single Usage: pip install -i http://<mirror>/simple <package>
Global settings: Add ~/.pip/pip.conf that includes:
[global]
index-url = http://<mirror>/simple
RubyGems allows users to set the following in the Gemfile:
source "http://your.servers.ip:9292"
NuGet has the best support I've seen so far:
To add a repo, we run nuget sources Add -Name Artifactory -Source http://localhost:8081/artifactory/api/nuget/<repository key>. After this is run, the source is added to a list, and packages are pulled from these sources. I believe it is expected that packages have unique identifiers, but I'm not 100% sure.
That's all the examples I could find, and other than NuGet and Maven, most of them don't actually do the same as this proposal, as they only allow one mirror/repo at a time!
Possibly we could have global repos that are overridden in the Mixfile? Or if we're on a system where we're distributing to brand new clients that don't have the alias, we could substitute a named atom alias for a tuple with the API URL and the CDN URL.
Thanks for the great compilation of the package managers :heart:.
So it seems like our proposal isn't too crazy. I would go with just a literal repo in the mixfile. Example: {:ecto, "0.1.2", repo: "http://path/to/registry.ets.gz"}.
To publish and create accounts and so on you can do HEX_URL=... mix hex.publish. We can tag the accounts by URL in hex.config so you can have multiple accounts for different servers.
I also think the ability to globally assign registries to atoms would be nice, such as :es or :eu.
Very nice! If we are allowing atoms, I believe they should be a key in the project config instead of something global. But we probably don't need atoms from the start. I agree we likely need to store them in the lock file too.
Btw, this is relevant to this discussion: https://groups.google.com/forum/#!topic/ruby-security-ann/Rms5sZhLxdo (i.e. how to handle ambiguity if we go with something like Maven or Bundler).
Interesting, thanks for the link @josevalim.
Atoms are unnecessary if we are going to store it in the mixfile anyway:
@company_repo "http://company.com/registry.ets.gz"
# deps/0 must return a list of dependency tuples
defp deps, do: [{:ecto, "0.1.2", repo: @company_repo}]
Will this registry.ets.gz be pulled down on each build, or cached and only updated when changed?
Like today, it will be cached and only updated when changed.
I have a couple of queries about this:
With repo:, should we force users to set it on each package to specifically request where packages are pulled from?
Could we set the repo: field of packages in a block, such as
defp deps do
  repo @company_repo do
    [{:ecto, "~> 0.3.0"},
     {:plug, github: "elixir-lang/plug"}]
  end
end
which would set the repo: field for each package in the set to @company_repo?
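The same effect could be approximated today with a plain helper function rather than new syntax. This is only a sketch: DepsHelper.with_repo/2 is a hypothetical name, and it assumes that only version-based (Hex) dependencies should receive the option, leaving dependencies like the github: one untouched.

```elixir
defmodule DepsHelper do
  # Hypothetical helper: adds `repo: repo` to every version-based dependency
  # that does not already set it. Other shapes (e.g. {:plug, github: "..."})
  # are returned unchanged.
  def with_repo(deps, repo) do
    Enum.map(deps, fn
      {name, req} when is_binary(req) ->
        {name, req, [repo: repo]}

      {name, req, opts} when is_binary(req) and is_list(opts) ->
        # put_new keeps an explicit per-package :repo if one was given.
        {name, req, Keyword.put_new(opts, :repo, repo)}

      other ->
        other
    end)
  end
end

company_repo = "http://company.com/registry.ets.gz"

deps =
  DepsHelper.with_repo(
    [{:ecto, "~> 0.3.0"}, {:plug, github: "elixir-lang/plug"}],
    company_repo
  )

IO.inspect(deps)
```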
After some further thought and consideration, I've come up with this proposal. Feedback would be appreciated. //cc @ericmj @josevalim
To pull a specific package, we can set the repo: field with a tuple that contains two fields:
@store "https://cdn.hex.company.com/"
@api "https://hex.company.com/api/"
@company_repo {@store, @api}
# note: we need the base url of the store (not the registry.ets.gz) due to the fact that
# the registry doesn't contain the package url
# This can be used like so:
defp deps, do: [{:ecto, "0.1.2", repo: @company_repo}]
Once one or more packages have their repo: field set, all packages must explicitly define their repo:. (Or else, we just warn the user that not all packages have their repo: field set, as it can get confusing having some dependencies marked with a repo: field and others without.)
Finally, it would be a good idea to check that the supplied CDN and API URLs are accessible prior to starting dependency resolution, as an unreachable endpoint may cause issues when installing dependencies.
The only thing I'm not sure about is if we should allow packages to be pushed to the public repo that include packages from private repositories?
It is fine to always fall back to hex.pm when :repo is not set. I don't think there is any need for the API URL there. You are not allowed to have packages with dependencies from another repository.
Come to think of it, we won't need the API URL to be included. Should we just throw an error and kill the upload if a user attempts to upload a package that has dependencies from another repository?
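A sketch of what such a pre-publish check might look like, assuming dependencies use the usual Mix tuple shapes and that any :repo option marks a dependency as coming from another repository. PublishCheck is a hypothetical module, not part of Hex.

```elixir
defmodule PublishCheck do
  # Hypothetical check: returns the dependencies that set a :repo option.
  def foreign_deps(deps) do
    Enum.filter(deps, fn
      {_name, req, opts} when is_binary(req) -> Keyword.has_key?(opts, :repo)
      {_name, opts} when is_list(opts) -> Keyword.has_key?(opts, :repo)
      _other -> false
    end)
  end

  # Raises, aborting the upload, when any dependency comes from another repo.
  def ensure_publishable!(deps) do
    case foreign_deps(deps) do
      [] ->
        :ok

      foreign ->
        names = Enum.map(foreign, &elem(&1, 0))
        raise ArgumentError,
              "cannot publish: dependencies from another repository: #{inspect(names)}"
    end
  end
end

IO.inspect(PublishCheck.ensure_publishable!([{:ecto, "0.1.2"}])) # :ok
```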
Just to add what I have discussed with @ericmj. Registries must be self-contained: i.e. a registry can only depend on packages it contains. This may be too restrictive at first but it is always a good way to start with the most restrictive design and then expand on it.
That said, once we specify the :repo option for a dependency, all dependencies of that dependency must come from the same :repo. We should also store the :repo in the lock file and use it when checking if two hex packages are equal. I.e. if two hex packages come from different repos, they should diverge and the user then needs to resolve the divergence in their mix.exs file.
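To make the equality rule concrete, here is a small sketch. The {:hex, name, version, repo} tuple is an assumed shape for illustration only and is not the actual mix.lock format; LockCheck is likewise a hypothetical module.

```elixir
defmodule LockCheck do
  # Two entries are the same package only if name, version AND repo all match.
  # Reusing the variable names in both arguments makes the function head
  # match only when the corresponding values are equal.
  def same_package?({:hex, name, version, repo}, {:hex, name, version, repo}), do: true
  def same_package?(_a, _b), do: false
end

from_hex = {:hex, :ecto, "0.1.2", "https://hex.pm"}
from_work = {:hex, :ecto, "0.1.2", "http://company.com/registry.ets.gz"}

IO.inspect(LockCheck.same_package?(from_hex, from_hex))  # true
IO.inspect(LockCheck.same_package?(from_hex, from_work)) # false
```

The second call reports a divergence even though name and version agree, which is the case the user would then have to resolve in mix.exs.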
Is this supported now, or being considered for the future? No private repository support is an issue in corporate environments where you have lots of internal software releases.
I am working on support for private packages in hex.pm. You can use private repositories right now, but you have to mirror hex.pm if you want to use packages from hex.pm alongside internal packages.
I am interested in supporting multiple repositories but it's not currently on the roadmap.
That said, once we specify the :repo option for a dependency, all dependencies of that dependency must come from the same :repo.
Why must this be the case? If I'm building a private package that depends on Ecto, for example, this would mean I have to duplicate Ecto into my registry and ensure I update it as Ecto updates.
That comment was from over two years ago :). The :repo option is not live yet but when it is you can use it to set where the dependency should be sourced from. The only restriction is for the hex.pm repository: for it you can only depend on other packages from hex.pm.
@ericmj Where can I track the status of the :repo option? Definitely would like to know when that goes live. Thanks!
Currently, Hex only works with one API/CDN endpoint. This is fine for public users, but organizations running Hex on their own infrastructure have to edit the endpoint manually, which is a pain.
My proposal is to add a ~/.hex/endpoints.config file in the format:
This could be toggled using mix hex.endpoint use work or mix hex.endpoint switch for a prompt input.
Adding this would allow users to quickly switch between registries and CDNs, allowing them to install packages hosted on privately run Hex instances.
This could also be used for mirrors of the S3 bucket (eg. EU or AU).
Thoughts? :+1: or :-1:?