hexpm / hex

Package manager for the Erlang ecosystem.
https://hex.pm

Add ability to use multiple API endpoints. #46

Closed jcspencer closed 9 years ago

jcspencer commented 10 years ago

Currently, Hex only works with one API/CDN endpoint. This is fine for public users, but organizations running Hex on their own infrastructure have to edit the endpoint manually, which is a pain.

My proposal is to add a ~/.hex/endpoints.config file in the format:

[default: {"cdn_url", "api_url"}, work: {"cdn_url", "api_url"}]

This could be toggled using mix hex.endpoint use work or mix hex.endpoint switch for a prompt input.

Adding this would allow users to quickly switch between registries and CDNs, allowing them to install packages hosted on privately run Hex instances.

This could also be used for mirrors of the S3 bucket (e.g. EU or AU).

Thoughts? :+1: or :-1:?

ericmj commented 10 years ago

We already support this through the environment variables HEX_URL and HEX_CDN.

In the future I want more sophisticated support for multiple registries so that you can use both the main hex.pm and your internal company registry at the same time. But that requires some major changes internally and I haven't decided yet how it should work. It is still quite low on the priority list because no one has said yet that they will use this feature.

I'm sure that people will ask for it in the future so I definitely want to support it and I'm thinking about the best solution.

jcspencer commented 10 years ago

I think adding multiple registries so that you can use both the main hex.pm and your internal company registry at the same time would be fairly simple.

One way to tackle it is to set a master and sub-repositories, arranged as a tree where each repository sits at a different level. If work sits higher in the tree than public, the client checks the work repository for a package before public. So if your work repo had a private fork of, say, cowboy, the client would pull your fork before the public version. This would allow for virtually unlimited tree levels.
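A minimal sketch of that lookup order, assuming the repositories are kept as a keyword list ordered from highest to lowest priority. The `RepoLookup` module, its `find_repo/2` function, and the package sets are hypothetical and only illustrate the idea; none of this is Hex's actual API.

```elixir
defmodule RepoLookup do
  # Hypothetical helper, not part of Hex. `repos` is a keyword list ordered
  # from highest to lowest priority; each value is the set of package names
  # that repository hosts.
  def find_repo(repos, package) do
    # Return the name of the first repository that hosts the package,
    # or :not_found if none of them does.
    Enum.find_value(repos, :not_found, fn {name, packages} ->
      if MapSet.member?(packages, package), do: name
    end)
  end
end

repos = [
  work: MapSet.new(["cowboy", "internal_lib"]),
  public: MapSet.new(["cowboy", "ecto"])
]

IO.inspect(RepoLookup.find_repo(repos, "cowboy"))  # => :work (the private fork wins)
IO.inspect(RepoLookup.find_repo(repos, "ecto"))    # => :public
IO.inspect(RepoLookup.find_repo(repos, "missing")) # => :not_found
```

Because the list order alone decides which repository wins, a private fork silently shadows the public package of the same name.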

It would also be wise to add an option to explicitly set the repo to pull from using an atom in the Mix deps, possibly like so:

{:conform, "~> 0.10.0", repo: :work}

Any comments or thoughts?

ericmj commented 10 years ago

Yep, that's a possible solution.

We would also need to write the registry url to the mix.lock file to ensure repeatable builds. But what if the registry is private, is there a chance that we can leak private information if someone accidentally shares their mix.lock file? Maybe it is fine if we just add support for authentication to retrieve the private registry?

I think that using a package outside of the main hex.pm registry should always require repo: :whatever instead of having the tree of registries.

jcspencer commented 10 years ago

I think we should strongly recommend locking privately run servers, but ultimately we shouldn't force locking, for example in the case of mirrors.

Adding to the mix.lock is a great idea, I'm surprised I overlooked it.

Would we add the option to define repository {cdn, api}'s in the mix.exs? This could be a security issue if the Mix file is leaked in a private environment, but in the future, if someone is just using a mirror, this would be fine.

jcspencer commented 10 years ago

Also, I think it would be a good idea to be able to name the default repository in the Mix file. In this case,

{:conform, "~> 0.10.0", repo: :hex}

Although I can't personally see a use for it, being able to disable the use of the public :hex repository would be useful in organizations where only internal packages can be used.

If a repo is not defined on the system, we simply throw an error and prematurely exit Mix.

And now that I think about it, if we eventually add the ability to lock the repo, we could use several methods to do this authentication. Be it stored in the Mix file, the ~/.hex/endpoints.config file, or simply prompted for on installation.

ericmj commented 10 years ago

There might be an issue when you have defined a dependency as {:ecto, "~> 0.1.0", repo: :other} and someone else is using your project and has another repo named :other in their hex config. Maybe we should just go with a straight URL in mix.exs.

We need to look at how other package managers, such as npm and rubygems, support multiple repositories.

jcspencer commented 10 years ago

NPM

The only support NPM has for multiple repositories is to run npm --registry http://registry.npmjs.eu/ install express or to run npm set registry http://abc.xyz where http://abc.xyz is a CouchDB replicated from NPM.

Maven

Maven uses the following format in its pom.xml:

<project>
...
  <repositories>
    <repository>
      <id>my-repo1</id>
      <name>your custom repo</name>
      <url>http://jarsm2.dyndns.dk</url>
    </repository>
    <repository>
      <id>my-repo2</id>
      <name>your custom repo</name>
      <url>http://jarsm2.dyndns.dk</url>
    </repository>
  </repositories>
...
</project>

PyPI

PyPI uses this format:

Single Usage: pip install -i http://<mirror>/simple <package>

Global settings: Add ~/.pip/pip.conf that includes:

[global]
index-url = http://<mirror>/simple

RubyGems

RubyGems allows users to set the following in the Gemfile:

source "http://your.servers.ip:9292"

NuGet

NuGet has the best support I've seen so far:

To add a repo, we run nuget sources Add -Name Artifactory -Source http://localhost:8081/artifactory/api/nuget/<repository key>. After this is run, the source is added to a list, and packages are pulled from these sources. I believe it is expected that packages have unique identifiers, but I'm not 100% sure.

Conclusion

That's all the examples I could find, and other than NuGet and Maven, most of them don't actually do the same as this proposal, as they only allow one mirror/repo at a time!

jcspencer commented 10 years ago

Possibly we could have global repos that are overridden in the Mixfile? Or, if we're distributing to brand new clients that don't have the alias defined, we could substitute a tuple with the API URL and the CDN URL for the named atom alias.

ericmj commented 10 years ago

Thanks for the great compilation of the package managers :heart:.

So it seems like our proposal isn't too crazy. I would go with just a literal repo in the mixfile. Example: {:ecto, "0.1.2", repo: "http://path/to/registry.ets.gz"}.

To publish and create accounts and so on you can do HEX_URL=... mix hex.publish. We can tag the accounts by url in hex.config so you can have multiple accounts for different servers.

jcspencer commented 10 years ago

I also think the ability to globally assign registries to atoms would be nice, such as :es or :eu

josevalim commented 10 years ago

Very nice! If we are allowing atoms, I believe they should be a key in the project config instead of something global. But we probably don't need atoms from the start. I agree we likely need to store them in the lock file too.

josevalim commented 10 years ago

Btw, this is relevant to this discussion: https://groups.google.com/forum/#!topic/ruby-security-ann/Rms5sZhLxdo (i.e. how to handle ambiguity if we go with something like Maven or Bundler).

ericmj commented 10 years ago

Interesting, thanks for the link @josevalim.

Atoms are unnecessary if we are going to store it in the mixfile anyway:

@company_repo "http://company.com/registry.ets.gz"
defp deps, do: [{:ecto, "0.1.2", repo: @company_repo}]

jcspencer commented 10 years ago

Will this registry.ets.gz be pulled down on each build, or cached and only updated when changed?

ericmj commented 10 years ago

Like today, it will be cached and only updated when changed.

jcspencer commented 10 years ago

I have a couple of questions about this:

  1. Just for clarification, will this be a global cache or a per-project cache?
  2. If one or more packages sets repo:, should we force users to set it on each package to specifically request where packages are pulled from?
  3. Looking at the link @josevalim sent, could we implement a DSL to automatically set the repo: field of packages in a block, such as
  defp deps do
    repo @company_repo do
        [{:ecto, "~> 0.3.0"},
         {:plug, github: "elixir-lang/plug"}]
    end
  end

which would set the repo: field for each package in the set to @company_repo

josevalim commented 10 years ago

  1. The registries are global.
  2. I think it is fine to assume the default for a package is the Hex default (so the answer is no).
  3. No DSL for now (we can discuss it if it ever becomes an issue).

jcspencer commented 10 years ago

After some further thought and consideration, I've come up with this proposal. Feedback would be appreciated. //cc @ericmj @josevalim

To pull a specific package, we can set the repo: field with a tuple that contains two fields:

@store "https://cdn.hex.company.com/"
@api "https://hex.company.com/api/"
@company_repo {@store, @api}

# note: we need the base url of the store (not the registry.ets.gz) due to the fact that
# the registry doesn't contain the package url

# This can be used like so:
defp deps, do: [{:ecto, "0.1.2", repo: @company_repo}]

Once one or more packages have their repo: field set, all packages must explicitly define their repo:. (Or else, we just warn the user not all packages have their repo: field set, as it can get confusing having dependencies that are marked with a repo: field and others without).
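The warning variant could be sketched roughly as follows. This assumes every dependency is written as a {name, requirement, opts} tuple; the `RepoLint` module and its `missing_repo/1` function are hypothetical names used for illustration and do not exist in Hex or Mix.

```elixir
defmodule RepoLint do
  # Hypothetical helper, not part of Hex or Mix. Assumes every dep is a
  # {name, requirement, opts} tuple where opts is a keyword list.
  # Returns the names of deps that lack :repo, but only when at least one
  # other dep sets it; otherwise there is nothing to warn about.
  def missing_repo(deps) do
    {with_repo, without_repo} =
      Enum.split_with(deps, fn {_name, _req, opts} -> Keyword.has_key?(opts, :repo) end)

    case with_repo do
      [] -> []
      _ -> Enum.map(without_repo, fn {name, _req, _opts} -> name end)
    end
  end
end

deps = [
  {:ecto, "0.1.2", repo: {"https://cdn.hex.company.com/", "https://hex.company.com/api/"}},
  {:plug, "~> 0.5.0", []}
]

for name <- RepoLint.missing_repo(deps) do
  IO.puts("warning: #{name} has no repo: option")
end
```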

Finally, it would be a good idea to check that the supplied CDN and API URLs are accessible prior to starting dependency resolution, as unreachable endpoints may cause issues when installing dependencies.

The only thing I'm not sure about is if we should allow packages to be pushed to the public repo that include packages from private repositories?

ericmj commented 10 years ago

It is fine to always fall back to hex.pm when :repo is not set. I don't think there is any need for the API URL there. You are not allowed to have packages with dependencies from another repository.

jcspencer commented 10 years ago

Come to think of it, we won't need the API URL to be included. Should we just throw an error and kill the upload if a user attempts to upload a package that has dependencies from another repository?

josevalim commented 9 years ago

Just to add what I have discussed with @ericmj. Registries must be self-contained: i.e. a registry can only depend on packages it contains. This may be too restrictive at first but it is always a good way to start with the most restrictive design and then expand on it.

That said, once we specify the :repo option for a dependency, all dependencies of that dependency must come from the same :repo. We should also store the :repo in the lock file and use it when checking if two hex packages are equal. I.e. if two hex packages come from different repos, they should diverge and the user then needs to resolve the divergence in their mix.exs file.
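A rough sketch of that equality rule, assuming dependencies are {name, requirement, opts} tuples and that a missing :repo means the default registry. The `RepoCheck` module and the "https://hex.pm" default value are placeholders for illustration, not Hex internals.

```elixir
defmodule RepoCheck do
  # Placeholder for "the default registry"; the real identifier is an assumption.
  @default_repo "https://hex.pm"

  # Hypothetical dependency shape: {name, requirement, opts}.
  def repo({_name, _req, opts}), do: Keyword.get(opts, :repo, @default_repo)

  # Two requests for the same package diverge when they name different repos.
  def diverged?({name, _, _} = a, {name, _, _} = b), do: repo(a) != repo(b)
  # Requests for different packages never diverge on repo alone.
  def diverged?(_a, _b), do: false
end

company = {:ecto, "~> 0.1.0", repo: "http://company.com/registry.ets.gz"}
default = {:ecto, "~> 0.1.0", []}

IO.inspect(RepoCheck.diverged?(company, default)) # => true
IO.inspect(RepoCheck.diverged?(company, company)) # => false
```

When the check returns true, the user would have to pick one source in their mix.exs, as described above.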

lucacorti commented 8 years ago

Is this supported now, or being considered for the future? No private repository support is an issue in corporate environments where you have lots of internal software releases.

ericmj commented 8 years ago

I am working on support for private packages in hex.pm. You can use private repositories right now, but you also have to mirror hex.pm if you want to use packages from hex.pm as well as internal packages.

I am interested in supporting multiple repositories but it's not currently on the roadmap.

colinrymer commented 7 years ago

That said, once we specify the :repo option for a dependency, all dependencies of that dependency must come from the same :repo.

Why must this be the case? If I'm building a private package that depends on Ecto, for example, this would mean I have to duplicate Ecto into my registry and ensure I update it as Ecto updates.

ericmj commented 7 years ago

Why must this be the case? If I'm building a private package that depends on Ecto, for example, this would mean I have to duplicate Ecto into my registry and ensure I update it as Ecto updates.

That comment was from over two years ago :). The :repo option is not live yet but when it is you can use it to set where the dependency should be sourced from. The only restriction is for the hex.pm repository, for it you can only depend on other packages from hex.pm.

mjonas87 commented 6 years ago

@ericmj Where can I track the status of the :repo option? Definitely would like to know when that goes live. Thanks!

mjonas87 commented 6 years ago

Nvm...found it!

https://github.com/hexpm/hex/commit/b05096efee9783b6c6682244e18823dfbff81c9a
https://github.com/hexpm/hex/releases/tag/v0.17.1