arcanis opened 5 years ago
For some extra context, how we handled this with CocoaPods was to extend the equivalent of the `package.json` with a new attribute for `source`:

```json
{
  "sources": [
    "gh",
    "npm"
  ],
  "dependencies": {
    "danger/danger-js": "^1.2.3", // From GH
    "metro": "^3.2.1" // From NPM
  }
}
```
Where the order is important. A non-existent `"sources"` field is just `["npm"]`, making the different behavior opt-in.
CocoaPods has different constraints (all lib definitions are fs accesses, so lookups are cheap), but maybe it could spark an idea.
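The ordered-sources lookup described above could be sketched roughly like this (a minimal illustration, not CocoaPods' actual implementation; the registry map and URL shapes are hypothetical):

```javascript
// Sketch: resolve a dependency by trying each configured source in order.
// The registries map and its URLs are hypothetical stand-ins.
const registries = {
  // Pretend GitHub only hosts "owner/name"-style packages in this sketch.
  gh: (name) => (name.includes("/") ? `https://npm.pkg.github.com/${name}` : null),
  npm: (name) => `https://registry.npmjs.org/${name}`,
};

function resolve(name, sources = ["npm"]) {
  // "sources" defaults to ["npm"], mirroring the opt-in behavior above.
  for (const source of sources) {
    const url = registries[source]?.(name);
    if (url) return { source, url };
  }
  throw new Error(`Unable to resolve ${name} from sources: ${sources.join(", ")}`);
}
```

With the example manifest above, `resolve("danger/danger-js", ["gh", "npm"])` would match the GitHub source first, while `resolve("metro", ["gh", "npm"])` falls through to npm.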
One of the things I currently like about node.js is that when I read the code, I check the import section (`const foo = require('@zkochan/foo')`) and I know that this package is `@zkochan/foo` on npm. I think this is especially valuable during code reviews. I guess there is no way to preserve this?
Regarding the protocol. There is already a `github:` protocol but only for git-hosted packages. For instance, `github:kevva/is-positive`.
> I think this is especially valuable during code reviews. I guess there is no way to preserve this?
It's already not always the case - it can come from npm, but also be a workspace, or a git dependency, or come from the `file:` protocol, or be a peer dependency (in which case all bets are off).
> Regarding the protocol. There is already a `github:` protocol but only for git-hosted packages. For instance, `github:kevva/is-positive`.
Yep, that's why I mentioned `npm+gh:` (npm protocol via GitHub). Repurposing `github:` is a possibility I guess, but it would be breaking unless we try to support both Git and the GitHub registry with the same protocol. Maybe too much work when an extra protocol can do the job 🤔
> For some extra context, how we handled this with CocoaPods was to extend the equivalent of the `package.json` with a new attribute for `source`:
Interesting - I think the problem with this approach is that the packages can't be resolved anymore unless you know the value of the `source` field. For example, in the case of the `resolutions` field we wouldn't have the `source` parameter (or we could reuse the one from the top-level package, but that might be fairly confusing).

It also might cause trouble with third-party tools (for example npm) that wouldn't be aware of the `source` field and would resolve from npm instead of GitHub - which could lead to new attack vectors.
Oh, something I just remembered: we're working on "Zero-Installs" for Yarn v2 (more info here). I remember seeing someone mention that "master will always be available as a package". If that's literally what happens (ALL of the repository is downloaded through the "master" package) it might be a bit problematic for us, since the repository might then contain all the zip archives for the dependencies.
I suspect the @github folks have thought about this (maybe not for zip archives, but at least for other typically-useless-package-files like the tests or the documentation), so I'm curious to hear from @clarkbw or anyone else if there's a mechanism to filter the master package list. Maybe `.npmignore` or `files` field support?
Lots to talk through here. And I want to bring in @phanatic to the conversation as well.
I'm encouraging people to look at how they can publish in parallel at this moment, it's too early in our lifetime for any complete switch over. I want your feedback on what we have now and what an ideal future would be like.
The other piece I think is important to consider is that we intend to open source the server components. We are doing this because we want you to be able to balance client and server complexity. Most frameworks have simple servers and complex clients to do the heavy lifting. An open source server that we share with you means we can build a better solution that doesn't mean working around a limited static server component. The proxy of packages is a good example, I want you to assume we build a Yarn server together such that you could default to yarn packages and namespaces but have the server proxy npm as needed and likely notify the client which ones were proxy packages.
@orta this ☝️ goes for you all (CP) as well, please reach out.
The registry has a base-layer object and metadata storage with GraphQL APIs, which the server components use to store file objects. Server components handle the URL endpoints and client API translation to that GraphQL layer. A future Yarn server could be a lightweight wrapper around the GraphQL APIs.
@arcanis Thanks for raising this. This was one of the first things I thought of when the GitHub announcement happened and I wrote up my thoughts / ideas here: https://gist.github.com/MarshallOfSound/7101ff77c5f981e01362985935790633
I'll summarise them below, though I'd recommend reading through my ramblings in full 😄:
- A mechanism (in `package.json`, an in-place `.npmrc`, or a similar strategy) to define which registry to fetch which package from.
- For security reasons, unrecognised registries defined at a module level should be treated as untrusted unless trusted by the user (think how ssh's `known_hosts` file works). NPM's and GitHub's registries may be trusted by default.
- This isn't something only `yarn` can solve; every other package manager needs to solve the same problem. Whatever solution is reached here should mutually be defined and implemented in `npm` as well. My biggest concern here is scope-sniping and the security of the node module ecosystem, so this needs to work for all package managers.

> The proxy of packages is a good example, I want you to assume we build a Yarn server together such that you could default to yarn packages and namespaces but have the server proxy npm as needed and likely notify the client which ones were proxy packages.
So in summary you would see the GitHub registry as a platform on top of which multiple other registries could potentially be built? So in a sense, the GitHub registry would be one "universe" amongst others (with a "universe" being a set of published packages)? We could make that work through a smart protocol:

```json
"@yarnpkg/cli": "universe(gh): ^1.2.3",
```
The package manager would then look in its configuration to figure out what's the configured registry for the `gh` universe, and use that (or abort the install if the `gh` universe isn't configured). It would work with various use cases:
- Project maintainers would be able to configure their universes to any proxy they'd like.
- Package authors would be able to depend on a package from another universe regardless of where they host their own packages.
- Yarn would be able to preconfigure the `gh` universe to point to GitHub, although users would be able to reconfigure the universe to point to a different registry if they need to.
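A descriptor like `universe(gh): ^1.2.3` could be split apart with a small helper. This is a sketch of the proposed (not existing) syntax; the function names and registry URLs are invented for illustration:

```javascript
// Sketch: split a "universe(gh): ^1.2.3" descriptor into its universe and range.
// The descriptor syntax is the hypothetical one proposed above, not a real Yarn protocol.
function parseUniverseDescriptor(descriptor) {
  const match = /^universe\((\w+)\):\s*(.+)$/.exec(descriptor);
  if (!match) return null; // not a universe descriptor; treat as a plain semver range
  return { universe: match[1], range: match[2] };
}

// A project-level map from universe names to registry URLs (illustrative values).
const universes = {
  gh: "https://npm.pkg.github.com",
};

function registryFor(descriptor) {
  const parsed = parseUniverseDescriptor(descriptor);
  if (!parsed) return "https://registry.npmjs.org"; // default universe
  const registry = universes[parsed.universe];
  // Abort the install when the universe isn't configured, as described above.
  if (!registry) throw new Error(`Universe "${parsed.universe}" isn't configured`);
  return registry;
}
```

The key design point is that the descriptor names a *set of packages* (a universe), while the project configuration decides which concrete registry serves that set.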
This proposal also has some issues, but they have their own solutions:
- It wouldn't be possible to depend on two similarly-named packages from two different universes. If we wanted to do that, the universe would need to be baked within the package name (so instead of `"@yarnpkg/cli": "universe(gh): ^1.2.3"`, we would instead have something like `"@yarnpkg:gh/cli": "^1.2.3"`). This would prevent two universes from having conflicting package names. At the same time, it might be difficult to do that if people start to rely on the universe-less package names, so time would be of the essence 🤔
- There's a risk that package authors will want to create "universe aliases" just for the sake of having their own universe. This could lead to a bad DevX situation where we would have to configure a bunch of universes in order to use a project. We could solve this by disallowing two universes from being configured to the same registry.
> For security reasons unrecognised registries defined at a module level should be treated as untrusted unless trusted by the user. (Think how ssh's `known_hosts` file works).
In the universe concept I mentioned the hosts would be defined in a yarnrc file, so they would never come from the third-party packages themselves.
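For illustration, such a per-project configuration might look like this (a hypothetical `.yarnrc.yml` shape - the `universes` key is invented for this sketch and is not an actual Yarn setting):

```yaml
# Hypothetical .yarnrc.yml fragment: universe names map to registry URLs.
# None of these keys exist in Yarn today; this is just the shape implied above.
universes:
  gh: "https://npm.pkg.github.com"
  corp: "https://npm.corp.example.com"
```

Because this file lives in the project (not in a third-party package), the trust decision always stays with the project maintainers.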
> Whatever solution is reached here should mutually be defined and implemented in npm as well.
It's hard to predict what npm will do (their whole cli team resigned, and the last commits were early March - it's not even clear who from their company should be brought into the discussion).
Given the current state of things, I'm kinda assuming it will be left in a limbo state until proven otherwise; we probably should plan for that. Ping @ahmadnassri who might have some new insight to share 🙂
Warning: total n00b speaking here, with only user-level experience
Dreaming slightly, if we weren't bound by backwards compatibility or other companies in any way, a potential solution would be to make all package names URLs that identify their registry/source/universe, similar to what Go does. For example, packages on the npm registry would be `npmjs.com/foo-bar` and packages on GitHub's registry might be `github.com/markspolakovs/baz-lib/baz`, while an internal registry might be `npm.corp.mycompany.com/quux`. Forces users to be explicit about what registry they're using, as well as making it unambiguous. Also does away with the need for a quasi-`known_hosts`.
For reference, the way Go transforms the "pretty URL" into a package resolution is via an HTML meta tag - for example `github.com/yarnpkg/berry` contains `<meta name="go-import" content="github.com/yarnpkg/berry git https://github.com/yarnpkg/berry.git">`. So `npmjs.com/foo-bar` would have a similar pointer to the `package.json`.
Obviously, this is a massive backward-compat break, and, while 99% of packages are still on npm, the UX isn't pretty. Perhaps a bare `foo` would be aliased to `npmjs.com/foo`?
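That aliasing rule could be as simple as the following sketch of the hypothetical URL-style convention above (not any existing tool's behavior; the bare-name heuristic is an assumption):

```javascript
// Sketch: expand a bare package name into the URL-style name proposed above.
// Heuristic (an assumption for this sketch): a name is "bare" if its first
// path segment contains no dot; scoped names like "@scope/pkg" are also bare.
function expandPackageName(name) {
  const firstSegment = name.split("/")[0];
  const hasHost = firstSegment.includes(".") && !firstSegment.startsWith("@");
  return hasHost ? name : `npmjs.com/${name}`;
}
```

So `foo` becomes `npmjs.com/foo`, while a name that already carries a registry host, like `github.com/markspolakovs/baz-lib/baz`, is left untouched.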
GitHub now proxies to the npm registry if the package doesn't exist in GPR: https://github.blog/2019-09-11-proxying-packages-with-github-package-registry-and-other-updates/
@jgierer12 this is currently limited to your organization's npm packages, not all npm packages.
FYI: npm is joining GitHub, so yea 📦
GitHub Packages might be a possibility
What package is covered by this investigation?
The GitHub Package Registry
Describe the goal of the investigation
To figure out what our actions should be going forward, and how to provide a safe and sound user experience that protects against name squatting.
Should we move to the GitHub Package Registry as default registry?
I've seen this question here and there, so we probably should discuss it.
My opinion is: I don't think we need to change the default registry anytime soon, unless something changes dramatically on the npm side. There are three reasons why I think we should wait:
1. The GitHub registry doesn't mirror the packages from npm afaik, so it's not a replacement for the traditional registry.
2. The GitHub registry uses scopes to define which set of packages belong to it. Those scopes unfortunately conflict with the ones on npm, so switching the default would put users at risk (they would expect to download from npm, but would instead get the GitHub versions).
3. I tend to have a "wait and see" policy for this kind of large-scale change. Even once we've figured out a way to counterbalance the first two points, we will want to make sure the GitHub registry scales properly before enabling it for everyone.
First-class support
Something we need to consider is: should the GitHub registry be one registry amongst many (in the sense that it would piggy-back on the `npm:` protocol), or have first-class support (with a specific protocol, like `npm+gh:`)?

The first case will likely cause developer experience issues (how to depend on a GitHub package from an npm package?); the second doesn't scale very well if we need to do that for all the registries.
My perception is that we need to follow intent. For all purposes, our users will likely choose to depend on a package from one of the two sets of registry: npm or GitHub. Other registries will, I believe, merely be either 1/ mirrors of the first two, or 2/ private npm instances with specific workflows (which will be able to safely enforce the registry configuration for a given scope, for example).
In this light I'd be in favor of `npm+gh:` being a supported protocol (rather than just configuring the registry hostname in the settings). It wouldn't so much define the target hostname, but rather the set of packages we're expected to download.

Use a specific package from the GitHub registry instead of npm
This would become possible with the `resolutions` field:

Possible action points (please discuss)
1. Implement a new `npm+gh:` protocol that would inherit from the npm registry, but would instead target the GitHub registry (probably configurable the same way as the regular npm configuration).
2. Deprecate pure semver dependencies (without protocols). Yarn v2 already supports `npm:^x.y.z`. Npm doesn't, but we can solve it without making changes on their side: pure semver dependencies listed in npm packages can be defined as implicitly using the `npm:` protocol, while pure semver dependencies listed in GitHub packages can be defined as implicitly using the `npm+gh:` protocol.
3. We would probably need to extend the `resolutions` field in order to be able to change the protocol but not the range. Something like that would force Yarn to query the package `foo` from GitHub without modifying its semver range.
4. Should we implement a `yarn gh publish` command (via a new `plugin-github-cli` plugin?) that would always send the package to the GitHub registry? It might be a duplicate of `yarn npm publish` 🤔

Paging @yarnpkg/berry, @bnb, @zkochan, @clarkbw for feedback (anyone else from @GitHub interested?)
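For concreteness, the `resolutions` usages discussed above might look like this (a hypothetical sketch - neither entry is confirmed syntax; `foo` pins a package to the GitHub registry with an explicit range, while the bare `npm+gh:` entry for `bar` illustrates the proposed protocol-only override that keeps the original semver range):

```json
{
  "resolutions": {
    "foo": "npm+gh:^1.2.3",
    "bar": "npm+gh:"
  }
}
```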