rubygems / rfcs

RubyGems + Bundler RFCs

Scoped gems proposal #40

Open mullermp opened 2 years ago

mullermp commented 2 years ago

rendered proposal

Hello RubyGems team, and all those that come across this.

In this PR, I have included a proposal for a feature called "scoped gems". In short, the proposal is to widen the gem naming specification to include a new character, @, that groups related gems together under an organization-reserved suffix. The naming pattern is gem_name@scope. On the first gem push (new record), if the gem is scoped (follows the pattern), the push is accepted only if the user belongs to the organization that reserved the scope. A scoped gem can be installed and required just as normal gems are today.

For example, consider aws-sdk-s3, the S3 gem for AWS. If this gem were scoped, it could be published as s3@aws-sdk (or more generically, <service>@aws-sdk). This gem can only be created by a user in the AWS organization (which has reserved the aws-sdk scope). A user can install this gem with gem install s3@aws-sdk and require it with require 's3@aws-sdk'.
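The naming rule and first-push check could be sketched roughly as follows. This is only an illustration, not RubyGems code: `ScopedName`, `parse_gem_name`, `push_allowed?`, the `RESERVED_SCOPES` table, and the allowed character set are all assumptions made for the example.

```ruby
# Hypothetical value object: not part of RubyGems.
ScopedName = Struct.new(:name, :scope) do
  def scoped?
    !scope.nil?
  end
end

# Assumed character set for each part; the RFC may specify something different.
PART = /[a-zA-Z0-9][a-zA-Z0-9._-]*/
SCOPED_PATTERN = /\A(#{PART})(?:@(#{PART}))?\z/

# Splits "name@scope" into its parts; an unscoped name has a nil scope.
def parse_gem_name(full_name)
  match = SCOPED_PATTERN.match(full_name)
  raise ArgumentError, "invalid gem name: #{full_name.inspect}" unless match

  ScopedName.new(match[1], match[2])
end

# Hypothetical registry mapping reserved scopes to the owning organization.
RESERVED_SCOPES = { "aws-sdk" => "aws", "rails" => "rails" }.freeze

# A scoped gem is accepted only if the pusher's organization reserved the
# scope. Unscoped gems pass through unchanged.
def push_allowed?(full_name, pusher_org)
  parsed = parse_gem_name(full_name)
  return true unless parsed.scoped?

  RESERVED_SCOPES[parsed.scope] == pusher_org
end
```

With these assumptions, `push_allowed?("s3@aws-sdk", "aws")` passes, while the same name pushed by any other organization is rejected.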

The main benefits of this feature are that organizations can publish their own groups of gems (i.e. multiple organizations can have a "configuration" gem), and organizations are able to reserve gem names (via @scope suffix, similar to a reserved prefix). A developer can be reasonably sure that any new gem such as new-cool-feature@rails is an official Rails gem, or new-s3-service@aws-sdk is an official AWS SDK client, or even socket@ruby to be an official stdlib gem! This reservation system combats "fake" "similarly named" gems that are branded as official that attempt to steal personal information.

Please leave any feedback and I would be happy to amend the approach/design.

zarqman commented 2 years ago

I agree with @halostatue that scopes should be part of the name, not a separate scope field on the spec. If there are two gems with the same name, but different scopes, a separate field creates some of the same ambiguities as organizing scoped gems into subdirectories. The only way to avoid this is to join scope+name together in nearly all usage, and if that's the case, it seems better to treat them as one from the start.

Potential places for naming collisions (with . as the example separator):

- Gem.loaded_specs['async.http'] = ...
- ~/async.http.gemspec
- GEM_HOME/cache/async.http-1.2.3.gem
- GEM_HOME/gems/async.http-1.2.3/
- GEM_HOME/specifications/async.http-1.2.3.gemspec
- GEM_HOME/extensions/x86_64/3.0.0/async.http-1.2.3/ (if applicable)
- Gem entrypoint: lib/async.http.rb

In all cases, if async. is missing, async.http cannot be differentiated from faraday.http, etc. since they'd all become simply http.
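As a rough illustration of that collision, the sketch below builds a few of the locations listed above, first with the full scoped name and then with the scope dropped. The helpers `install_paths` and `bare_name` and the `/tmp/gem_home` directory are made up for this example and are not RubyGems APIs.

```ruby
# Hypothetical path builder; the layout mirrors the locations listed above.
def install_paths(gem_home, full_name, version)
  [
    File.join(gem_home, "cache", "#{full_name}-#{version}.gem"),
    File.join(gem_home, "gems", "#{full_name}-#{version}"),
    File.join(gem_home, "specifications", "#{full_name}-#{version}.gemspec"),
  ]
end

# Drops everything up to the first "." to simulate storing only the bare name.
def bare_name(full_name)
  full_name.split(".", 2).last
end

home = "/tmp/gem_home" # placeholder GEM_HOME
names = %w[async.http faraday.http]

with_scope = names.map { |name| install_paths(home, name, "1.2.3") }
without_scope = names.map { |name| install_paths(home, bare_name(name), "1.2.3") }

puts with_scope[0] == with_scope[1]       # false: the paths stay distinct
puts without_scope[0] == without_scope[1] # true: both gems land on "http-1.2.3"
```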

andrewhavens commented 2 years ago

I think scoped gems is an interesting idea, but I'm not sure if there is a way to support it without changing the Ruby language itself. In other languages, you can import different packages with the same name and scope them within the file you are working in. In Ruby, gems are essentially globally namespaced.

The problem that was raised in this proposal about wanting to use a forked version is already achieved through the use of bundler:

# Gemfile
gem 'mail', github: 'rails/mail'

This makes it clear that we are using a forked version of a gem. Thus all gems that have a dependency on mail will be forced to use this specific fork.

If this definition were pushed down to the gemspec level, this would make things very complicated, and even dangerous. Let's say Rails wants to depend on a forked version of a gem using something like gem.dependency '@rails/mail', but what happens when another gem also has a dependency on the same gem? Does Rails get to decide that it has priority simply because it specified a specific username/org? This would open up the possibility of a gem specifying a malicious version as a dependency that takes precedence over the normal version.

So, I think this is already achieved in a reasonable way using Bundler. Might be nice to have an easy way to be able to download a gem from GitHub without having to clone it. Like gem install @rails/mail but that seems like a separate issue. I agree though that @username/gemname should be the format since it is the most intuitive.

ioquatix commented 2 years ago

I think the value as a gem maintainer I see in scopes is two things:

I've run into both of the above problems. Both of them are about predictability and risk management.

The biggest problem I see is:

My feeling is, organisations or scopes should not change the name resolution process we already have, but instead provide a better more secure way for users to procure gems.

If we are thinking big picture, I'd suggest:

source "https://rubygems.org/@rack" do
  gem "rack"
end

source "https://rubygems.org/@rails" do
  gem "rails" # -> depends on "rack" which is satisfied only by the current listed sources, e.g. [@rack, @rails]
end

source "https://rubygems.org" # general global index

gem "rando-whatever" # can pull in from [@rack, @rails, global index] in that order.

The good thing about this model is it allows you to fork a gem (as rails did with mail) and plug it in as a named dependency without breaking dependency resolution (because you'd need a different name to push it to rubygems.org).

This design doesn't require any changes to name handling, and I don't think we should change the name handling because it will break every system that depends on name-based dependency resolution etc.

Based on my above suggestions, it would not be possible to install both ~ioquatix/async and @socketry/async and that's by design because it's super confusing and I don't think scopes should be involved in final name resolution, but they are more of a feature of how to organise dependency management and gem fetch/installation.
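A toy model of that lookup order might look like the following. It is not how Bundler resolves sources: `Source`, `resolve`, and the gem lists are invented for illustration, and it assumes the @rails source publishes its fork under the unchanged name mail.

```ruby
# Hypothetical in-memory model of gem sources; not Bundler's real resolver.
Source = Struct.new(:label, :gems) do
  def provides?(gem_name)
    gems.include?(gem_name)
  end
end

# Returns the label of the first source that provides the gem, or nil.
def resolve(gem_name, sources)
  found = sources.find { |source| source.provides?(gem_name) }
  found&.label
end

# Sources are listed in priority order: scoped sources first, global index last.
sources = [
  Source.new("@rack", ["rack"]),
  Source.new("@rails", ["rails", "mail"]), # assumed: the fork keeps the name "mail"
  Source.new("global", ["rack", "rails", "mail", "rando-whatever"]),
]

puts resolve("mail", sources)           # @rails
puts resolve("rando-whatever", sources) # global
```

Because the name never changes, only one gem called mail can end up in the resolved set, whichever source it came from.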

ghost commented 2 years ago

^ So, basically, hooray to more confusion just so you could scratch your ego. K.I.S.S.? Never heard of it.

indirect commented 2 years ago

@andy-tycho that kind of "feedback" isn't okay here. consider yourself warned—next comment like that gets you blocked from the RubyGems org for at least a week.

indirect commented 2 years ago

I don't think we should provide a scope mechanism that allows differently-named gems to provide the same global Ruby constants. I survived GitHub's original gem server, and I still have scars from trying to use an app whose gems depended on both tenderlove-nokogiri and nokogiri, which both claimed the constant Nokogiri. Bundler also can't help in that situation, because the gems have different names, different versions, and different dependency trees. In my opinion, this RFC needs a clear solution to that problem to move forward.
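To make the collision concrete, here is a small sketch that flags constants claimed by more than one gem. The `CLAIMS` table is hand-written sample data and `constant_conflicts` is a hypothetical helper; RubyGems does not track which constants a gem defines.

```ruby
# Hypothetical data: which top-level constants each gem defines.
CLAIMS = {
  "nokogiri" => ["Nokogiri"],
  "tenderlove-nokogiri" => ["Nokogiri"],
  "rack" => ["Rack"],
}.freeze

# Returns { constant => [gem names] } for constants claimed by more than one gem.
def constant_conflicts(claims)
  owners = Hash.new { |hash, key| hash[key] = [] }
  claims.each do |gem_name, constants|
    constants.each { |constant| owners[constant] << gem_name }
  end
  owners.select { |_constant, gems| gems.size > 1 }
end

p constant_conflicts(CLAIMS)
```

Name-based resolution cannot catch this case, since the two gems have different names and are resolved independently.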

To me, @ioquatix's proposal to treat orgs as additional gem sources sounds like the most likely to work under the constraints we have today. For example, gems that depend on mail will continue to work whether mail comes from the global source or the @rails source, and Bundler can ensure there is only one gem named mail claiming the Mail constant.

bkuhlmann commented 2 years ago

Hey everyone. :wave:

After reading through the RFC and this discussion, I want to add some thoughts/observations in hopes that this enriches the discussion (although I might be somewhat counter to André's concerns above -- maybe because I'm missing context from the earlier days with GitHub's original gem server):

Gem Specification Scopes
In terms of gem specifications, I want to focus specifically on the use of `scope` -- as described in the [RFC](https://github.com/mullermp/rfcs/blob/master/text/0008-scoped-gems.md). I'd like to emphasize the importance of this within the gem specification as a new field:

``` ruby
Gem::Specification.new do |spec|
  spec.scope = "dry" # This is important for many reasons which I'll highlight shortly.
  spec.name = "monads"
  # Truncated for brevity.
end
```

As the author and maintainer of [Gemsmith](https://www.alchemists.io/projects/gemsmith) -- a gem for building gems -- this would allow organizations and individual contributors to configure this information once via Gemsmith's [XDG](https://www.alchemists.io/projects/xdg) configuration. This equates to being able to build a gem as follows:

``` bash
# Uses global scope as exists today or pulls scope from XDG configuration (if configured).
# This is a nice productivity boost when building multiple gems within the same scope.
gemsmith --build demo

# Uses custom scope which overrides any XDG configured local or global scope.
# Definitely tedious when creating multiple gems within the same scope -- if not using an XDG configuration -- but handy for one-time overrides.
gemsmith --build monads --scope dry
```

This also means that Gemsmith -- and Bundler -- wouldn't have to add special logic for parsing a gem name -- at creation -- by splitting `dry@monads` into `dry (gem scope)` and `monads (gem name)`. Even better, we improve the developer experience for creating new gems by not forcing someone to have to type this:

``` bash
gemsmith --build dry@monads
```

At (@) Symbol Avoidance
Building upon what I've demonstrated above, I'd like to push for avoiding the use of the at symbol (@) within the gem name, package, and URL altogether for the following reasons:

- Use of `@` is backwards, awkward, and not intuitive which many have pointed out already.
- Use of `@` is better but -- as **Dan** pointed out earlier -- feels more like an email address which confuses me as well.
- As **Maciej** mentioned earlier, use of `@` is not entirely compatible with the [Package URL Specification](https://github.com/package-url/purl-spec) and would best be reserved for version information.
- Use of `@` in the pathname also feels awkward and non-intuitive to me. Example: `$HOME//3.1.2/lib/ruby/gems/3.1.0/gems/dry@monads-1.4.0`.
- Use of `@` in the URL doesn't make sense either. Example: `https://rubygems.org/gems/dry@monads`

I'd like to suggest an alternative, which is to keep scope information in the gem specification. Then both Bundler and RubyGems would be able to do the following:

**Paths (gem installation and management)**

``` bash
# Global scope as exists today.
$HOME//3.1.2/lib/ruby/gems/3.1.0/gems/dry-monads-1.4.0

# Scoped as being proposed.
# NOTE: `@` is removed in favor of using `dry` as a scoped directory structure.
$HOME//3.1.2/lib/ruby/gems/3.1.0/gems/dry/monads-1.4.0
```

:warning: There are definitely complications with this approach that I'm glossing over as **Thomas** has detailed [here](https://github.com/rubygems/rfcs/pull/40#issuecomment-1119921834) but I think they are surmountable.

**URLs (gem lookup)**

```
# Global scope as exists today.
https://rubygems.org/gems/dry-monads

# Scoped as being proposed.
# NOTE: `@` is removed in favor of using `dry` as a scoped directory structure.
https://rubygems.org/gems/dry/monads
```

ℹ️ In all of the above use cases -- and as emphasized in the RFC -- the gem namespace would remain the same regardless of using global or specialized scope. Example:

``` ruby
module Dry
  module Monads
  end
end
```

The only difference is how Bundler finds and resolves the gem locally (i.e. either using the scope if defined or falling back to global if not) and how RubyGems lists the gem in the URL (which also depends upon the gem specification).

Graceful Degradation, Soft Forking, and Migration
So far everything I've been proposing allows for graceful degradation, soft forking, and gem transition support. By this, I mean gems can exist as they are today with support for scoped coexistence while falling back to the existing and established format. To summarize:

**Gem Specification**

``` ruby
# Valid
Gem::Specification.new do |spec|
  spec.name = "dry-monads"
  # Truncated for brevity.
end

# Valid
Gem::Specification.new do |spec|
  spec.scope = "dry"
  spec.name = "monads"
  # Truncated for brevity.
end
```

**Paths**

``` bash
# Valid (global)
$HOME//3.1.2/lib/ruby/gems/3.1.0/gems/dry-monads-1.4.0

# Valid (scoped)
$HOME//3.1.2/lib/ruby/gems/3.1.0/gems/dry/monads-1.4.0
```

:warning: Keep in mind two formats of the same gem version *would not be allowed*. I'm only showing the same version for path comparison purposes.

**URLs**

```
# Valid (global)
https://rubygems.org/gems/dry-monads

# Valid (scoped)
https://rubygems.org/gems/dry/monads
```

All of this means that you can do the following:

- Gracefully degrade to global scope if a custom scope isn't provided.
- Allow gems to be soft forked by using `my_scope/monads` as a temporary quick fix while the main gem catches up.
- Allow existing gems to migrate to the new scoped format by releasing a new version which adds the `scope` to their gemspec.
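
The fallback from scoped to global could be sketched like this. `SpecSketch` is a hypothetical stand-in, since `Gem::Specification` has no `scope` field today, and the `/gems` root is a placeholder rather than a real installation path.

```ruby
# Hypothetical stand-in for a gemspec with an optional scope.
SpecSketch = Struct.new(:name, :version, :scope, keyword_init: true) do
  # "dry/monads" when scoped; an unscoped name such as "dry-monads" is unchanged.
  def qualified_name
    scope ? "#{scope}/#{name}" : name
  end

  # Directory the gem would be unpacked into under the given gems root.
  def install_dir(gems_root)
    File.join(gems_root, "#{qualified_name}-#{version}")
  end

  # Lookup URL following the scoped directory structure described above.
  def url
    "https://rubygems.org/gems/#{qualified_name}"
  end
end

global = SpecSketch.new(name: "dry-monads", version: "1.4.0")
scoped = SpecSketch.new(name: "monads", version: "1.4.0", scope: "dry")

puts global.install_dir("/gems") # /gems/dry-monads-1.4.0
puts scoped.install_dir("/gems") # /gems/dry/monads-1.4.0
puts scoped.url                  # https://rubygems.org/gems/dry/monads
```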

None of what I've written above addresses the name squatting problem, though. That is still a complication which has been mentioned in this discussion but probably warrants a different proposal.

indirect commented 2 years ago

@bkuhlmann I think your suggestion is aligned on the end goals: scopes need a way to avoid global namespace conflicts. 👍🏻 I might have missed it, but I didn’t see anything in your post to address “scoped forks”, like the Rails org creating their own Mail gem that is an alternative/replacement for the global Mail gem. How would you handle that?

ghost commented 2 years ago

I didn’t see anything in your post to address “scoped forks”, like the Rails org creating their own Mail gem that is an alternative/replacement for the global Mail gem. How would you handle that?

Does it need handling, though? Is this a real issue? Or just a mild annoyance that does not actually happen more than once a decade but still just sits there at the back of the mind periodically nagging "this just does not look good"? How many developers, at a scale, at this very moment see this as a real problem that prevents them from working productively or seriously affect their everyday development? A dozen? A handful? Thousands and thousands of gem authors and gem users don't seem to see it as an issue. Which brings it into a category of "for some developers, this would be nice to have, but is not something the community can't live without".

My point in all of this is: if something adds new useful functionality, please do go for it, as long as it does not change or demand changes to existing functionality or process. The discussed subject does not look like something that warrants or justifies a change in the established process. If you want to add something that would make a handful of Ruby developers out there feel happier, yay, as long as the rest of the community can keep on working and living without even needing to know about this change. Otherwise, nay. Do think twice^10 before accepting a change that would require thousands to alter their processes.


I must as well note that I am fairly surprised that such a discussion happens without @matz in sight. And @dhh. And @ko1. And @tenderlove, @hone, @amatsuda, @tmm1, @kobaltz, and all the other prominent locomotives and shapers of the Ruby world. People whose opinion does matter. I believe that these folks have, too, encountered such a question at some point in their work, and surely have an opinion on the subject. It would just be splendid to hear those as well.

indirect commented 2 years ago

@andy-tycho Baseless rhetorical claims that the problem is rare (which are empirically false, by the way) are not the kind of feedback we are looking for. An entire GitHub account with a fake profile picture just so you can post contentless angry negative comments about this RFC is not acceptable behavior. You were warned, and now you're out.

schmijos commented 2 years ago

I'm late to the conversation, but couldn't we add an organization attribute to Gem::Specification? And then modify gem install to allow an organization attribute?

I'm really not following the purpose of the '@' for scoping, and why the org name wouldn't be enough scoping.

@halostatue I agree with @djberg96 mainly for the reason that we already have got possibilities to "scope" gems. We can already do git, github, ref, branch and whatsoever.

mullermp commented 2 years ago

I must as well note that I am fairly surprised that such a discussion happens without @matz in sight. And @dhh. And @ko1. And @tenderlove, @hone, @amatsuda, @tmm1, @kobaltz, and all the other prominent locomotives and shapers of the Ruby world.

I would love for the "prominent locomotives and shapers of the Ruby world" to comment and contribute to this RFC!

kobaltz commented 2 years ago

I think that this could be a good change for the future of Ruby as a whole. My main concern is backwards compatibility with existing applications. Though, they're probably running an older version of Ruby and rubygems anyways, so it likely wouldn't matter as long as the API was backwards compatible.

Ultimately, what is the goal of scoped gems and what problem is it solving? Based on the conversations above, it looks clear that the scoped gems is providing "confidence" that a consumed gem is from a certain organization. If this is the goal, then sure, it is moving in the right direction.

However, there were other mentions of consuming potentially malicious gems and squatting on names. Sure, this would help combat squatting on gem names, as names are now scoped. However, I'd push back on the malicious gems bit. Someone could create a fake org scope like hotwire instead of hotwired and then publish something malicious there. At a glance, it may look legit. If this is a main reason for scoped gems, I don't think it will solve the problem that it is aiming to solve. Although, if there is a requirement for an organization to be verified in order to gain access to scoped gems, then we may be on track to having more legitimacy in the new convention.
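One way a registry might flag lookalike scopes such as hotwire versus hotwired is an edit-distance check at reservation time. This is only a sketch of that idea, not something the RFC proposes; `edit_distance`, `lookalike_scopes`, and the threshold of one edit are assumptions for the example.

```ruby
# Classic dynamic-programming Levenshtein distance between two strings.
def edit_distance(a, b)
  previous = (0..b.length).to_a
  a.each_char.with_index(1) do |char_a, i|
    current = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      current << [previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost].min
    end
    previous = current
  end
  previous.last
end

# Returns reserved scopes within `max_distance` edits of the candidate,
# excluding an exact match (hypothetical policy).
def lookalike_scopes(candidate, reserved, max_distance: 1)
  reserved.select do |scope|
    scope != candidate && edit_distance(candidate, scope) <= max_distance
  end
end

p lookalike_scopes("hotwire", %w[hotwired rails aws-sdk]) # ["hotwired"]
```

A check like this would only surface candidates for review; deciding which organization legitimately owns a scope would still need the verification step mentioned above.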

As far as the naming convention goes, I'll go with the flow. I don't have a preference on gem@scope or scope@gem or @scope/gem. gem@scope does give a more normal feel as it is similar to user@server.

mullermp commented 2 years ago

I want to thank everyone for providing their feedback and perspectives on this. I think the next step here is to consolidate/parse the feedback and determine what changes are needed. At a glance, the idea seems to be overwhelmingly positive, but the approach (naming and usage) is mixed. I understand that we can't please everyone. Hopefully we can strike a happy medium here.

bkuhlmann commented 2 years ago

André: I didn’t see anything in your post to address “scoped forks”, like the Rails org creating their own Mail gem that is an alternative/replacement for the global Mail gem. How would you handle that?

Yeah, fair point. I don't address that very well and I'm not sure I have a good answer other than what I commented on earlier and what Samuel mentions in his comment (i.e. the Rails Mail gem example) without thinking through the directory and URL path design a bit more (as well as eliciting more feedback). I agree there are caveats to think through and address better. Something that would be a huge help is to see the RFC be brought up-to-date with the current discussion so, at a high level, everyone is back on the same page and can help progress the design even further.

Matt: Maybe you can update your RFC -- if you are not already in the process of doing this -- to detail the directory path design as mentioned in these comments and discussion? If your RFC was brought up-to-date with the current discussion then it'd be easier to iterate on this a bit more?

ioquatix commented 1 year ago

Would it be helpful for me to write an RFC for the proposal I outlined too? Even if it's just as a counterpoint?