rubygems / rfcs

RubyGems + Bundler RFCs

[Bundler] RFC Proposal: “include gem as” feature #54

Open byroot opened 3 months ago

byroot commented 3 months ago

Context

A problem that has always existed, but that of course grows as time passes, is abandoned gems. You may depend on a gem and one day, when trying to upgrade Ruby or another gem, realize it now has a very simple compatibility issue, but the original maintainer is no longer active, so the gem will probably never receive an update.

If the gem is a “leaf”, meaning no other gems depend on it, it’s easy to just fork it and publish it under another name. However, if other gems declare a dependency on it, the solutions are much more limited. You can reference a git fork and include the gem that way, which works but tends to be a lone solution rather than a community one. It’s rare, and not really recommended, to point your Gemfile at someone else’s fork of a repository.

Another possibility is to try to reclaim that gem name to become the new maintainer, but it’s slow and rarely works.

Proposal

I believe this community problem could largely be eased if it were possible to declare in your Gemfile that a gem is standing in for another one.

gem "console-formatter" # has a dependency on "left-pad"
gem "left-padder", as: "left-pad"

In the above example we assume that left-pad, a very useful and popular gem that is pulled as a dependency by dozens of bigger gems, is no longer working with a future version of Ruby, and the maintainer is no longer active.

I, as another open source contributor, can fork that gem, give it another name (left-padder) and then instruct users that they can replace left-pad with it, by specifying it in their Gemfile.

If the situation persists and the left-pad maintainer remains inactive, the various gems that depend on left-pad can progressively decide to update their dependency declaration, but in the meantime users have a way out of the problem, and are unblocked.
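To make the substitution concrete, here is a toy sketch in plain Ruby of how dependency names might be rewritten from an alias table. It is not Bundler's resolver: `ALIASES`, `DECLARED_DEPENDENCIES`, and `effective_dependencies` are hypothetical names used only for illustration.

```ruby
# Toy model only: Bundler's real resolver works very differently.
# Maps the name other gems ask for to the gem that stands in for it,
# as `gem "left-padder", as: "left-pad"` would declare.
ALIASES = { "left-pad" => "left-padder" }.freeze

# Hypothetical dependency declarations, as they would appear in gemspecs.
DECLARED_DEPENDENCIES = {
  "console-formatter" => ["left-pad"],
  "left-padder" => [],
}.freeze

# Returns the dependencies of +gem_name+ with aliased names substituted.
def effective_dependencies(gem_name, aliases: ALIASES)
  DECLARED_DEPENDENCIES.fetch(gem_name).map { |dep| aliases.fetch(dep, dep) }
end

p effective_dependencies("console-formatter") # prints ["left-padder"]
```

Gems that are not listed in the alias table are passed through unchanged, so the override only affects the one name the application developer chose to replace.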

Additional Use Cases

Such a feature would also allow for secondary use cases.

For instance, maintaining alternative versions of active gems that include features or changes the maintainer doesn’t wish to include upstream.

Another use case is community-maintained versions of EOL releases. If left-pad is currently at version 2, and version 1 is EOL but a large community of users is still on version 1, they can maintain a left-pad-lts gem if they want.

indirect commented 3 months ago

Some form of this idea has been floating around for a long time, and I think we need more details to know whether this seems worth the effort that would be required. The RFC template document contains a list of the kinds of questions that we would want answered in order to consider accepting an RFC for this feature. Could you add answers to those questions?

The biggest worry that comes immediately to mind for me is "how do we keep a feature like this from harming maintainers?" For example, if I maintain left-pad, but someone else is using left-padder, how do we ensure that errors are reported to left-padder and not to left-pad?

byroot commented 3 months ago

Could you add answers to those questions?

Many don't seem to really apply here, but let's try:

Why are we doing this? What use cases does it support? What is the expected outcome?

I think I mostly covered this already. I want to make it less of a pain for the community when a project stops being maintained but remains widely used.

Why should we not do this?

I can't think of any reason not to, aside from the implementation challenge if it turns out to be really tricky to implement.

Why is this design the best in the space of possible designs?

I don't really see any other solutions to this problem.

What other designs have been considered and what is the rationale for not choosing them?

I haven't considered any other design, because I can't think of any other.

What is the impact of not doing this?

The Ruby community will continue to have to fight whenever a popular but no longer maintained gem stops working. I could give examples, but I really don't want to look like I'm pointing fingers. It's totally fine for a maintainer to just move on, they don't owe their users anything.

What parts of the design do you expect to resolve through the RFC process before this gets merged?

I think the design is quite simple, so I don't really see much needed discussion on it. It's more about wanting or not wanting this feature, and how to implement it.

The biggest worry that comes immediately to mind for me is "how do we keep a feature like this from harming maintainers?" For example, if I maintain left-pad, but someone else is using left-padder, how do we ensure that errors are reported to left-padder and not to left-pad?

It would probably happen once in a while, but for the most part:

byroot commented 3 months ago

the various gems that depend on left-pad can progressively decide to update their dependency declaration, but in the meantime users have a way out of the problem, and are unblocked.

So one thing I overlooked a bit here is that while this process is going on, some users may have both left-pad and left-padder pulled in as transitive dependencies, so they'll likely run into a name clash until they explicitly add gem "left-padder", as: "left-pad" to their Gemfile.

So this part of the workflow isn't exactly seamless, but I can't really think of a solution here. Perhaps the gemspec could include a list of "conflicting" gems like .deb has, so that bundler can give a clearer error. But adding to gemspec is I believe much more involved than "just" adding an option to bundler.
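The "conflicting gems" idea can be sketched in a few lines. This assumes a hypothetical `conflicts` list on each spec (no such gemspec field exists today) and a hypothetical `check_conflicts!` helper; it only shows how a clearer error could be produced.

```ruby
# Hypothetical spec shape: the real Gem::Specification has no `conflicts` field.
Spec = Struct.new(:name, :conflicts)

class GemConflictError < StandardError; end

# Raises a descriptive error if a gem in the bundle declares a conflict
# with another gem that is also present in the bundle.
def check_conflicts!(specs)
  names = specs.map(&:name)
  specs.each do |spec|
    clash = spec.conflicts & names
    next if clash.empty?

    raise GemConflictError,
          "#{spec.name} conflicts with #{clash.join(', ')}; " \
          "keep only one of them in the bundle"
  end
end

bundle = [
  Spec.new("left-pad", []),
  Spec.new("left-padder", ["left-pad"]),
]

begin
  check_conflicts!(bundle)
rescue GemConflictError => e
  puts e.message
end
```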

indirect commented 3 months ago

Maybe I am not as optimistic as you, but when you say "while the process is going on", it sounds to me like you are describing how things will end up permanently in the end. I imagine most applications will eventually have 2 gems that depend on left-pad, 3 gems that depend on left-padder, and 1 gem that depends on left-pad-lts.

That means you, the application developer, are personally responsible for troubleshooting the unexpected interactions between those 9 gems, and the way that all three left-pad gems work differently. The only escape I know of is for the application developer to fork those gems and update them to all use the same dependency. And that is already the current situation, without this feature existing.

Since it seems impossible to know right now if I am right, what if we tried it as an experiment? For example, we could add plugin hooks to Bundler to allow this feature to be offered as a plugin, or we could ship this feature behind a setting named BUNDLER_UNSTABLE_ALPHA_GEM_REPLACEMENT_MAY_DISAPPEAR_ANY_TIME or something like that.

Hopefully having some users actually try it out would give us enough information to know if we should release the feature widely or if changes need to be made for that to be a good idea.

simi commented 3 months ago

@byroot the left-pad allusion is clear, but would you mind sharing a real problem this is going to solve? From history, I mostly remember only the delay in the mail gem's Ruby 3.1 support (if I remember correctly, it was just fixing warnings).

My opinion of this feature is negative, since IMHO we should prefer a dependency update and release as the way to fix the problem, which works well in the RubyGems.org ecosystem and keeps it "healthy" without a dramatic fork rate. And if a dependency you rely on is not maintained, clearly you should not use it.

A feature like this must be the last resort for solving the problem, not the easiest one. If introduced, I would prefer to make it not easy to enable and to leave it annoying when kept as a permanent solution. A strangely long ENV variable plus a huge warning on each bundle operation could be a good starting point.
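One way such gating could look, as a sketch: the replacement is ignored unless an opt-in environment variable is set, and a warning is printed whenever it is applied. The variable name is the one floated earlier in this thread; `replacement_enabled?` and `apply_replacement` are hypothetical helpers, not Bundler APIs.

```ruby
# Name floated in this thread; not an actual Bundler setting.
FLAG = "BUNDLER_UNSTABLE_ALPHA_GEM_REPLACEMENT_MAY_DISAPPEAR_ANY_TIME"

# True only when the user has explicitly opted in.
def replacement_enabled?(env = ENV)
  env[FLAG] == "true"
end

# Returns the gem name to use, warning loudly when a replacement is applied.
def apply_replacement(requested, replacement, env: ENV, out: $stderr)
  return requested unless replacement_enabled?(env)

  out.puts "WARNING: experimental gem replacement active: " \
           "#{requested} -> #{replacement}. This feature may disappear at any time."
  replacement
end

puts apply_replacement("left-pad", "left-padder", env: {})
puts apply_replacement("left-pad", "left-padder", env: { FLAG => "true" })
```

Passing `env:` and `out:` explicitly keeps the sketch easy to exercise without touching the real process environment.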

Btw. similar discussions have already happened a few times at https://github.com/rubygems/rubygems/issues/1746, https://github.com/rubygems/bundler/issues/1549, https://github.com/rubygems/bundler-features/issues/20 and https://github.com/rubygems/bundler/issues/4552.

byroot commented 3 months ago

The only escape I know of is for the application developer to fork those gems and update them to all use the same dependency.

Well, no, because gem as: allows solving this without forking.

what if we tried it as an experiment?

If the feature was available as a plugin or behind some feature flag I'd happily use it and report back any pain points etc.

would you mind sharing a real problem this is going to solve?

Alright, I didn't want to point any fingers, but here are a few examples:

But more generally, whenever a gem is no longer maintained, it creates a mess, and if it breaks on a newer version of Ruby or of another gem, it creates problems for the whole community and can be a source of stagnation (e.g. Ruby giving up on a positive change because it breaks important but no longer maintained gems).

Also importantly, in the next couple years, Ruby is scheduled to default to frozen string literals, and I fear this will unfortunately uncover a number of abandoned gems, and I'd like for the community to have a way to handle this, and for new maintainers to emerge.

we should prefer a dependency update and release as the way to fix the problem, which works well in the RubyGems.org ecosystem and keeps it "healthy" without a dramatic fork rate.

Do we have examples of this working properly before? How many times have abandoned gems been handed over to new maintainers by the rubygems team?

If tomorrow morning I get hit by a bus, a large number of important gems will have no maintainer any more, and no-one will be able to update them so they keep working. That's a problem.

byroot commented 3 months ago

Also, a simpler implementation of this feature could be to just mark a gem as "de-activated", e.g.:

gem "some-gem-that-depend-on-left-pad"
gem "left-pad", disabled: true
gem "left-padder"

disabled: true would:

This would essentially provide the same capabilities.

simi commented 3 months ago

... unicorn ...

Yes, I'm aware of this whole situation. The gem could still be released without MFA, but that is not happening. This gem is clearly in a strange state. The current maintainer is even claiming the gem is not recommended (I would read that as deprecated). If there's no way to help maintain this gem, clearly a fork is the way to go. Luckily this gem is not part of gem dependencies and should be easy to fork and migrate projects to.

... httpclient ...

It seems there is some activity by the maintainer and things are slowly moving forward. AFAIK there is still enough time before the Ruby 3.4 release. Indeed, it would be great to provide a PR removing the whole CA bundling stuff to make it easily maintainable in the future.

But more generally, whenever a gem is no longer maintained, it creates a mess, and if it breaks on a newer version of Ruby or of another gem, it creates problems for the whole community and can be a source of stagnation (e.g. Ruby giving up on a positive change because it breaks important but no longer maintained gems).

Let's fight this with actions that prevent it from happening. If a gem is no longer maintained, IMHO you should not use it. If RubyGems/Bundler provides an easy way to replace gems with alternative ones, IMHO that would be a mess, since forks and alternative releases will start to be randomly pushed and used with various levels (IMHO usually none) of maintenance.

Do we have examples of this working properly before? How many times have abandoned gems been handed over to new maintainers by the rubygems team?

I don't remember any, actually. I mean, the Ruby ecosystem seems healthy enough on its own (without RubyGems.org team intervention) to prepare the majority of used gems before, or quickly after, a new Ruby version is released.

If tomorrow morning I get hit by a bus, a large number of important gems will have no maintainer any more, and no-one will be able to update them so they keep working. That's a problem.

That's clearly a problem and you should invite more maintainers to those projects. If I remember correctly, something similar is checked by https://securityscorecards.dev/#the-checks. We could potentially scan for gems in a similar situation (reaching some download threshold while having 1 maintainer) and suggest that the owners invite more people. Currently, in a similar situation, we would receive a request to add a maintainer to the gem, which is first validated and then somehow resolved by the RubyGems.org support team.
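The scan described here could be as simple as the filter below. The records, the threshold, and the `at_risk` helper are all made up for illustration; a real version would have to read download and owner counts from RubyGems.org.

```ruby
# Made-up records; real data would come from RubyGems.org.
GemStats = Struct.new(:name, :downloads, :owners)

# Names of gems at or above the download threshold that have a single owner.
def at_risk(gems, min_downloads:)
  gems.select { |g| g.downloads >= min_downloads && g.owners.size == 1 }
      .map(&:name)
end

gems = [
  GemStats.new("popular-solo", 5_000_000, ["alice"]),
  GemStats.new("popular-team", 9_000_000, ["bob", "carol"]),
  GemStats.new("tiny-solo", 1_200, ["dave"]),
]

p at_risk(gems, min_downloads: 1_000_000) # prints ["popular-solo"]
```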

byroot commented 3 months ago

Luckily this gem is not part of gem dependencies and should be easy to fork and migrate projects to.

It is, See: https://rubygems.org/gems/unicorn/reverse_dependencies and https://github.com/unicorn-ruby/unicorn/issues/1

It seems there is some activity by the maintainer and things are slowly moving forward.

I hadn't noticed that a few things got merged 3 months ago, but there is still no release in sight and the issue has been going on for years now.

If a gem is no longer maintained, IMHO you should not use it

Easier said than done when there is often a whole dependency tree pulling it in. For some important gems it means dozens of other gems coordinating etc. It would be much better for everyone if taking over maintenance under another name was simplified.

IMHO that would be a mess, since forks and alternative releases will start to be randomly pushed and used with various levels

I don't see how it's much different from deciding to depend on a new gem. You look at the maintainer and their track record and decide if you trust them.

I mean, the Ruby ecosystem seems healthy enough on its own

I'm not saying it's unhealthy. But to give you context, I take care of testing Shopify's monolith against ruby-head and clearing compatibility issues with our 700+ transitive dependencies. This is a very frequent problem for me. I believe you may not notice it, but at any point in time I have dozens of pull requests open on repos in various states of maintenance. Most eventually get merged, but when one isn't, it becomes a mess of contributing to reverse dependencies instead, to try to eliminate the package.

That's clearly a problem and you should invite more maintainers to those projects.

Competent and trusted maintainers don't exactly grow on trees. But to be very frank with you, this answer annoys me a bit, because I point out a problem, propose a solution for when it happens, and I'm basically told the problem doesn't exist, and that if it existed it would be my fault...

To try to come at the problem from another angle: I, as the developer of an application with a Gemfile, would like the capability to override some gemspecs. It's great that gems automatically pull in their dependencies, but I don't see why I'm simply not allowed to override that if I want to, for whatever reason.

simi commented 3 months ago

But to be very frank with you, this answer annoys me a bit, because I point out a problem, propose a solution for when it happens, and I'm basically told the problem doesn't exist, and that if it existed it would be my fault...

I'm sorry about this, it wasn't my intention. I maintain a big project (I mean big Gemfiles) and I do understand this problem exists. I'm just afraid everyone will roll their own fork at the first problem, and I'm trying to find a balance between those two approaches.

I'm not saying it's unhealthy. But to give you context, I take care of testing Shopify's monolith against ruby-head and clearing compatibility issues with our 700+ transitive dependencies. This is a very frequent problem for me. I believe you may not notice it, but at any point in time I have dozens of pull requests open on repos in various states of maintenance. Most eventually get merged, but when one isn't, it becomes a mess of contributing to reverse dependencies instead, to try to eliminate the package.

In this case, would the proposed solution (keep the feature behind some strange ENV variable and provide warnings or so) work for you for this use case? That way it could potentially prevent misuse of this feature.

byroot commented 3 months ago

In this case, would the proposed solution (keep the feature behind some strange ENV variable and provide warnings or so) work for you for this use case?

It would solve the problem for me yes, I'd like to also solve it for the community, but if you don't want to I'll take what I can get.

simi commented 3 months ago

It is, See: https://rubygems.org/gems/unicorn/reverse_dependencies and https://github.com/unicorn-ruby/unicorn/issues/1

It seems there are 2-3 main gems which could be forked next to unicorn. 🙏

simi commented 3 months ago

It would solve the problem for me yes, I'd like to also solve it for the community, but if you don't want to I'll take what I can get.

Clearly we would all like to solve this problem for everyone, but we need to find out how. IMHO this could be a good starting point. By "I'd like to also solve it for the community", is your intention to maintain forks (but keeping the name) so everyone can use them using "overriding feature"?

byroot commented 3 months ago

is your intention to maintain forks (but keeping the name) so everyone can use them using "overriding feature"?

Yes, and also to allow others to do the same.

Basically the capability I'm asking for already exists via the gem git: feature, and it's not rare for other companies to run Shopify forks of gems to solve this class of problems. But as you know this isn't great because:

But more importantly, git gems don't open the door to reverse dependencies migrating to the new maintained fork.

simi commented 3 months ago

@byroot yes, I agree git gems are not best practice.

I take care of testing Shopify's monolith against ruby-head and clearing compatibility issues with our 700+ transitive dependencies.

This is very noble of you and Shopify, since the whole ecosystem really benefits from this, and clearly we should find a way to make it as easy as possible to continue.

What about scoped gems? Would that work for your use case? Let's say gems are scoped like on npm, and you can push a gem under the same name into the Shopify scope and specify it in the Gemfile like gem 'httpclient', scope: 'Shopify'. Would that solve your particular problem, and do you think that would make it simpler for everyone?

I'm really afraid of getting into a situation where there are a bunch of "random" gems like httpclient-shopify, httpclient-josef, httpclient-fixed and nobody really focuses on fixing the original httpclient one...

byroot commented 3 months ago

What about scoped gems? Would that work for your use case?

IMHO it's exactly the same feature. Whether the namespace is in the gem name itself or in an additional "scope" field doesn't change much for me. But if you think that solves some of the concerns with my proposal, then 👍.

byroot commented 3 months ago

NB: the idea of namespace came up when I first suggested this feature internally, but I didn't propose it because it seemed like a much more radical change to both Rubygems and Bundler, so much less likely to be accepted and much harder to implement.

simi commented 3 months ago

What about scoped gems? Would that work for your use case?

IMHO it's exactly the same feature. Whether the namespace is in the gem name itself or in an additional "scope" field doesn't change much for me. But if you think that solves some of the concerns with my proposal, then 👍.

In my thinking this is actually different: instead of the (originally suggested) approach of pretending "httpclient-shopify" is actually "httpclient", this works by saying there is an "httpclient", but in a different source, which actually reflects reality better. 🤔
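The distinction can be illustrated with a toy index keyed by scope and name. Nothing here reflects an existing RubyGems.org feature: `INDEX`, `lookup`, and the version strings are invented for the example.

```ruby
# Invented data: a gem index keyed by [scope, name]; nil is the global scope.
INDEX = {
  [nil, "httpclient"] => "1.0.0",
  ["Shopify", "httpclient"] => "1.0.1",
}.freeze

# Scoped lookup: the gem keeps its name, only the place it comes from changes.
def lookup(name, scope: nil)
  INDEX.fetch([scope, name]) do
    raise KeyError, "no #{name} in scope #{scope.inspect}"
  end
end

puts lookup("httpclient")                   # global release
puts lookup("httpclient", scope: "Shopify") # scoped release, same gem name
```

Because the name never changes, gems that depend on "httpclient" need no alias table; only the application's choice of scope differs.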

NB: the idea of namespace came up when I first suggested this feature internally, but I didn't propose it because it seemed like a much more radical change to both Rubygems and Bundler, so much less likely to be accepted and much harder to implement.

It has been under discussion for a long time, and it is starting to be a problem on various levels. And your use case doesn't seem to care much about backward incompatibilities. You're going to provide an alternative to a gem, not the only version, in your scope. That way users can use the global scope gem, and if your scoped version is needed, it is OK to ask them to update Bundler to get this specific feature. 🤔

ioquatix commented 3 months ago

I support this feature and it has precedent in other package managers, e.g. pacman.

byroot commented 3 months ago

this works by saying there is an "httpclient", but in a different source, which actually reflects reality better. 🤔

Oh I see. So namespaces would essentially be equivalent to hosting your own gem server, so it's not as different as I thought and kind of piggybacks on existing functionality.

It's indeed something we did a couple of times, re-publishing a modified gem in our private repository, and we stopped doing it because it was a bit confusing. But I suppose the scope: parameter you mention would make it more usable.

simi commented 3 months ago

I support this feature and it has precedent in other package managers, e.g. pacman.

@ioquatix Would you mind expanding on your use cases for this? pacman is a system package manager with different requirements (like usually only 1 version per package) and problems.

ioquatix commented 3 months ago

@ioquatix Would you mind expanding on your use cases for this? pacman is a system package manager with different requirements (like usually only 1 version per package) and problems.

@simi I believe you already linked to my original ticket: https://github.com/rubygems/rubygems/issues/1746 which I think explains my use cases and the connections with pacman. I think https://github.com/rubygems/rubygems/issues/1746#issuecomment-694336001 is a good summary of the two main options (not mutually exclusive either). In addition, the unicorn fork would also be made easier by the mechanism proposed here.

I'm really afraid of getting into a situation where there are a bunch of "random" gems like httpclient-shopify, httpclient-josef, httpclient-fixed and nobody really focuses on fixing the original httpclient one...

I don't think you need to be worried about forks becoming more prolific, as the overhead of creating, releasing and maintaining a fork is not changed by anything we do here. Given that you've talked about transitive dependencies (that may still be maintained), I think this proposal will lead to fewer forks, as we can reuse existing gems without having to fork them just to change the dependency resolution.

Also, it's unfair to try and place the burden of this problem on the end users/developers. Rejecting this feature on the basis that users could do more (fork, maintain, offer to be maintainers, etc) does not align with the body of evidence presented. Giving users autonomy to fix issues, rather than being at the mercy of the maintainers who may be absent, is empowering and good for the community.

I see two (non-exclusive) paths forward.

Application Centric (Gemfile / gems.rb)

In your own application, you want to solve a problem by using a forked gem. You would modify your Gemfile like so:

# In Gemfile / gems.rb

# We depend on `unicorn-maintained` gem, and declare that it can replace any dependency on `unicorn`:
gem 'unicorn-maintained', provides: 'unicorn'
gem 'unicorn-worker-killer' # this has a dependency on `unicorn` satisfied by `unicorn-maintained`.

# Generally:
# gem name, provides: alias
# gem name, provides: [alias1, alias2, alias3]

For the sake of dependency resolution, any gem that depends on unicorn of any version specifier, would be satisfied by unicorn-maintained.

Regarding naming, I suggest provides as this word is commonly used in package managers for this purpose.

Library Centric (gemspec.rb)

In your fork of a library, you want to make it easy for users to consume your gem as a replacement or alternative. You would modify your forked gemspec like so:

# unicorn-maintained.gemspec

Gem::Specification.new do |spec|
  spec.name = "unicorn-maintained"
  spec.version = "6.2.0"
  # ... snip ...

  # Proposed: declare that this gem can stand in for `unicorn`.
  spec.provides "unicorn" # , "6.0.0" (optional version)

  # Optional but nice to have for better error reporting:
  spec.conflicts "unicorn"
end

Anyone who depends on unicorn-maintained will automatically have any dependency on unicorn satisfied because of the spec.provides "unicorn" line.

The use of a conflicts line allows for better error reporting on dependency resolution failure.
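As a sketch of how a `provides` declaration with an optional version might be checked against a dependent's requirement: `Gem::Version` and `Gem::Requirement` are real RubyGems classes, while `ForkSpec`, its `provides` field, and `satisfies?` are hypothetical stand-ins for the proposed gemspec additions.

```ruby
require "rubygems"

# Hypothetical stand-in for a gemspec with the proposed `provides` entry.
# `provides` maps a provided gem name to the version it claims to offer,
# or to nil when it claims to satisfy any version.
ForkSpec = Struct.new(:name, :version, :provides)

# True if +spec+ can satisfy a dependency on +dep_name+ with +requirement+.
def satisfies?(spec, dep_name, requirement)
  req = Gem::Requirement.new(requirement)
  return req.satisfied_by?(Gem::Version.new(spec.version)) if spec.name == dep_name
  return false unless spec.provides.key?(dep_name)

  provided = spec.provides[dep_name]
  provided.nil? || req.satisfied_by?(Gem::Version.new(provided))
end

fork = ForkSpec.new("unicorn-maintained", "6.2.0", { "unicorn" => "6.0.0" })

p satisfies?(fork, "unicorn", "~> 6.0") # true
p satisfies?(fork, "unicorn", ">= 7.0") # false
p satisfies?(fork, "puma", ">= 0")      # false
```

Leaving the provided version as nil models the Gemfile-level behaviour described above, where a dependency on unicorn with any version specifier is considered satisfied.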

ioquatix commented 3 months ago

Let me put my money where my mouth is: If I made a PR for (1), is there a reasonable chance it would be acceptable?

indirect commented 3 months ago

Sure, I think we should try this out in a plugin or behind a feature flag that makes it clear it is an experiment. If the feedback from testers is that it is more helpful than it is confusing, we can consider what it would take to promote it to a default feature.