Closed ioquatix closed 1 year ago
I don't think we should merge this, it doesn't solve anything and it makes a very dangerous precedent.
The API was clearly marked as internal, there is no surprise, whoever hardcoded that in their code should know it was just a time bomb, or worst case they learn it now. That's fine and that's how we learn to be better programmers and establish a sane relationship with our dependencies.
So, I want to make it very clear, that I'm neither in favour nor against this PR. It's one option and I think as maintainers we'd have to all be on board with merging it. I want to make sure that this discussion is respectful for all involved and if someone decides not to merge this PR, it's a fair choice to be made and a valid outcome.
With that in mind, I think this PR is a pragmatic option but I would also argue not the "correct" option.
If we did decide to accept this PR, I think we'd like to do so without setting an expectation that similar issues would be handled in the same way in the future - this is a pragmatic choice given the magnitude of the problem.
Regarding puppet, doesn't it pin to a specific concurrent-ruby release? That should avoid any immediate breakage. Is concurrent-ruby automatically updated for puppet users? That seems unexpected if it's the case. If it's only breaking puppet's CI, I feel there is no rush, it's trivial to fix it there.
And then it's a question of should puppet do a release to fix this, or should concurrent-ruby do multiple releases to workaround? My view is clear, I don't have time this week (nor next week) for another concurrent-ruby release, it's quite some work with the windows builds, edge, etc (and then of course all the reasons above).
My understanding is a new puppet release would fix this. That sounds like the obvious solution with no drawbacks. I don't see why we would do anything more complicated. If people use a pinned/old version of puppet they should also pin the dependencies (e.g. via Gemfile.lock).
Those are all fair points. Maybe @joshcooper can give us feedback on the scale of the impact for users.
Puppet had a dependency on concurrent-ruby ~> 1.0
which means as of yesterday the 1.2 release gets pulled in by any gem install
or bundle install
operation that solves dependencies.
New releases are in the works that change the dependency to concurrent-ruby ~>1.0, <1.2
. However, the places where bundle install
is being used are commonly CI pipelines that test Infrastrcture-as-a-Code changes for environments that use Puppet to manage OS and application configuration.
There are thousands of such pipelines out in the wild and they have a long tail of Puppet versions in use. So, a new release of Puppet will help, but won't clear the error for environments that are pinned to older Puppet versions for $reasons
. Those folks would have to take up the newest Puppet releases or update the Gemfile
s used by failing CI pipelines to add something like gem 'concurrent-ruby', '~> 1.0', '< 1.2.0'
(a reasonable solution for a handful of pipelines, a daunting task for hundreds).
Once a new release of puppet is released, with that version, gem install puppet
will pull in the correct version of concurrent-ruby
, even for existing pipelines right?
will pull in the correct version of concurrent-ruby, even for existing pipelines right?
For existing pipelines that are not pinned to an older version, yes. There is a long, heavy, tail of pipelines out in the wild that are pinned to older versions.
So I see 4 cases:
gem install puppet
) or latest Puppet 6 ( gem install puppet -v '~> 6'
). This is fixed in Puppet 7.22.0 and 6.29.0, so already no longer a problem for that case.gem install puppet -v 7.21.0
). Those users will regularly get issues, I believe this is neither the first nor the last time that picking the latest gem according to gemspec would break. It's the typical trade-off: either you use latest and get latest bug fixes, or you pin everything (that includes Ruby, OS, OS version, arch BTW). Anything in between just breaks with enough time passing. As indication this is no the first time I found e.g. https://github.com/asottile/personal-puppet/blob/78e57218989596ef2704512c339320df28456315/run-puppet#L20-L23.It seems merging this PR would only help the 4th case, but those did anyway already get such issues and will get more in the future. The real solution for them is to migrate to the other cases (use latest or pin everything or use system package manager that pins for you).
As Puppet's own docs say, the official way to install puppet (as far as I can see) is to use a system package manager: https://www.puppet.com/docs/puppet/7/install_puppet.html#install_puppet. There is no gem install puppet -v x.y.z
in there (as far as I can see).
The other 2 documented ways are also system package and from source (works fine due to updated gemspec).
So in my understanding this PR would only help Puppet users which don't use an officially-supported way to install Puppet (as far as I can see), and don't know yet how to run software reliably. I see no reason to merge this PR at this point, it would only solve the 4th category, and that approach will break many times in the future and in the past too due to various incompatibilities between gem versions which were never tested together.
Also I have 2 main concerns:
ThreadLocalVar
has the same behavior as RubyThreadLocalVar
for all Rubies so it should work well.As you can see from this long reply and my time investment in this, I try to carefully understand the situation and I sympathize that things broke and that's never fun. I explained above the 3 solutions so this doesn't happen again for that 4th category of puppet users. I believe it is necessary for healthy maintenance of concurrent-ruby that we do not merge this, and also it would only be a small band-aid for cases which would anyway break again due to some other issue of unpinned dependencies in a few months.
Hi @eregon thanks for your detailed reply. It's true that this issue doesn't affect users that installed puppet via system package manager (case 3). But it does affect users that test their puppet modules against different versions of puppet (case 2 and 4), which is very common when you have 10k+ agents installed and want to ensure they can be upgraded seamlessly over time.
It's fine, nothing broke in production, just need to decide whether to pick newer puppet or older concurrent-ruby.
To @Sharpie point, it's ok if you have a handful of pipelines to update. But if you have thousands, then it's not that simple.
Also I have 2 main concerns ... fine as a quick fix, but doesn't fix the reliance on the internals
Yes, indeed, I filed https://tickets.puppetlabs.com/browse/PUP-11723 so that we're not permanently stuck on < 1.2.0. If I can dig up what the original issue was that caused us to use RubyThreadLocalVar
on JRuby, then I'll forward it.
I try to carefully understand the situation and I sympathize that things broke and that's never fun
Thanks, I appreciate this as a fellow open source maintainer.
Fixes https://github.com/ruby-concurrency/concurrent-ruby/issues/986.
@joshcooper can you please check this will fix your issue.