ruby-i18n / i18n

Internationalization (i18n) library for Ruby
MIT License
977 stars 408 forks source link

[BUG] i18n translation loading is not thread-safe #643

Closed mensfeld closed 1 year ago

mensfeld commented 1 year ago

What I tried to do

Try to load several languages in different threads:

# irb -r ./lib/i18n
100.times.map do
  Thread.new do
    I18n.backend.store_translations(rand.to_s, :foo => { :bar => 'bar', :baz => 'baz' })
  end
end.each(&:join)

Check how many languages are loaded:

I18n.available_locales.count #=> 1

Consecutive executions give correct result:

# run after initial count 1
100.times { Thread.new { I18n.backend.store_translations(rand.to_s, :foo => { :bar => 'bar', :baz => 'baz' }) } }
I18n.available_locales.count #=> 101

What I expected to happen

I expected all the languages to be loaded and not one.

What actually happened

This: https://github.com/ruby-i18n/i18n/blob/master/lib/i18n/backend/simple.rb#L71

@translations ||= Concurrent::Hash.new { |h, k| h[k] = Concurrent::Hash.new }

is not thread-safe. The hash initialization for a given language may not be set correctly because it will be evaluated several times.

Versions of i18n, rails, and anything else you think is necessary

i18n taken from master.

Other info

Happy to fix this with a PR. This may seem like a weird edge case though the side effects may also be weird: some translations may not be loaded as expected while others may work and it may be close to impossible to figure out the reason.

nijikon commented 1 year ago

Refs https://github.com/ruby-concurrency/concurrent-ruby/issues/970

radar commented 1 year ago

Yeah, that definitely seems like a bug. Could you please submit a PR to fix this up?

mensfeld commented 1 year ago

@radar there are two ways to fix this and want to make sure I pick one that is more suitable:

  1. Replace Concurrent::Hash with Concurrent::Map including checks that their API coverage is the same for i18n.
  2. Wrap the initialization operation with a Mutex.

Any of this will work. I would prefer 2 because it brings less risk to the table and mutex will apply only for initial loading, not for the cached state.