DeepLcom / deepl-rb

Official Ruby library for the DeepL language translation API.
https://www.deepl.com
MIT License
8 stars 2 forks source link

FIX: default `with_session` client options #3

Open rreckonerr opened 1 month ago

rreckonerr commented 1 month ago

The default HTTPClientOptions.new() parameter in with_session method raises an error. According to the README it should be possible to call with_session without passing any arguments.

rreckonerr commented 1 month ago

@JanEbbing Thanks for quickly checking it out!

I have a question, maybe you have an idea of what's going on. I used Concurrent::Promises before to make translation requests in parallel, but upgrading from 2.5.3 to 3.0.1 caused weird errors (like FrozenError can't modify frozen OpenSSL::SSL::SSLContext), so I refactored away to use the with_session. Do you think that it might indicate some thread-safety issues and whether there is a way to make the requests in parallel? I didn't manage to reproduce the issue in tests, unfortunately.

Below you can see the previous Concurrent::Promises usage

@@ -19,8 +19,8 @@ def translate_single(text, source_lang, target_lang, options = DEFAULT_OPTIONS)
   end

   def translate_multiple(text, source_lang, target_langs, options = DEFAULT_OPTIONS)
-    promises = target_langs.map do |target_lang|
-      Concurrent::Promises.future do
+    with_session(DeepL::HTTPClientOptions.new({}, nil)) do |_session|
+      target_langs.to_h do |target_lang|
         translation = translate(text, local_to_external(source_lang), local_to_external(target_lang), options.to_h.deep_dup)
         [external_to_local(target_lang), translation.text]
       rescue DeepL::Exceptions::Error => e
@@ -28,10 +28,6 @@ def translate_multiple(text, source_lang, target_langs, options = DEFAULT_OPTION
         [external_to_local(target_lang), nil]
       end
     end
-
-    results = Concurrent::Promises.zip(*promises).value!
-
-    results.to_h
   end
JanEbbing commented 1 month ago

I'll look into this tomorrow - I did slightly change how the requests are made internally, but I dont think it should cause issues for this use case.

JanEbbing commented 1 month ago

Hi, I cannot reproduce any issues with the following code:

require 'concurrent-ruby'
require 'deepl'

def translate_multiple(text, source_lang, target_langs)
  promises = target_langs.map do |target_lang|
    DeepL.with_session(DeepL::HTTPClientOptions.new({}, nil)) do |_session|
      Concurrent::Promises.future do
        DeepL.translate(text, source_lang, target_lang)
        rescue DeepL::Exceptions::Error => e
          puts e.message
      end
    end
  end
  return Concurrent::Promises.zip(*promises).value!
end

sample_text = 'I am an example sentence in english.' * 1000
target_langs = ['DE', 'ES', 'PT-BR', 'IT', 'ZH-HANS', 'KO']
res = translate_multiple(sample_text, 'EN', target_langs)
puts res

But I need to read a bit more to exclude any thread safety issues, as the with_session will use a Net::HTTP session now - to fix this, we might need to add a different HTTP client as a dependency.

rreckonerr commented 1 month ago

This looks related https://github.com/getsentry/sentry-ruby/pull/1696

For me the error pops up in production reliably, maybe it's related to yjit, which is enabled only in prod in my case, e.g.

RUN apt-get -qq install -y --no-install-recommends libjemalloc2

ENV LD_PRELOAD="libjemalloc.so.2"
ENV MALLOC_CONF="dirty_decay_ms:1000,narenas:2,background_thread:true"
ENV RUBY_YJIT_ENABLE="1"
lasseebert commented 3 weeks ago

I have the same FrozenError when running concurrent threads (with GoodJob in my case) that translate via DeepL. It seems to be because @http_client is reused between requests. Using with_session in each thread fixes it.