snowplow / emr-etl-runner

Run Snowplow's enrichments on Amazon Elastic MapReduce with minimum fuss

EmrEtlRunner: Catch and retry EMR connection issues #70

Closed istreeter closed 4 years ago

istreeter commented 4 years ago

It is common for HTTP requests between EmrEtlRunner and EMR to fail because of connection issues. We catch and handle some of these errors, but others go uncaught, depending on where in the code they occur.

This issue is to address the following 4 connection errors that we have observed:

connection issue 1

OpenSSL::SSL::SSLError: Socket closed
          connect_nonblock at org/jruby/ext/openssl/SSLSocket.java:255
        ssl_socket_connect at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/net/protocol.rb:44
                   connect at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/net/http.rb:985
                  do_start at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/net/http.rb:924
                     start at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/net/http.rb:913
                  transmit at uri:classloader:/gems/rest-client-1.8.0/lib/restclient/request.rb:413
                   execute at uri:classloader:/gems/rest-client-1.8.0/lib/restclient/request.rb:176
                   execute at uri:classloader:/gems/rest-client-1.8.0/lib/restclient/request.rb:41
                    submit at uri:classloader:/gems/elasticity-6.0.14/lib/elasticity/aws_session.rb:34
         add_jobflow_steps at uri:classloader:/gems/elasticity-6.0.14/lib/elasticity/emr.rb:63
                  add_step at uri:classloader:/gems/elasticity-6.0.14/lib/elasticity/job_flow.rb:166
              block in run at uri:classloader:/emr-etl-runner/lib/snowplow-emr-etl-runner/emr_job.rb:767
                      each at org/jruby/RubyArray.java:1801
                       run at uri:classloader:/emr-etl-runner/lib/snowplow-emr-etl-runner/emr_job.rb:763
                       run at uri:classloader:/emr-etl-runner/lib/snowplow-emr-etl-runner/runner.rb:138
                    <main> at uri:classloader:/emr-etl-runner/bin/snowplow-emr-etl-runner:41
                      load at org/jruby/RubyKernel.java:994
                    <main> at uri:classloader:/META-INF/main.rb:1
                   require at org/jruby/RubyKernel.java:970
                    (root) at uri:classloader:/META-INF/main.rb:1
                    <main> at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:1
ERROR: org.jruby.embed.EvalFailedException: (SSLError) Socket closed

connection issue 2

OpenSSL::SSL::SSLError: Connection reset by peer
          connect_nonblock at org/jruby/ext/openssl/SSLSocket.java:255
        ssl_socket_connect at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/net/protocol.rb:44
                   connect at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/net/http.rb:985
                  do_start at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/net/http.rb:924
                     start at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/net/http.rb:913
                  transmit at uri:classloader:/gems/rest-client-1.8.0/lib/restclient/request.rb:413
                   execute at uri:classloader:/gems/rest-client-1.8.0/lib/restclient/request.rb:176
                   execute at uri:classloader:/gems/rest-client-1.8.0/lib/restclient/request.rb:41
                    submit at uri:classloader:/gems/elasticity-6.0.14/lib/elasticity/aws_session.rb:34
          describe_cluster at uri:classloader:/gems/elasticity-6.0.14/lib/elasticity/emr.rb:93
            cluster_status at uri:classloader:/gems/elasticity-6.0.14/lib/elasticity/job_flow.rb:190
            cluster_status at uri:classloader:/emr-etl-runner/lib/snowplow-emr-etl-runner/emr_job.rb:1204
                       run at uri:classloader:/emr-etl-runner/lib/snowplow-emr-etl-runner/emr_job.rb:804
                       run at uri:classloader:/emr-etl-runner/lib/snowplow-emr-etl-runner/runner.rb:138
                    <main> at uri:classloader:/emr-etl-runner/bin/snowplow-emr-etl-runner:41
                      load at org/jruby/RubyKernel.java:994
                    <main> at uri:classloader:/META-INF/main.rb:1
                   require at org/jruby/RubyKernel.java:970
                    (root) at uri:classloader:/META-INF/main.rb:1
                    <main> at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:1
ERROR: org.jruby.embed.EvalFailedException: (SSLError) Connection reset by peer

connection issue 3

Errno::ECONNRESET: Connection reset by peer - Failed to open TCP connection to elasticmapreduce.us-east-1.amazonaws.com:443 (Connection reset by peer - Connection reset by peer)
                   initialize at org/jruby/ext/socket/RubyTCPSocket.java:148
                         open at org/jruby/RubyIO.java:1177
             block in connect at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/net/http.rb:941
                      timeout at org/jruby/ext/timeout/Timeout.java:149
                      timeout at org/jruby/ext/timeout/Timeout.java:122
                      connect at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/net/http.rb:939
                     do_start at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/net/http.rb:924
                        start at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/net/http.rb:913
                     transmit at uri:classloader:/gems/rest-client-1.8.0/lib/restclient/request.rb:413
                      execute at uri:classloader:/gems/rest-client-1.8.0/lib/restclient/request.rb:176
                      execute at uri:classloader:/gems/rest-client-1.8.0/lib/restclient/request.rb:41
                       submit at uri:classloader:/gems/elasticity-6.0.14/lib/elasticity/aws_session.rb:34
                   list_steps at uri:classloader:/gems/elasticity-6.0.14/lib/elasticity/emr.rb:206
          cluster_step_status at uri:classloader:/gems/elasticity-6.0.14/lib/elasticity/job_flow.rb:197
  cluster_step_status_for_run at uri:classloader:/emr-etl-runner/lib/snowplow-emr-etl-runner/emr_job.rb:1166
                          run at uri:classloader:/emr-etl-runner/lib/snowplow-emr-etl-runner/emr_job.rb:763
                          run at uri:classloader:/emr-etl-runner/lib/snowplow-emr-etl-runner/runner.rb:138
                       <main> at uri:classloader:/emr-etl-runner/bin/snowplow-emr-etl-runner:41
                         load at org/jruby/RubyKernel.java:994
                       <main> at uri:classloader:/META-INF/main.rb:1
                      require at org/jruby/RubyKernel.java:970
                       (root) at uri:classloader:/META-INF/main.rb:1
                       <main> at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:1
ERROR: org.jruby.embed.EvalFailedException: (ECONNRESET) Connection reset by peer - Failed to open TCP connection to elasticmapreduce.us-east-1.amazonaws.com:443 (Connection reset by peer - Connection reset by peer)

connection issue 4

Errno::ECONNRESET: Connection reset by peer - Failed to open TCP connection to elasticmapreduce.us-east-1.amazonaws.com:443 (Connection reset by peer - Connection reset by peer)
                initialize at org/jruby/ext/socket/RubyTCPSocket.java:148
                      open at org/jruby/RubyIO.java:1177
          block in connect at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/net/http.rb:941
                   timeout at org/jruby/ext/timeout/Timeout.java:149
                   timeout at org/jruby/ext/timeout/Timeout.java:122
                   connect at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/net/http.rb:939
                  do_start at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/net/http.rb:924
                     start at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/net/http.rb:913
                  transmit at uri:classloader:/gems/rest-client-1.8.0/lib/restclient/request.rb:413
                   execute at uri:classloader:/gems/rest-client-1.8.0/lib/restclient/request.rb:176
                   execute at uri:classloader:/gems/rest-client-1.8.0/lib/restclient/request.rb:41
                    submit at uri:classloader:/gems/elasticity-6.0.14/lib/elasticity/aws_session.rb:34
              run_job_flow at uri:classloader:/gems/elasticity-6.0.14/lib/elasticity/emr.rb:302
                       run at uri:classloader:/gems/elasticity-6.0.14/lib/elasticity/job_flow.rb:176
                       run at uri:classloader:/emr-etl-runner/lib/snowplow-emr-etl-runner/emr_job.rb:752
                       run at uri:classloader:/emr-etl-runner/lib/snowplow-emr-etl-runner/runner.rb:138
                    <main> at uri:classloader:/emr-etl-runner/bin/snowplow-emr-etl-runner:41
                      load at org/jruby/RubyKernel.java:994
                    <main> at uri:classloader:/META-INF/main.rb:1
                   require at org/jruby/RubyKernel.java:970
                    (root) at uri:classloader:/META-INF/main.rb:1
                    <main> at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:1
ERROR: org.jruby.embed.EvalFailedException: (ECONNRESET) Connection reset by peer - Failed to open TCP connection to elasticmapreduce.us-east-1.amazonaws.com:443 (Connection reset by peer - Connection reset by peer)
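
For illustration, here is a minimal sketch of how the two exception classes seen in the traces above (`OpenSSL::SSL::SSLError` and `Errno::ECONNRESET`) could be caught and retried with exponential backoff. The helper name `with_connection_retries`, its parameters, and the `RETRYABLE_ERRORS` constant are hypothetical and are not part of EmrEtlRunner or Elasticity; the actual fix may look quite different.

```ruby
require 'openssl'

# Hypothetical list of connection errors worth retrying, taken from the
# four stack traces in this issue. Not an EmrEtlRunner constant.
RETRYABLE_ERRORS = [OpenSSL::SSL::SSLError, Errno::ECONNRESET].freeze

# Hypothetical helper: runs the block and retries it when one of
# RETRYABLE_ERRORS is raised, sleeping base_delay * 2^(attempt - 1)
# seconds between attempts. Re-raises once max_attempts is reached.
# The sleeper argument exists only so the delay can be stubbed out.
def with_connection_retries(max_attempts: 3, base_delay: 1.0,
                            sleeper: Kernel.method(:sleep))
  attempts = 0
  begin
    attempts += 1
    yield
  rescue *RETRYABLE_ERRORS
    raise if attempts >= max_attempts
    sleeper.call(base_delay * (2**(attempts - 1)))
    retry
  end
end

# Assumed usage, wrapping the Elasticity calls that appear in the traces:
#   with_connection_retries { jobflow.add_step(step) }
#   with_connection_retries { jobflow.cluster_status }
```

Errors outside the list propagate immediately, so unrelated failures are not masked. Whether it is safe to retry a call that creates resources, such as the `run` call in connection issue 4, would need to be checked case by case.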
istreeter commented 4 years ago

This is a duplicate of #4, which has already been resolved.