snowplow / snowplow-rdb-loader

Stores Snowplow enriched events in Redshift, Snowflake and Databricks

RDB Loader: mention the config file in configuration error message #119

Closed steve-thousand closed 3 years ago

steve-thousand commented 5 years ago

I am currently running RDB Loader 0.14.0 because when I switch to 0.15.0 it fails during the step "Elasticity Custom Jar Step: Load Redshift Configuration Storage Target". The only information in the stdout log for that step is this:

Configuration error
Attempt to decode value on failed cursor: DownField(id)

As you can imagine, this is very unhelpful. I know that my job and configuration run well if I roll back to 0.14.0, but at this point I really have no idea where to start looking for the issue with 0.15.0. Any help would be appreciated. Here are my storage versions:

storage:
  versions:
    rdb_shredder: 0.13.0
    rdb_loader: 0.14.0
    hadoop_elasticsearch: 0.1.0

chuwy commented 5 years ago

Hi @steve-thousand,

This error means that RDB Loader cannot find the id field in the storage target config, which has been required since R30. Please follow the upgrade guide from the release notes to make sure your config file is compatible with the recent release.

But generally I agree, we can specify the exact file where the error was encountered.
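
To illustrate the kind of message this issue asks for, here is a minimal sketch of a pre-flight check that names the offending file. It is not RDB Loader's actual code (the loader is a Scala application); the function name `check_required_fields` and the assumption that the target's fields live under a top-level `data` key are hypothetical and used only for illustration.

```python
import json


def check_required_fields(filename, text, required=("id",)):
    """Return error messages that mention the config file by name.

    Hypothetical helper: assumes the storage target is a JSON document
    whose fields sit under a top-level "data" key (falls back to the
    top level if "data" is absent).
    """
    try:
        config = json.loads(text)
    except json.JSONDecodeError as err:
        return [f"Configuration error in [{filename}]: invalid JSON ({err.msg})"]

    if not isinstance(config, dict):
        return [f"Configuration error in [{filename}]: expected a JSON object"]

    data = config.get("data", config)
    if not isinstance(data, dict):
        return [f"Configuration error in [{filename}]: 'data' must be a JSON object"]

    # One message per missing field, each naming the file it came from.
    return [
        f"Configuration error in [{filename}]: missing required field '{field}'"
        for field in required
        if field not in data
    ]


if __name__ == "__main__":
    for message in check_required_fields("redshift.json", '{"data": {"name": "x"}}'):
        print(message)
```

With a message like this, a missing `id` would point straight at the storage target file rather than at a bare cursor failure.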

steve-thousand commented 5 years ago

Apologies for the confusion. I will retry 0.15.0 after following the upgrade plan.

I believe I have found a typo in the upgrade plan, however. I believe where it mentions "sslTunnel" it should instead say "sshTunnel". When I attempted to run with "sslTunnel" in configuration, I received this error in output:

Error in [redshift.json] The property '#/' contains additional properties ["sslTunnel"] outside of the schema when none are allowed
UncaughtThrowError: uncaught throw #<JSON::Schema::ValidationError: The property '#/' contains additional properties ["sslTunnel"] outside of the schema when none are allowed in schema 48c5d58a-fb8e-5af2-94f5-949b21b2b628>
                      throw at org/jruby/RubyKernel.java:1203
  block in validate_targets at uri:classloader:/emr-etl-runner/lib/snowplow-emr-etl-runner/runner.rb:202
                        map at org/jruby/RubyArray.java:2557
           validate_targets at uri:classloader:/emr-etl-runner/lib/snowplow-emr-etl-runner/runner.rb:193
                    send_to at uri:classloader:/gems/contracts-0.11.0/lib/contracts/method_reference.rb:43
                  call_with at uri:classloader:/gems/contracts-0.11.0/lib/contracts/call_with.rb:76
   block in redefine_method at uri:classloader:/gems/contracts-0.11.0/lib/contracts/method_handler.rb:138
                 initialize at uri:classloader:/emr-etl-runner/lib/snowplow-emr-etl-runner/runner.rb:99
                    send_to at uri:classloader:/gems/contracts-0.11.0/lib/contracts/method_reference.rb:43
                  call_with at uri:classloader:/gems/contracts-0.11.0/lib/contracts/call_with.rb:76
   block in redefine_method at uri:classloader:/gems/contracts-0.11.0/lib/contracts/method_handler.rb:138
                     <main> at uri:classloader:/emr-etl-runner/bin/snowplow-emr-etl-runner:40
                       load at org/jruby/RubyKernel.java:994
                     <main> at uri:classloader:/META-INF/main.rb:1
                    require at org/jruby/RubyKernel.java:970
                     (root) at uri:classloader:/META-INF/main.rb:1
                     <main> at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:1
ERROR: org.jruby.embed.EvalFailedException: (UncaughtThrowError) uncaught throw #<JSON::Schema::ValidationError: The property '#/' contains additional properties ["sslTunnel"] outside of the schema when none are allowed in schema 48c5d58a-fb8e-5af2-94f5-949b21b2b628>

I noticed the schema specified that "sshTunnel" was required, so I altered my config to mention that and it did not fail on startup.
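
A near-miss property name such as "sslTunnel" is easy to catch locally before submitting a job. The sketch below is a hypothetical helper, not part of EmrEtlRunner or RDB Loader; the `allowed` list is an illustrative subset and would in practice be taken from the schema's declared properties. It uses Python's standard `difflib` to suggest the closest allowed name.

```python
import difflib


def explain_unknown_properties(filename, data, allowed):
    """Report properties not in `allowed`, with a closest-match hint.

    Hypothetical helper: `allowed` is assumed to be the list of property
    names the schema permits.
    """
    messages = []
    for key in data:
        if key in allowed:
            continue
        # get_close_matches returns at most n candidates above its
        # default similarity cutoff, best match first.
        close = difflib.get_close_matches(key, allowed, n=1)
        hint = f" (did you mean '{close[0]}'?)" if close else ""
        messages.append(f"Error in [{filename}]: unknown property '{key}'{hint}")
    return messages


if __name__ == "__main__":
    allowed = ["id", "name", "sshTunnel"]  # illustrative subset only
    target = {"id": "abc", "name": "redshift", "sslTunnel": None}
    for message in explain_unknown_properties("redshift.json", target, allowed):
        print(message)
```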

chuwy commented 5 years ago

Indeed, you're right: https://github.com/snowplow/iglu-central/blob/master/schemas/com.snowplowanalytics.snowplow.storage/redshift_config/jsonschema/3-0-0#L72

Will fix that in the blog post.

dilyand commented 5 years ago

The storage target docs have links to outdated schemas and do not make it clear that id is now a required field. I have opened an issue: https://github.com/snowplow/snowplow-rdb-loader/issues/128

chuwy commented 3 years ago

Wontfix as there's just one config file now.