fluent-plugins-nursery / fluent-plugin-bigquery


Error streaming to GBQ #168

Open · aplsms opened this issue 5 years ago

aplsms commented 5 years ago

Hello,

I'm migrating from fluentd 0.12 to 1.0.

I get an error like: message="Error while reading data, error message: JSON table encountered too many errors, giving up. Rows: 1; errors: 1. Please look into the error stream for more details." reason="invalid"

How can I get at that error stream for the details?
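
For reference, the per-row errors of a failed load job should be retrievable from the job status with the bq CLI (a sketch, assuming the Cloud SDK is installed and authorized for the same project; job ID taken from the log below):

    # Print the full job resource; the per-row errors referenced by
    # "look into the error stream" appear in the "status.errors" array.
    bq --project_id=voltaic-phalanx-757 --format=prettyjson show -j job_0pb2mErCjSc6aXMxEXMiIaWkE1gF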

Error message in fluentd logs:

2018-10-03 05:11:16 +0000 [error]: #0 job.insert API (rows) job_id="job_0pb2mErCjSc6aXMxEXMiIaWkE1gF" project_id="voltaic-phalanx-757" dataset="prod_logs" table="analyst_server_20181003" message="Error while reading data, error message: JSON table encountered too many errors, giving up. Rows: 1; errors: 1. Please look into the error stream for more details." reason="invalid"
2018-10-03 05:11:16 +0000 [error]: #0 job.insert API (rows) job_id="job_0pb2mErCjSc6aXMxEXMiIaWkE1gF" project_id="voltaic-phalanx-757" dataset="prod_logs" table="analyst_server_20181003" message="Error while reading data, error message: JSON parsing error in row starting at position 0: No such field: stack_trace." reason="invalid"
2018-10-03 05:11:16 +0000 [error]: #0 job.insert API (result) job_id="job_0pb2mErCjSc6aXMxEXMiIaWkE1gF" project_id="voltaic-phalanx-757" dataset="prod_logs" table="analyst_server_20181003" message="Error while reading data, error message: JSON table encountered too many errors, giving up. Rows: 1; errors: 1. Please look into the error stream for more details." reason="invalid"
2018-10-03 05:11:16 +0000 [error]: #0 failed to flush the buffer, and hit limit for retries. dropping all chunks in the buffer queue. retry_times=0 records=10 error_class=Fluent::BigQuery::UnRetryableError error="failed to load into bigquery, and cannot retry"
  2018-10-03 05:11:16 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-bigquery-1.2.0/lib/fluent/plugin/bigquery/writer.rb:223:in `wait_load_job'
  2018-10-03 05:11:16 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-bigquery-1.2.0/lib/fluent/plugin/bigquery/writer.rb:170:in `create_load_job'
  2018-10-03 05:11:16 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-bigquery-1.2.0/lib/fluent/plugin/out_bigquery.rb:453:in `block in load'
  2018-10-03 05:11:16 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-bigquery-1.2.0/lib/fluent/plugin/out_bigquery.rb:485:in `block in create_upload_source'
  2018-10-03 05:11:16 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-bigquery-1.2.0/lib/fluent/plugin/out_bigquery.rb:484:in `open'
  2018-10-03 05:11:16 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-bigquery-1.2.0/lib/fluent/plugin/out_bigquery.rb:484:in `create_upload_source'
  2018-10-03 05:11:16 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-bigquery-1.2.0/lib/fluent/plugin/out_bigquery.rb:452:in `load'
  2018-10-03 05:11:16 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-bigquery-1.2.0/lib/fluent/plugin/out_bigquery.rb:446:in `_write'
  2018-10-03 05:11:16 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-bigquery-1.2.0/lib/fluent/plugin/out_bigquery.rb:340:in `write'
  2018-10-03 05:11:16 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.2.2/lib/fluent/plugin/output.rb:1099:in `try_flush'
  2018-10-03 05:11:16 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.2.2/lib/fluent/plugin/output.rb:1378:in `flush_thread_run'
  2018-10-03 05:11:16 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.2.2/lib/fluent/plugin/output.rb:440:in `block (2 levels) in start'
  2018-10-03 05:11:16 +0000 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.2.2/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'

Environments

Configuration

   <store>
     @type bigquery
     method load

     <buffer time>
       @type file
       path /fluentd/buffers/GBQ/bigquery.*.buffer
       timekey           24h
       timekey_wait      10m
       flush_interval 30m
       flush_at_shutdown true
       timekey_use_utc
     </buffer>
     <inject>
       time_key time
       time_type string
       # time_format %Y-%m-%d %H:%M:%S
       time_format %s
     </inject>

     auth_method json_key
     json_key json_key_path.json

     auth_method private_key
     email email@developer.gserviceaccount.com
     private_key_path /fluentd/config/prod.ta-google.p12

     project voltaic-phalanx-757
     dataset prod_logs
     table analyst_server_%Y%m%d
     auto_create_table true
     schema [
       {"name": "hostname", "type": "STRING"},
       {"name": "decisionId", "type": "STRING"},
       {"name": "debtorId", "type": "STRING"},
       {"name": "message", "type": "STRING"},
       {"name": "time", "type": "TIMESTAMP"}
     ]
   </store>
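
A workaround worth testing (an untested sketch; it assumes this plugin version's ignore_unknown_values option is honored for load jobs, not only for streaming inserts) is to tell BigQuery to drop record fields that are missing from the schema, by adding inside the <store> block:

     # Untested: skip record keys (e.g. stack_trace) that have no
     # matching column, instead of failing the whole load job.
     ignore_unknown_values true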

Old config file that was working well:

  <store>
    type bigquery
    project voltaic-phalanx-757
    dataset prod_logs
    table analyst_server_%Y%m%d
    auto_create_table true

    time_format %s
    time_field  time
    field_timestamp time
    field_string hostname,decisionId,debtorId,message

    auth_method private_key
    email email@developer.gserviceaccount.com
    private_key_path /ebs/config/prod.ta-google.p12
  </store>

Expected Behavior

JSON data appears in the BigQuery table.

Actual Behavior

Error message="Error while reading data, error message: JSON table encountered too many errors, giving up. Rows: 1; errors: 1. Please look into the error stream for more details." reason="invalid"

Log (if you have)

(Same log as quoted at the top of the issue.)
aplsms commented 5 years ago

Just a note: the original log records contain many more fields than the schema, but the table ends up with exactly the fields listed in the schema.
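
That is presumably also why the old 0.12 config worked: field_string/field_timestamp acted as a whitelist, while the v1 plugin ships the whole record to the load job, so BigQuery rejects unknown keys such as stack_trace. One way to restore the whitelisting is to trim records to the schema fields before the output, e.g. with fluentd's built-in record_transformer filter (a sketch; analyst.server stands in for the real tag):

   <filter analyst.server>
     @type record_transformer
     # Rebuild each record keeping only the schema fields, so extra
     # keys such as stack_trace never reach BigQuery.
     renew_record true
     keep_keys hostname,decisionId,debtorId,message,time
   </filter>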