reproio / columnify

Make record oriented data to columnar format.
Apache License 2.0

Need help to resolve the error :- got unrecoverable error in primary and no secondary error_class=Fluent::UnrecoverableError error="failed to execute columnify command. stdout= stderr=Failed to close columnifier: interface conversion: interface {} is nil, not string2024/07/09 22:25:41 Failed to write: interface conversion: interface {} is nil, not string\n status=#<Process::Status: pid 1191 exit 1>" #94

Open wasifshareef opened 1 month ago

wasifshareef commented 1 month ago

Hi, I am trying to use columnify to generate Parquet output and send the data to Azure using the Azure plugin. However, when I run the fluentd container I get the error below. I would appreciate it if someone could help me here.

2024-07-09 22:25:41 +0000 [warn]: #0 got unrecoverable error in primary and no secondary error_class=Fluent::UnrecoverableError error="failed to execute columnify command. stdout= stderr=Failed to close columnifier: interface conversion: interface {} is nil, not string
2024/07/09 22:25:41 Failed to write: interface conversion: interface {} is nil, not string\n status=#<Process::Status: pid 1191 exit 1>"
  2024-07-09 22:25:41 +0000 [warn]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluent-plugin-azurestorage-gen2-0.3.5/lib/fluent/plugin/out_azurestorage_gen2.rb:834:in `compress'
  2024-07-09 22:25:41 +0000 [warn]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluent-plugin-azurestorage-gen2-0.3.5/lib/fluent/plugin/out_azurestorage_gen2.rb:165:in `write'
  2024-07-09 22:25:41 +0000 [warn]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.3/lib/fluent/plugin/output.rb:1225:in `try_flush'
  2024-07-09 22:25:41 +0000 [warn]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.3/lib/fluent/plugin/output.rb:1538:in `flush_thread_run'
  2024-07-09 22:25:41 +0000 [warn]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.3/lib/fluent/plugin/output.rb:510:in `block (2 levels) in start'
  2024-07-09 22:25:41 +0000 [warn]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.3/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
wasifshareef commented 1 month ago

Here are the schema files I am trying to use:

cat schemazeek.avsc
{
  "type": "record",
  "name": "LogRecord",
  "fields": [
    {"name": "_node_name", "type": "string"},
    {"name": "ts", "type": "float"},
    {"name": "uid", "type": "string"},
    {"name": "id.orig_h", "type": "string"},
    {"name": "id.orig_p", "type": "int"},
    {"name": "id.resp_h", "type": "string"},
    {"name": "id.resp_p", "type": "int"},
    {"name": "id.vlan", "type": "float"},
    {"name": "proto", "type": "string"},
    {"name": "duration", "type": "float"},
    {"name": "orig_bytes", "type": "float"},
    {"name": "resp_bytes", "type": "float"},
    {"name": "conn_state", "type": "string"},
    {"name": "local_orig", "type": "boolean"},
    {"name": "local_resp", "type": "boolean"},
    {"name": "missed_bytes", "type": "int"},
    {"name": "orig_pkts", "type": "int"},
    {"name": "orig_ip_bytes", "type": "int"},
    {"name": "resp_pkts", "type": "int"},
    {"name": "resp_ip_bytes", "type": "int"},
    {"name": "tailed_path", "type": "string"},
    {"name": "beat.hostname", "type": "string"},
    {"name": "tag", "type": "string"},
    {"name": "clientkey", "type": "string"},
    {"name": "consumer_key", "type": "string"},
    {"name": "service", "type": "string"},
    {"name": "history", "type": "string"},
    {"name": "tunnel_parents", "type": "string"}
  ]
}

++++++

{
  "type": "record",
  "name": "LogRecord",
  "fields": [
    {"name": "timestamp", "type": "string"},
    {"name": "flow_id", "type": "int"},
    {"name": "in_iface", "type": "string"},
    {"name": "event_type", "type": "string"},
    {"name": "src_ip", "type": "string"},
    {"name": "src_port", "type": "float"},
    {"name": "dest_ip", "type": "string"},
    {"name": "dest_port", "type": "float"},
    {"name": "proto", "type": "string"},
    {"name": "snmp", "type": "string"},
    {"name": "beat.hostname", "type": "string"},
    {"name": "tag", "type": "string"},
    {"name": "clientkey", "type": "string"},
    {"name": "consumer_key", "type": "string"},
    {"name": "alert", "type": "string"},
    {"name": "flow", "type": "string"},
    {"name": "payload_printable", "type": "string"},
    {"name": "stream", "type": "float"},
    {"name": "app_proto", "type": "string"},
    {"name": "netflow", "type": "string"},
    {"name": "tcp", "type": "string"},
    {"name": "icmp_type", "type": "float"},
    {"name": "icmp_code", "type": "float"}
  ]
}
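
For context, "interface conversion: interface {} is nil, not string" is the Go runtime message for a failed type assertion: a value expected to be a string was nil. One possible cause, which is only an assumption and not confirmed for columnify here, is that some log records are missing a field, or carry null for it, while the schema declares that field as a plain "string". Below is a minimal Python sketch for checking a record against a schema for such fields. The helper name `find_null_fields`, the trimmed schema, and the sample record are all hypothetical and only for illustration.

```python
import json

def find_null_fields(schema, record):
    """Return names of schema fields that are missing or null in the record
    while the schema declares them with a type that does not allow null."""
    problems = []
    for field in schema["fields"]:
        ftype = field["type"]
        # In Avro, a nullable field is declared as a union such as ["null", "string"].
        nullable = isinstance(ftype, list) and "null" in ftype
        if not nullable and record.get(field["name"]) is None:
            problems.append(field["name"])
    return problems

# Trimmed, hypothetical version of the second schema above.
schema = json.loads("""
{
  "type": "record",
  "name": "LogRecord",
  "fields": [
    {"name": "timestamp", "type": "string"},
    {"name": "flow_id", "type": "int"},
    {"name": "proto", "type": "string"},
    {"name": "snmp", "type": "string"}
  ]
}
""")

# Hypothetical record: "proto" is null and "snmp" is absent.
record = json.loads('{"timestamp": "2024-07-09T22:25:41", "flow_id": 1, "proto": null}')

print(find_null_fields(schema, record))
```

If a check like this reports fields for real records, declaring those fields as an Avro union such as ["null", "string"] is the standard Avro way to mark them optional. Whether that resolves this particular columnify error is untested here.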

Thanks,