Azure / azure-diagnostics-tools

Plugins and tools for collecting, processing, managing, and visualizing diagnostics data and configuration

Logstash-input-azureblob is not compatible with ES stack 7.8.0 #223

Open liualexiang opened 3 years ago

liualexiang commented 3 years ago

Hi Azure team,

I tested the configuration below; it works well on ES/Logstash 5.2.0, but it doesn't work on ES/Logstash 7.8.0.

input {
  azureblob {
    storage_account_name => "STORAGEACCOUNT_NAME"
    storage_access_key => "ACCESS_KEY"
    container => "insights-logs-networksecuritygroupflowevent"
    codec => "json"
    file_head_bytes => 12
    file_tail_bytes => 2
  }
}

output {
  stdout {
    codec => rubydebug
  }
}
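
For context: NSG flow-log blobs are wrapped in a JSON envelope, and file_head_bytes => 12 / file_tail_bytes => 2 match the literal byte sizes of that envelope's header and footer. As I understand the plugin, it re-attaches this head and tail around each chunk it reads so that each chunk parses as a complete JSON document. A sketch of the assumed blob layout:

{"records":[        <- 12-byte head ({"records":[)
{...flow record...},
{...flow record...}
]}                  <- 2-byte tail (]})

The error below, where the parser sees {"records":[ immediately followed by a comma, is consistent with the plugin gluing the head onto a chunk that starts mid-list, at a separating comma.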

I see this error:

[ERROR] 2020-07-08 10:11:42.013 [[main]<azureblob] json - JSON parse error, original data now in message field {:error=>#<LogStash::Json::ParserError: Unexpected character (',' (code 44)): expected a valid value (number, String, array, object, 'true', 'false' or 'null')
 at [Source: (String)"{"records":[,{"time":"2020-07-08T10:09:47.4226493Z","systemId":"4e28db7f-03ce-4972-8e50-5181193e366b","macAddress":"000D3A7DBD7A","category":"NetworkSecurityGroupFlowEvent","resourceId":"/SUBSCRIPTIONS/5FB605AB-C16C-4184-8A02-FEE38CC11B8C/RESOURCEGROUPS/XIANGLIU_CSA/PROVIDERS/MICROSOFT.NETWORK/NETWORKSECURITYGROUPS/

Sometimes the error is:

[ERROR] 2020-07-08 10:20:11.968 [[main]<azureblob] logstashinputazureblob - Oh My, An error occurred. Error:undefined method `length' for nil:NilClass: Trace: ["/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-input-azureblob-0.9.13-java/lib/logstash/inputs/azureblob.rb:210:in `process'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-input-azureblob-0.9.13-java/lib/logstash/inputs/azureblob.rb:151:in `run'", "/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:345:in `inputworker'", "/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:336:in `block in start_input'"] {:exception=>#<NoMethodError: undefined method `length' for nil:NilClass>}

Sometimes it is:

{
    "@timestamp" => 2020-07-08T10:19:40.308Z,
          "tags" => [
        [0] "_jsonparsefailure"
    ],
liualexiang commented 3 years ago

Test VM: Ubuntu 18.04. The issue can be reproduced there as well.

pinochioze commented 3 years ago

Hi liualexiang, I think there is something wrong with your log file; it may not be valid JSON. Can you share your log file? I will test it on my server.

liualexiang commented 3 years ago

Hi @pinochioze,

Sorry for the late response. You can try this log file: https://raw.githubusercontent.com/liualexiang/images/master/Azure_NSG_Logs.json

BTW, I did more testing these days. I installed Logstash 5.2.0 and 7.8.0 on the same VM (CentOS 7.5 with OpenJDK 1.8.0), using the same Logstash configuration file. Logstash 5.2.0 works well with the logstash-input-azureblob plugin; however, the JSON parse fails on Logstash 7.8.0.

The test configuration I used:

input {
  azureblob {
    storage_account_name => "STORAGE_ACCOUNT_NAME"
    storage_access_key => "STORAGE_ACCESS_KEY"
    container => "insights-logs-networksecuritygroupflowevent"
    codec => "json"
    file_head_bytes => 12
    file_tail_bytes => 2
  }
}

output {
  stdout {
    codec => rubydebug
  }
}

liualexiang commented 3 years ago

Any insights?

chupark commented 3 years ago

I have the same issue, and my Logstash version is 7.8.0. The plugin always prints a comma after the opening bracket and sometimes loads multiple blobs at once. So I was replacing "[," with "[" and "}{" with "}^&&^{", then splitting on "^&&^".

liualexiang commented 3 years ago

> I have the same issue, and my Logstash version is 7.8.0. The plugin always prints a comma after the opening bracket and sometimes loads multiple blobs at once. So I was replacing "[," with "[" and "}{" with "}^&&^{", then splitting on "^&&^".

Can you share how you replace the comma and open brackets? Is it in the Logstash input section?

chupark commented 3 years ago

> I have the same issue, and my Logstash version is 7.8.0. The plugin always prints a comma after the opening bracket and sometimes loads multiple blobs at once. So I was replacing "[," with "[" and "}{" with "}^&&^{", then splitting on "^&&^".

> Can you share how you replace the comma and open brackets? Is it in the Logstash input section?

You should do the replace job in the filter section, not the input. Anyway, my trick is not working completely; I think the plugin is fetching the blob incompletely. (A sketch of such a filter appears below, after the data samples.)

The original data is:

{"records":[{"time":"2020-08-12T07:00:07.1267218Z","systemId":"5c9979d8-bb89-486f-adea-060bfe479aa2","macAddress":"0022480EDEF2","category":"NetworkSecurityGroupFlowEvent","resourceId":"/SUBSCRIPTIONS/58DBBD07-EB52-47CC-88B3-BBAEA99036A4/RESOURCEGROUPS/RG-CUSTOMERMONITORING/PROVIDERS/MICROSOFT.NETWORK/NETWORKSECURITYGROUPS/VM-CSMONITORING-NSG","operationName":"NetworkSecurityGroupFlowEvents","properties":{"Version":2,"flows":[{"rule":"DefaultRule_AllowInternetOutBound","flows":[{"mac":"0022480EDEF2","flowTuples":["1597215544,10.1.2.13,20.150.4.4,48330,443,T,O,A,E,8,1535,11,9612","1597215545,10.1.2.13,20.150.4.4,48362,443,T,O,A,B,,,,","1597215550,10.1.2.13,20.150.4.4,48342,443,T,O,A,E,8,1535,11,9612","1597215551,10.1.2.13,20.150.4.4,48370,443,T,O,A,B,,,,","1597215552,10.1.2.13,52.231.32.42,49518,

But the fetched data is:

"{\"records\":[,\",\"1597854824,168.63.129.16,10.240.0.35,61213,32744,T,I,A,B,,,,\",\"1597854824,10.240.0.7,10.240.0.46,50747,53,U,I,A,B,,,,\",\"1597854824,10.240.0.7,10.240.0.41,42021,53,U,I,A,B,,,,\",\"1597854824,10.240.0.7,10.240.0.46,40744,53,U,I,A,B,,,,\",\"1597854824,10.240.0.7,10.240.0.41,46588,53,U,I,A,B,,,,\",\"1597854824,10.240.0.7,10.240.0.46,44871,53,U,I,A,B,,,,\",\"1597854824,10.240.0.20,10.240.0.45,55026,3001,T,I,A,B,,,,\",\"1597854824,10.240.0.7,10.240.0.46,42116,53,U,I,A,B,,,,\",\"1597854824,10.240.0.7,10.240.0.41,34235,53,U,I,A,B,,,,\",

Look at the data right after "records": part of it is missing.
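
For anyone who wants to try the trick described above, here is a minimal, untested sketch of such a filter. It assumes the raw (malformed) blob text ends up in the message field, e.g. because the json codec failed and left the original data there; the ^&&^ sentinel and the gsub patterns follow chupark's description, and everything else is an assumption rather than documented plugin behavior:

filter {
  mutate {
    # Drop the stray comma after the opening bracket, and mark the
    # "}{" boundary between concatenated blobs with a sentinel.
    gsub => [
      "message", "\[,", "[",
      "message", "\}\{", "}^&&^{"
    ]
  }
  # Emit one event per blob chunk by splitting on the sentinel.
  split {
    field => "message"
    terminator => "^&&^"
  }
  # Re-parse each chunk as JSON.
  json {
    source => "message"
  }
}

Note this only works around the symptom: chunks that are cut off mid-record, like the fetched data above, will still fail in the json filter.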