Xaaame closed 3 years ago
I don't completely follow your question, but if you want the Logstash input plugin to read and monitor only some of the directories or files in your container, you can use path_filters, which is an input configuration option. For example, setting path_filters to ['**/*.log', '**/*.txt'] processes only the log and text files in all directories and skips everything else.
The path_filters configuration parameter is copied directly from the original azureblob input plugin. That plugin didn't scale well for me, so I wrote my own version, but I kept the path_filters parameter it has. https://github.com/Azure/azure-diagnostics-tools/tree/master/Logstash/logstash-input-azureblob#optional-parameters
Technically, it uses JRuby's File.fnmatch to do the filtering. The examples in its documentation may help you with more complicated filtering. https://ruby-doc.org/core-2.5.5/File.html#method-c-fnmatch-3F
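To get a feel for how such glob filters behave, here is a minimal Ruby sketch. The helper name matches_any? and the blob names are made up for illustration, and the File::FNM_PATHNAME flag is an assumption; the plugin may call File.fnmatch with different flags.

```ruby
# Hypothetical helper: true when a blob name matches at least one glob.
# FNM_PATHNAME is an assumption here; with it, "*" does not match "/",
# so "**/" is what descends into subdirectories.
def matches_any?(blob_name, path_filters)
  path_filters.any? do |pattern|
    File.fnmatch?(pattern, blob_name, File::FNM_PATHNAME)
  end
end

filters = ['**/*.log', '**/*.txt']
blobs = ['logs/2020/app.log', 'notes/readme.txt', 'data/export.json']

# Keep only the blobs that at least one filter accepts.
selected = blobs.select { |name| matches_any?(name, filters) }
puts selected.inspect
```

Trying patterns like this in irb against a few real blob names is a quick way to check a filter before putting it in the input configuration.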
The input plugin reads the files and, if the codec is set to line, every line is sent internally as JSON inside a "message" field to the filter, like this:
{"message": "line1 found 7 things in file interestingfile.txt"}
With grok you can then take the message field, which contains the whole event, and map the line into the variables you're interested in:
grok {
match => ['message', '%{WORD:line} found %{NUMBER:errors:int} things in file %{NOTSPACE:filename}']
}
which will result in an event sent to the output plugin that looks roughly like this (grok adds the new fields next to the original message)
{ "message": "line1 found 7 things in file interestingfile.txt",
"line": "line1",
"errors": 7,
"filename": "interestingfile.txt"
}
But I only created the azure_blob_storage input plugin, not the magic of the grok filter.
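For anyone unsure what the grok mapping does, the following Ruby sketch approximates it with a plain regular expression. The regex pieces are rough stand-ins, not grok's real pattern definitions, and extract_fields is a hypothetical helper; \S+ is used for the file name so that the ".txt" extension is kept.

```ruby
# Rough stand-ins for the grok patterns (assumptions, not grok's definitions):
# a word is \w+, a number is \d+, and the file name is any run of non-spaces.
LINE_PATTERN = /(?<line>\w+) found (?<errors>\d+) things in file (?<filename>\S+)/

# Hypothetical helper: returns the extracted fields, or nil when nothing matches.
def extract_fields(message)
  m = LINE_PATTERN.match(message)
  return nil unless m

  {
    'line' => m[:line],
    'errors' => m[:errors].to_i, # same idea as the :int conversion in grok
    'filename' => m[:filename]
  }
end

fields = extract_fields('line1 found 7 things in file interestingfile.txt')
puts fields.inspect
```

This only illustrates the idea of named captures; in a real pipeline the grok filter does this work.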
My question is: how can I get the name of each processed file, for example by adding a field to the message? I want to display the data in Kibana per file, not for all the files contained in the blob together.
When we use a local input, we do this, for example: grok { match => [ "path", "%{GREEDYDATA:filename}" ] } But there is no "path" variable in the plugin's input, so I have no way to recover the names of the files contained in the blob.
Thx for your help
Now I understand. The filename is available in a variable 'name', but the message goes through the decorator without the filename and then into the queue. I don't have much time, but when I do, I can try to add an option to put the filename in a meta field; then you can access the filename in the filter block.
So I'll add something like event.set('filename', name) just after the decorate call.
https://www.elastic.co/guide/en/logstash/7.9/input-new-plugin.html
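As a rough sketch of that idea, and not the plugin's actual code: FakeEvent below is a hypothetical stand-in for LogStash::Event (which offers get and set), events_for is a made-up helper, and the addfilename keyword simply mirrors the option name used in this thread.

```ruby
# FakeEvent is a hypothetical stand-in for LogStash::Event; it exists only
# so this sketch runs without Logstash installed.
class FakeEvent
  def initialize(data = {})
    @data = data
  end

  def set(field, value)
    @data[field] = value
  end

  def get(field)
    @data[field]
  end
end

# Hypothetical helper: build one event per line and tag it with the blob name.
def events_for(name, lines, addfilename: true)
  lines.map do |line|
    event = FakeEvent.new('message' => line)
    # In a real input plugin, decorate(event) would be called at this point.
    event.set('filename', name) if addfilename
    event
  end
end

events = events_for('logs/interestingfile.txt', ['line1 found 7 things'])
puts events.first.get('filename')
```

With the field on the event, a filter block or a Kibana query can then select on filename directly.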
Can you write a message in this issue when you create this ?
Thx a lot !
0.11.5 has been pushed. It has a new option, addfilename => true, which adds the whole file path of the processed files.
addfilename => true should do the trick
Hello !
I have a question for you: I need to filter my data by the names of my different files. How can I do this? I see in the TODO list:
show file path in logger
add filepath as part of log message
So I don't think there is an option that solves my problem, but maybe I can succeed another way. I have already tried to grok path_filters like this: grok { match => [ "path_filters", "%{GREEDYDATA:filename}"] } But this did not work.
Thx