tguenneguez opened 1 week ago
I have the same need: retrieving the number of occurrences of a searched pattern. I am interested in this enhancement.
Could we please stick to English here so everyone can participate in the discussion!?
@tguenneguez would it make sense to extend the grok parser to be able to do this? The reason is that there are already many predefined patterns that can be used for matching... Furthermore, I wonder what the use case of not_match_count is?
My point of view on the two questions:
"Would it make sense to extend the grok parser to be able to do this?" The grok parser is very handy, but it assumes that the content of the file follows a precise format known in advance. If you simply want to know the number of lines that contain KO, you must:
"The reason is that there are already many predefined patterns that can be used for matching... Furthermore, I wonder what the use case of not_match_count is?" If you know that an exe normally returns only a line like "treatment OK", you would want to know whether there are lines that do not match this string.
For a typical user (not a Telegraf expert or a developer of the solution), it is almost impossible to implement this setup and make it work.
@tguenneguez let me address some things you assume:
If you simply want to know the number of lines that contain KO, you must
No, that's wrong. Grok uses regular expressions just like the one you showed in your initial post. With grok you simply get the additional benefit of being able to use predefined patterns instead of having to come up with a regexp for standard things.
If you know that an exe normally returns only a line like "treatment OK", you would want to know whether there are lines that do not match this string.
Yeah, but you could also use a "not matching" regexp for exactly this. Why do you assume that someone would be interested in this in general? Alternatively, we could define a flag that generates a "remaining" metric output which sets a special value.
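To illustrate the "not matching" idea, here is a minimal sketch of the counting logic (in Python for brevity; Telegraf itself is written in Go, but the regexp semantics are the same). The function name and sample log are made up for the example:

```python
import re

def count_non_matching(lines, pattern):
    """Count lines that do NOT match the given regular expression."""
    rx = re.compile(pattern)
    return sum(1 for line in lines if rx.search(line) is None)

log = [
    "treatment OK",
    "treatment OK",
    "treatment FAILED: disk full",
    "treatment OK",
]
print(count_non_matching(log, r"treatment OK"))  # 1
```

This is exactly the not_match_count use case: any line deviating from the expected "treatment OK" output is counted as an anomaly.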
In my view we should have
[[inputs.file]]
  files = ["example"]
  data_format = "grok"
  grok_named_patterns = [
    { name = "2XX", pattern = ' 2\d{2} ' },
    { name = "3XX", pattern = ' 3\d{2} ' },
    { name = "4XX", pattern = '"%{WORD:method} %{PATH:path} HTTP/.?\..?" 4\d{2} ' },
    { name = "5XX", pattern = '"%{WORD:method} %{PATH:path} HTTP/.?\..?" 5\d{2} ' },
    { name = "default" }
  ]
which should result in
file,pattern=2XX value=0i
file,pattern=2XX value=0i
file,pattern=5XX method="GET",path="/test.html" value=0i
file,pattern=4XX method="GET",path="/login" value=0i
file,pattern=2XX value=0i
You then can aggregate over the methods and count the patterns if you wish. What do you think?
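The proposed matching behaviour can be sketched as follows (Python for brevity; patterns simplified to plain regexps with the grok field captures omitted, and the sample access log invented for the example). Each line is tested against the named patterns in order, with "default" as the fallback:

```python
import re
from collections import Counter

# Simplified named patterns mirroring the proposed config above;
# first match wins, "default" catches everything else.
NAMED_PATTERNS = [
    ("2XX", r" 2\d{2} "),
    ("3XX", r" 3\d{2} "),
    ("4XX", r" 4\d{2} "),
    ("5XX", r" 5\d{2} "),
]

def classify(line):
    """Return the name of the first pattern matching the line."""
    for name, pattern in NAMED_PATTERNS:
        if re.search(pattern, line):
            return name
    return "default"

access_log = [
    '10.0.0.1 - - "GET /test.html HTTP/1.1" 200 512',
    '10.0.0.1 - - "GET /login HTTP/1.1" 404 13',
    '10.0.0.2 - - "GET /test.html HTTP/1.1" 500 0',
    "malformed line",
]
counts = Counter(classify(line) for line in access_log)
print(sorted(counts.items()))
# [('2XX', 1), ('4XX', 1), ('5XX', 1), ('default', 1)]
```

Aggregating over the per-line metrics then reduces to a Counter over the pattern tag, which is what an aggregator plugin would do downstream.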
Use Case
Be able to simply count the number of lines in a stream that match (or do not match) a pattern. I will develop this plugin, but first I want to share the goal.
Sample specification:
Pattern Parser Plugin
The pattern parser creates metrics from a stream of lines: it counts the number of lines matching a pattern.
Configuration
Metrics
One metric is created for each search, with the tag "tag_name" set to "tag_value".
Examples
Config:
Input:
Output:
Config:
Input:
Output:
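The bodies of the Config/Input/Output examples above did not survive. As a sketch only, assuming the proposed parser takes a list of searches, each with a pattern and a tag name/value (all option names here are hypothetical, not an existing Telegraf data format), a configuration might look like:

```toml
# Hypothetical configuration for the proposed pattern parser
[[inputs.tail]]
  files = ["/var/log/app.log"]
  data_format = "pattern"
  patterns = [
    { pattern = "Error", tag_name = "level", tag_value = "error" },
    { pattern = "treatment OK", tag_name = "status", tag_value = "ok" },
  ]
```

For an input containing two lines with "Error", this might then emit a metric such as tail,level=error value=2i, and crucially value=0i when nothing matches.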
Expected behavior
Have a simple plugin to count lines that match a pattern.
Actual behavior
In fact, some use cases can be covered today by combining grok and an aggregator, but it is very heavy to implement. It is also very difficult to configure these plugins well, especially with logs whose content is not precisely structured, for example counting the word "Error" anywhere in a line. Finally, if no line matches, no value is returned at all.
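The last point, a zero count still being reported, is the core of the request; a minimal sketch (Python for brevity, function name invented for the example):

```python
import re

def count_matches(lines, pattern):
    """Count lines matching the regexp; returns 0 rather than nothing."""
    rx = re.compile(pattern)
    return sum(1 for line in lines if rx.search(line))

# Even when nothing matches, a count of 0 is still produced,
# unlike the grok + aggregator combination described above.
print(count_matches(["all good", "still good"], "Error"))  # 0
```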
Additional info
No response