Closed: @gtie closed this issue 1 year ago.
I wonder whether upgrading the `regex` dependency would help with this. Currently the version in Cargo.lock is 1.3.4 and the latest version on crates.io is 1.3.5.
Thanks, @gtie, and apologies for the trouble. Nothing jumps out at me, and as @a-rodin pointed out, I think it is worth upgrading our `regex` dependency. @a-rodin, does anything jump out at you in the `regex` changelog?
That said, I think we should consider setting this up as a reliability test in our test harness. Reliability tests run indefinitely and report on errors; maybe we can reproduce the error that way. @gtie, would you be willing to share a sample log file that we could run this test against? Otherwise, we'll use the two lines listed above.
The only significant new change in the `regex` crate is added support for Unicode 13.
However, I tried to reproduce the issue using a `stdin` source and a `console` sink with the two lines listed above, and they were parsed successfully. Because of that, I think it is possible that the issue is caused by the `file` source rather than by the `regex` transform. It might have something to do with file writes being flushed without being aligned to line delimiters, so that sometimes the produced log lines are truncated and thus cannot be parsed.
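A minimal sketch of that hypothesis (illustrative only, not Vector's actual reader code), using the truncated field from the WARN message reported below:

```python
import os
import tempfile

# Sketch of the suspected race: the writer flushes mid-record, and a reader
# that treats EOF as a line boundary picks up a truncated line, which the
# regex then fails to match.
path = os.path.join(tempfile.mkdtemp(), "current")

with open(path, "w") as w:
    # The writer has not reached the newline yet when it flushes:
    w.write("@400000005f408e8335eeb1ac 2620011900300000000000000")
    w.flush()  # partial record is now visible on disk

    with open(path) as r:
        line = r.read()  # naive reader: EOF ends the "line"

print(repr(line))  # truncated record; the tail never made it to disk
```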
@binarylogic, unfortunately, the full log files contain too much sensitive data to be shared. I'd suspect that a `file` source plus the above `regex` transform, with a few million lines per day, would be enough to trigger this in a day or so.
@a-rodin, it does indeed make sense for `file` to be the culprit here, feeding `regex` incomplete lines. Checking the WARN message closely reveals that the beginning of the line (the tai64n timestamp) is not actually printed in this case; the printout starts from the second field.

Perhaps a good start would be to increase the length of the line printed in the WARN message, to confirm what string actually makes it to the `regex` transform?
P.S. Thanks a lot for your quick review!
FYI, this is still an issue even with 0.10.0, happening a couple of times per hour:
```
Aug 22 03:18:17 m82 vector[22916]: Aug 22 03:18:17.907 WARN transform{name=tinydns-anycast-regex type=regex}: vector::internal_events::regex: regex pattern failed to match. field="@400000005f408e8335eeb1ac 2620011900300000000000000" rate_limit_secs=30
Aug 22 03:18:17 m82 vector[22916]: Aug 22 03:18:17.907 WARN transform{name=tinydns-anycast-regex type=regex}: vector::internal_events::regex: "regex pattern failed to match." is being rate limited. rate_limit_secs=5
Aug 22 03:28:57 m82 vector[22916]: Aug 22 03:28:57.687 WARN transform{name=tinydns-anycast-regex type=regex}: vector::internal_events::regex: 2 "regex pattern failed to match." events were rate limited. rate_limit_secs=5
Aug 22 03:47:50 m82 vector[22916]: Aug 22 03:47:50.784 INFO source{name=tinydns-anycast-file type=file}:file_server: file_source::file_server: Found file to watch. path="/var/log/tinydns-anycast/current" file_position=0
Aug 22 04:17:59 m82 vector[22916]: Aug 22 04:17:59.581 WARN source{name=tinydns-anycast-file type=file}:file_server: file_source::file_server: Ignoring file smaller than fingerprint_bytes file="/var/log/tinydns-anycast/current"
Aug 22 04:18:00 m82 vector[22916]: Aug 22 04:18:00.769 INFO source{name=tinydns-anycast-file type=file}:file_server: file_source::file_server: Found file to watch. path="/var/log/tinydns-anycast/current" file_position=0
Aug 22 04:47:09 m82 vector[22916]: Aug 22 04:47:09.015 INFO source{name=tinydns-anycast-file type=file}:file_server: file_source::file_server: Found file to watch. path="/var/log/tinydns-anycast/current" file_position=0
Aug 22 05:16:31 m82 vector[22916]: Aug 22 05:16:31.345 INFO source{name=tinydns-anycast-file type=file}:file_server: file_source::file_server: Found file to watch. path="/var/log/tinydns-anycast/current" file_position=0
Aug 22 05:41:19 m82 vector[22916]: Aug 22 05:41:19.444 WARN transform{name=tinydns-anycast-regex type=regex}: vector::internal_events::regex: 1 "regex pattern failed to match." events were rate limited. rate_limit_secs=5
Aug 22 05:46:58 m82 vector[22916]: Aug 22 05:46:58.148 WARN transform{name=tinydns-anycast-regex type=regex}: vector::internal_events::regex: regex pattern failed to match. field="@400000005f40b15c0" rate_limit_secs=30
```
The times of the WARN events (the errors) generally differ from those of the INFO events (file rotations). I suspect this might be the result of reads done before the full line is flushed to disk.
Since the `file` source has a `max_line_bytes` parameter, it should be possible to have at least a band-aid solution by adding a `min_line_bytes` parameter: do not consider a line complete, even if you encounter EOF, while it is below that number of bytes.

Another possible solution would be to ignore EOF as a line separator: insist on reading until an EOL, or until the file rotates.
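A sketch of the second idea (illustrative only; `min_line_bytes` and this buffering scheme are hypothetical, not existing Vector options): hold back any bytes after the last newline until the rest of the line arrives on a later read, instead of emitting a truncated record at EOF.

```python
# Sketch of "ignore EOF as a line separator": only emit newline-terminated
# lines; stash the partial tail and complete it on the next read.
def complete_lines(chunk: str, buf: list) -> list:
    """Return newline-terminated lines; keep any partial tail in buf."""
    data = "".join(buf) + chunk
    buf.clear()
    *done, tail = data.split("\n")
    if tail:
        buf.append(tail)  # no newline yet -- wait for the next read
    return done

buf = []
# First read hits EOF mid-record (truncated flush), so nothing is emitted:
print(complete_lines("@400000005f40b15c0", buf))   # []
# Once the rest of the line ("... rest" here is a generic stand-in) and its
# newline arrive, the whole record is emitted in one piece:
print(complete_lines(" ... rest\n", buf))
```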
@gtie, just updating on this. I think the best solution here is to finish the dead-letter queue work so you can inspect events that fail. Also, the `regex_parser` transform is being replaced with the new `remap` transform (the `parse_regex` function specifically).
Can confirm this is still an issue with 0.18.1.
```
2021-12-17T15:07:19.383312Z WARN vector::internal_events::regex_parser: Regex pattern failed to match. field="an upstream response is buffered to a temporary file /v[...]" internal_log_rate_secs=30
```
Perhaps the `regex_parser` transform should either be fixed or removed (sad)?
As others have mentioned, I think the issue here is not the regex parsing, but rather getting a partial line from the `file` source.

@splitice, are you also using a `file` source?
Not file source, vector source.
> Not file source, vector source.
What is the original source of the data that is being forwarded over the `vector` source/sink?
UDP syslog.

P.S. I am aware of the nginx parsing functions (at the moment I'm just porting over an existing pipeline; I'll look into improvements like that later).
I see, thanks! It sounds like your issue may be different from the one reported here. In this issue it seems like Vector may be reading partial lines from the `file` source. In your case, it could be that the events are being truncated to fit into the UDP packet length? Are you confident the packets have the full lines in them? Feel free to open a separate issue with more details.
As noted in this issue, the `regex_parser` transform was replaced by `remap`. You can leverage that to get more details about the message that failed to parse, with VRL like:

```
parsed, err = parse_regex(.message, r'some regex')
if err != null {
    log(.) # log the event that failed to parse
}
```
@jszwedko, aware (and that's what I've been doing over the past week). Although I didn't ever think of doing `log(.)`. Neat.
`regex_parser` has one feature though that's very handy: sequential array execution. When ported to `remap` we ended up with ~30 `??`-separated `parse_regex` calls for the same effect, which reads a bit less optimally.
> In your case, it could be that the events are being truncated to fit into UDP packet length?
It may be worth noting that Vector now drops the packet if a UDP event is truncated. I wonder if this means the problem no longer occurs here.
I am going to close this issue because:
There may still be an issue with the `file` source. If this crops up again we will either reopen this issue or create a new issue specifically for the `file` source.
Occasionally (once every million or two log lines), Vector would report an error like:
The log line that it is getting stuck on is perfectly formatted in the log file. Due to specifics of our setup, each user query results in two log entries, with the timestamp being the only difference between them. One of the log lines makes it out to the final sink, and the other generates the error above (and is lost).
The twin log lines look like this:
The vector.toml source/transform configuration that is processing them looks like this:
FWIW, the errors/missed lines occur hours away from the rotation of the input file (`/var/log/tinydns-anycast/current` in this case). Any ideas why the transform might be failing and how to avoid those failures?