Closed Tantalor93 closed 11 months ago
@chrisohaver if you could take a look at this PR when you have time 🙏
I'm trying to figure out why this fixes the issue.
Is it because rewrite causes plugin.NextOrFailure
to return an error or an empty message?
If so, why is that the case?
I'm trying to figure out why this fixes the issue.
Is it because rewrite causes
plugin.NextOrFailure
to return an error or an empty message? If so, why is that the case?
I'll try to describe what causes the issue:
let plugins be chained in a way that it forms this chain: prometheus <-> rrl <-> rewrite <-> forward
(maybe the issue could be mitigated by making rrl
plugin first?)
forward
fails the upstream resolution and returns dns.RcodeServerFailure, upstreamErr
rewrite
plugin returns dns.RcodeSuccess and error and writes rewritten response to the response writer
rrl
plugin passes nonWriter to the next plugin (which is rewrite
) and then checks err != nil || nw.Msg == nil from next plugin, which is true in this case (because err is not nil) and immediately returns without writing message to writer using w.WriteMsg(nw.Msg)
, therefore previous plugin (prometheus
) is not able to get rcode from message (rrl throws away the message)
prometheus
plugin checks rcode and gets default value, because correct message was not written by rrl
why this fixes the issue:
by only checking rcode from call plugin.NextOrFailure
in rrl
using plugin.ClientWrite, we check that the response was written by the next plugin. Whether the next plugin written the response should be indicated by status code returned by the next plugin, rrl should not care whether there was an error up the chain, only whether some response was written
(maybe the issue could be mitigated by making rrl plugin first?)
perhaps.
rrl should not care whether there was an error up the chain, only whether some response was written
I'm skeptical of the ClientWrite
function, because all it does IIRC is check the response code - it makes assumptions based on the response code, which is super kludgy and I'm not certain those assumptions are always correct.
I think the issue may be between forward and rewrite. if forward fails, it should not be writing the failure to the response writer, and should return a servfail code.
What is rewrite configured to do in your example?
Assuming rewrite is configured to rewrite the response code, then It should probably also clear the error.
(edit: specifically when rewriting an error code to a non-error code)
Ah ... it's due to this bit from rewrite...
// The next plugins didn't write a response, so write one now with the ResponseReverter.
// If server.ServeDNS does this then it will create an answer mismatch.
So, when rewrite is used, it always writes the error responses and doesn't leave it to the server.
Hello, when
rrl
plugin is combined withrewrite
plugin and upstream resolution fails, thenrrl
does not propagate rewritten response to the other plugins, for exampleprometheus
plugin.Actual impact can be observed using this example Corefile
then send a DNS query
this returns
then observe metrics
you will see, that failures are measured as NOERRORs instead of failures (SERVFAIL)
This PR adjusts the condition for checking whether the response was written in order to propagate error responses through plugin chain properly