Open azadi opened 3 years ago
Quick summary from the discussion on the IRC channel, for the record:
I would very much welcome any opinion on that topic, especially from people more familiar with the implications of the padding block size :-)
Thanks a lot for the nice report, by the way, much appreciated :+1:
Could add an option to Recursor to pad queries with ECS to the "wrong" size so that they'll be correct after dnsdist edits them.
Super arcane, but it could work...
Hm, tricky set of issues interlocking here!
I appreciate the desire to do as little work as possible during response handoff, but i also think introducing subtle differences in padding policy isn't a great outcome.
It sounds like the varying response size will leak whether ECS is enabled or not -- that's not a great outcome, as it means potentially distinguishing between different clients or configurations from response size alone. In combination with other potential leakage factors it lets an observer carve up the anonymity set further.
Some thoughts about different possible ways forward:
you could overwrite it with a second EDNS padding frame that is the same length as the ECS frame was
This is clever. It's a pity 7830 says
The "Padding" option MUST occur at most, once per OPT meta-RR (and hence, at most once per message).
and in hindsight that may have been overly strict.
Can you guarantee that the upstream will always place ECS and padding in adjacent positions, though? even if your own implementations do that, there's no guarantee that each stage of the forwarding chain here is from the same implementation, is it?
Yeah, I was just thinking of it as a best-effort possible-optimization.
I was pondering to overwrite the data fields of the ECS record to make it convey no more info instead of stripping it. That would something more of a "neuter" ECS than strip, but it would be cheap to do.
Alright, thanks for your input. It looks like we might need to re-pad, at least optionally.
When stripping the ECS extension, just adjust the padding extension by the appropriate size. This is the most obvious "right" thing to do, but it might be too much work, as you're trying to minimize processing during transfer. But has anyone quantified this concern? is the "work" cost that you're worried about memory throughput, CPU churn, or code complexity? Are we sure that doing the recalculation does in fact have a significant cost?
The way I see it, we have two options:
I was pondering to overwrite the data fields of the ECS record to make it convey no more info instead of stripping it. That would something more of a "neuter" ECS than strip, but it would be cheap to do.
Unfortunately rfc7871 prohibits sending an ECS option to a client that did not include one in its query (MUST NOT
) so that won't fly.
Is the relevant code not the part of pdns/dnsdist.cc
identified by @azadi above? Is that the "fast-path" or "slow-path"? Can you point to the code that would need to be changed?
I'm with you that adding complexity to pointer arithmetic in memory-unsafe languages is scary, but moving all padded queries to the slow path seems like a recipe for deployments to abandon privacy for performance's sake.
Do you have any significant fuzzing infrastructure set up, or access to any static analysis tools that could be run to try to identify problems with the fast-path?
Is the relevant code not the part of pdns/dnsdist.cc identified by @azadi above? Is that the "fast-path" or "slow-path"?
That is part of the relevant code, yes. The if (last)
branch is basically the fast-path, editing the payload in-place, while the else
counterpart not shown is one of the slow paths.
Can you point to the code that would need to be changed?
I have pushed a very ugly commit showing what would need to be done, albeit in a cleaner way: https://github.com/rgacogne/pdns/commit/ae19d135b20028db3c245e7f6f9b25c7db21a466
Do you have any significant fuzzing infrastructure set up, or access to any static analysis tools that could be run to try to identify problems with the fast-path?
Yes (OSS-fuzz, CI-fuzz) and yes (cppcheck
, clang's static analyzer, Coverity). Someone still needs to write that code, test it, add good unit and functional tests, though. It can be done, and likely will, but is currently not on my short list.
you could overwrite it with a second EDNS padding frame that is the same length as the ECS frame was
This is clever. It's a pity 7830 says
The "Padding" option MUST occur at most, once per OPT meta-RR (and hence, at most once per message).
and in hindsight that may have been overly strict.
Another clever variation on the idea: Instead of overwriting with EDNS padding option, use an alternate EDNS option. NSID might work since it allows arbitrary length. Or an option from the range reserved for experimentation.
That's clever but I'm worried it might be a bit brittle. We have users relying on dnsdist passing the NSID
value supplied by the backend, and I'm not sure we want to send a value from the experimentation range that might collide in some setups (although we could make it configurable).
I would concur with Remi that unexpected use or modification of NSID data would be non-optimal. The suggestion that an experimental range be used is probably also not optimal, but leads to the next thought: could an EDNS option be officially granted for use as padding?
On Thu, Oct 28, 2021 at 12:53 PM John Todd @.***> wrote:
I would concur with Remi that unexpected use or modification of NSID data would be non-optimal. The suggestion that an experimental range be used is probably also not optimal, but leads to the next thought: could an EDNS option be officially granted for use as padding?
There is already the EDNS padding option. Maybe we want a no-op/reflect option which can have an arbitrary length.
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/PowerDNS/pdns/issues/10884#issuecomment-954026508, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADI6FDQK5Q3JDCEY45OIOOTUJF5ZJANCNFSM5GQ3KY6A . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.
I'm wondering whether that MUST has any actual interop or privacy consequences, or if it's just an ill-considered strictness. I've raised the question on the dprive mailing list.
Short description
dnsdist (possibly) incorrectly changes the response message size without taking into account that the message may have been padded, when the message includes the EDNS client subnet option. This results in the message size not being padded to a multiple of 468 bytes, even if the backend server properly padded the message. This happens when EDNS client subnet is enabled in dnsdist (
useClientSubnet
).Environment
Steps to reproduce
The setup involves dnsdist with a pdns-recursor backend.
dnsdist.conf
with the following settings, notably, enablinguseClientSubnet
:Expected behaviour
The DNS response message should be padded to a multiple of 468 bytes, as seen in an example with 1.1.1.1, when TLS is used and the client pads the query.
(kdig has
+padding
set by default when+tls
is set.)Actual behaviour
On dnsdist 1.6.1 with pdns-recursor 4.5.5 and with padding enabled (as in the configuration above):
The message is 457 bytes and incorrectly padded, even though pdns-recursor returns the correct padded response:
Other information
useClientSubnet=true
in dnsdist (the default is disabled), the message is correctly padded:You can either disable
useClientSubnet
in dnsdist, or setuse-incoming-edns-subnet=no
in pdns-recursor, and the response message will be correctly padded.pdns/dnsdist.cc
, in function fixUpResponse:When the ECS option is removed, the response message size is changed and this does not seem to take into account that the message may have been padded and that
EDNSOptionCode::Padding
must be set?Just to confirm this, I added
vinfologs
to the above: assuming the 468 byte example (response.size()
, padded message from pdns-recursor), theexistingOptLen
being 423 bytes, after ECS is removed from OPToptLen
(412 bytes), that results in468 - (423 - 412) = 457
, which does seem to match the result we are getting, so I am assuming this is where the size is changed.Possible Solutions
Like I said, it's possible I am missing something here and I haven't spent a whole lot of time other than adding the
vinfologs
above to confirm my suspicion about where the size is being changed :) From https://github.com/PowerDNS/pdns/issues/10018#issuecomment-768109582, it seems like padding should be handled from pdns-recursor so nothing should be done on dnsdist's side?I think we should either re-pad the message after the ECS option is removed, or ensure that when it is removed that padding is taken into account, or if this is indeed the desired behaviour then maybe a flag needs to be added to dnsdist to override this?
Thanks for all your work on dnsdist and I am happy to provide more logs and do more debugging if desired.