AdguardTeam / dnsproxy

Simple DNS proxy with DoH, DoT, DoQ and DNSCrypt support
Apache License 2.0
2.37k stars 246 forks source link

DNS-over-QUIC issues behind NAT #260

Closed Blankwonder closed 2 years ago

Blankwonder commented 2 years ago

Hi. I'm the author of the Surge app, which might be one of the most popular custom DNS clients on iOS and macOS.

I have implemented the DoQ support in the latest version. But I found a severe issue. While using DoQ behind a NAT, an IPv4 cellular network usually, the client's UDP source port may change due to a very short UDP NAT timeout. After the port changes, the dnsproxy server still sends packets to the old port number.

Here is a packet capture sample (QUIC decrypted, dnsproxy version: v0.43.1).

The client sent the query with source port 8958

Screen Shot 2022-08-13 at 10 33 21 PM

But the server sent the answer to port 8947, which was used previously.

Screen Shot 2022-08-13 at 10 34 08 PM

BTW. I'm a newbie for QUIC. Please let me know if I missed something. Thanks.

Blankwonder commented 2 years ago

@crossutility plz lecture him what to do.

If you mean the other app that implemented the DoQ support, I think I might know why the issue doesn't affect it. After some observation I found it re-initializes the QUIC session just after 30 seconds of idle.

Screen Shot 2022-08-14 at 12 31 27 AM Screen Shot 2022-08-14 at 12 29 30 AM

This is against the QUIC design principle. Frequently handshakes will slow down the DNS lookup delay, even slower than DNS-over-HTTPS, so what's the point of using the DoQ? Of course, most users won't notice a tiny bit of additional delay during handshakes. So it is a reasonable workaround but it makes DoQ useless.

we11adam commented 2 years ago

Please do realize that attempting to lecture people is very rude in the first place.

image

Re-establishing connections do waste tiny little bit of resources, but those are negligible. Real problem lies behind the NAT firewall. Firewall blocked the use traffic, previous connection turn invalid. Re-establishing the connection is the best way to resume. Is it appropriate for the famous author of the Surge app, which might be one of the most popular custom DNS clients on iOS and macOS to do the trash talking? It is rude to attack other people. I suggest brush your teeth and wash your mouth four times a day.

crossutility commented 2 years ago

In case some of our users might be misled by one of these replies, our app doesn't have a so-called "30 second idle" implementation. The reply did not specifically refer to our app, so we did not quote it.

By the way, when a connection is broken or closed by remote or any issue occurs, a new connection should be initiated. Which we tend to call it normal logic process, others may see it as a workaround. Either way this should be done.

crossutility commented 2 years ago

At last, it's normal and expected that each author has his or her considerations and implementations. It is very rude to lecture the other or to ask some one to lecture the other. And we don't like so-called fan thing, ours or others, the fan thing ruins everything.

tom-proxy commented 2 years ago

一我们的一些用户可能被这些回复之一误导,我们的应用程序没有所谓的“30 秒空闲”实现。回复没有具体提到我们的应用程序,所以我们没有引用它。

顺便说一句,当远程断开或关闭连接或发生任何问题时,应启动新连接。我们倾向于将其称为正常逻辑过程,其他人可能会将其视为一种解决方法。无论哪种方式,都应该这样做

给QX添加个小组件啊

EkkoG commented 2 years ago

It seems because quic-go does not implement connection migration https://github.com/lucas-clemente/quic-go/issues/234

Ref:

AdGuard Home 的问题是因为其依赖的上游项目 quic-go 禁用了 Connection Migration,而且 quic-go 在短期内似乎没有计划重新实施连接迁移。

Originally posted by @ZeroClover in https://github.com/shawn1m/overture/issues/269#issuecomment-1214322813

ameshkov commented 2 years ago

Well, to my knowledge there're no QUIC implementations that support connection migration yet.

After some observation I found it re-initializes the QUIC session just after 30 seconds of idle.

Another solution would be to enable KeepAlive on QUIC connection. This way it will send a "ping" packet every 20 seconds and keep the connection alive: https://github.com/AdguardTeam/dnsproxy/commit/401d87a3eb1159730f8963577037456b6ffe2d48

Blankwonder commented 2 years ago

Well, to my knowledge there're no QUIC implementations that support connection migration yet.

After some observation I found it re-initializes the QUIC session just after 30 seconds of idle.

Another solution would be to enable KeepAlive on QUIC connection. This way it will send a "ping" packet every 20 seconds and keep the connection alive: 401d87a

Oh, that's sad. Thanks for the clarification. It seems that the passive connection migration (PCM) is essential to QUIC under a cellular network.

BTW, QUIC implementations in other languages do support connection migration, such as ngtcp2, quiche-google, and quiche-cloudflare. Since the quic-go has no plan to support the connection migration yet, you may consider using another implementation. But it may be troublesome since they don't have a golang interface.

ameshkov commented 2 years ago

Oh, quiche have finally added it? This is good news

ag2s20150909 commented 2 years ago

https://github.com/cloudflare/quiche

https://github.com/google/quiche

Pure1ights commented 2 years ago

https://blog.cloudflare.com/the-road-to-quic/

buawf commented 2 years ago

Oh, quiche have finally added it? This is good news

Andrey,

Will there be any issue between Google quiche & Cloudflare quiche, as their Programming Languages are different ?

Google Quiche - C++ Cloudflare Quiche - Rust

ameshkov commented 2 years ago

Yep, we'll have to wait until it's added to quic-go (or maybe until Golang gets native QUIC implementation).

Potterli20 commented 2 years ago

Yep, we'll have to wait until it's added to quic-go (or maybe until Golang gets native QUIC implementation).

Still a problem with compilation, however, I used GO 1.19 IMG_20220817_090322.jpg