folbricht / routedns

DNS stub resolver, proxy and router with support for DoT, DoH, DoQ, and DTLS

error-----failed to decode query #257

Closed liang-hiwin closed 2 years ago

liang-hiwin commented 2 years ago

I see the following error messages in the output of the systemctl status routedns command.

Aug 15 23:09:46 idns routedns[18131]: time="2022-08-15T23:09:46+08:00" level=error msg="failed to decode query" addr=":8053" client="ip address:40080" error="dns: bad rdata" id=local-doq protocol=doq
Aug 15 23:09:46 idns routedns[18131]: time="2022-08-15T23:09:46+08:00" level=error msg="failed to decode query" addr=":8053" client="ip address:34424" error="dns: bad rdata" id=local-doq protocol=doq
Aug 15 23:09:46 idns routedns[18131]: time="2022-08-15T23:09:46+08:00" level=error msg="failed to decode query" addr=":8053" client="ip address:36051" error="dns: bad rdata" id=local-doq protocol=doq
Aug 15 23:09:46 idns routedns[18131]: time="2022-08-15T23:09:46+08:00" level=error msg="failed to decode query" addr=":8053" client="ip address:36051" error="dns: bad rdata" id=local-doq protocol=doq
Aug 15 23:09:46 idns routedns[18131]: time="2022-08-15T23:09:46+08:00" level=error msg="failed to decode query" addr=":8053" client="ip address:36051" error="dns: bad rdata" id=local-doq protocol=doq
Aug 15 23:09:47 idns routedns[18131]: time="2022-08-15T23:09:47+08:00" level=error msg="failed to decode query" addr=":8053" client="ip address:38873" error="dns: bad rdata" id=local-doq protocol=doq
Aug 15 23:09:47 idns routedns[18131]: time="2022-08-15T23:09:47+08:00" level=error msg="failed to decode query" addr=":8053" client="ip address:43486" error="dns: bad rdata" id=local-doq protocol=doq
Aug 15 23:09:47 idns routedns[18131]: time="2022-08-15T23:09:47+08:00" level=error msg="failed to decode query" addr=":8053" client="ip address:43486" error="dns: bad rdata" id=local-doq protocol=doq
Aug 15 23:09:47 idns routedns[18131]: time="2022-08-15T23:09:47+08:00" level=error msg="failed to decode query" addr=":8053" client="ip address:43486" error="dns: bad rdata" id=local-doq protocol=doq
Aug 15 23:09:48 idns routedns[18131]: time="2022-08-15T23:09:48+08:00" level=error msg="failed to decode query" addr=":8053" client="ip address:41596" error="dns: bad rdata" id=local-doq protocol=doq
liang-hiwin commented 2 years ago

This is my configuration for the DNS-over-QUIC listener:

[listeners.local-doq]
address = ":8053"
protocol = "doq"
resolver = "cloudflare-rrl"
server-crt = "/opt/domain.crt"
server-key = "/opt/domain.key"
liang-hiwin commented 2 years ago

Resolution works normally when I test with dnslookup v1.5.1 (https://github.com/ameshkov/dnslookup):

PS C:\WINDOWS\system32> dnslookup www.ksksy.com  quic://domain.com:8053
dnslookup v. v1.5.1
dnslookup result:
;; opcode: QUERY, status: NOERROR, id: 39672
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; QUESTION SECTION:
;www.ksksy.com. IN       A

;; ANSWER SECTION:
www.ksksy.com.  6       IN      A       103.110.61.36

;; ADDITIONAL SECTION:

;; OPT PSEUDOSECTION:
; EDNS: version 0; flags: ; udp: 4096
; SUBNET: 1xx.0xx.xx.xxx/24/0
liang-hiwin commented 2 years ago

The same test with dnslookup v1.7.1 on Debian 10 (x64) fails:

root@dns2:~# dnslookup www.ksksy.com  quic://domain.com:8053
dnslookup v. v1.7.1
2022/08/15 23:30:34 failed to sufficiently increase receive buffer size (was: 208 kiB, wanted: 2048 kiB, got: 416 kiB). See https://github.com/lucas-clemente/quic-go/wiki/UDP-Receive-Buffer-Size for details.
2022/08/15 23:30:34 Cannot make the DNS request: reading response from quic://domain.com:8053: EOF
liang-hiwin commented 2 years ago

Could it be a dnslookup problem? I also opened an issue in that project: https://github.com/ameshkov/dnslookup/issues/30

liang-hiwin commented 2 years ago

I increased the receive buffer with sysctl -w net.core.rmem_max=2500000, but it made no difference; the test still fails.

folbricht commented 2 years ago

Ah, so there's a new prefix now, I need to look into the 1.0 standard for doq again

folbricht commented 2 years ago

There's now a proposed fix on the issue-257 branch. Would you be able to try that?

liang-hiwin commented 2 years ago

OK, I'll pull down the code and compile it.

liang-hiwin commented 2 years ago

With that branch, the dnslookup error (https://github.com/folbricht/routedns/issues/257#issuecomment-1215170614) is resolved, but routedns now logs a different error, "failed to read query":


Aug 16 19:04:04 idns routedns[10037]: time="2022-08-16T19:04:04+08:00" level=error msg="failed to read query" addr=":8053" client="ip address:3890" error="unexpected EOF" id=local-doq protocol=doq
Aug 16 19:04:04 idns routedns[10037]: time="2022-08-16T19:04:04+08:00" level=error msg="failed to read query" addr=":8053" client="ip address:3890" error="unexpected EOF" id=local-doq protocol=doq
folbricht commented 2 years ago

Hmm, so I'm using example-config/doq-listener.toml with the latest version of dnslookup and it seems to be working

VERIFY=0 dnslookup www.google.com quic://127.0.0.1:8853
INFO[0000] starting listener                             addr=":8853" id=local-doq protocol=doq
TRAC[0002] started connection                            addr=":8853" id=local-doq protocol=doq
TRAC[0002] accepting incoming connection                 addr=":8853" client="127.0.0.1:38148" id=local-doq protocol=doq
TRAC[0002] opening stream                                addr=":8853" client="127.0.0.1:38148" id=local-doq protocol=doq stream=0
DEBU[0002] received query                                addr=":8853" client="127.0.0.1:38148" id=local-doq protocol=doq qname=www.google.com.
DEBU[0002] querying upstream resolver                    client=127.0.0.1 id=cloudflare-dot protocol=dot qname=www.google.com. qtype=A resolver="1.1.1.1:853"
TRAC[0002] opening connection                            addr="1.1.1.1:853"
TRAC[0002] sending query                                 addr="1.1.1.1:853" qname=www.google.com.
TRAC[0002] closing stream                                addr=":8853" client="127.0.0.1:38148" id=local-doq protocol=doq stream=0
TRAC[0004] closing connection                            addr=":8853" id=local-doq protocol=doq
TRAC[0012] connection terminated by idle timeout         addr="1.1.1.1:853"
liang-hiwin commented 2 years ago

Yes, but the systemctl status routedns command still shows the errors below:

Aug 16 19:04:04 idns routedns[10037]: time="2022-08-16T19:04:04+08:00" level=error msg="failed to read query" addr=":8053" client="ip address:3890" error="unexpected EOF" id=local-doq protocol=doq
Aug 16 19:04:04 idns routedns[10037]: time="2022-08-16T19:04:04+08:00" level=error msg="failed to read query" addr=":8053" client="ip address:3890" error="unexpected EOF" id=local-doq protocol=doq
folbricht commented 2 years ago

It looks like whatever is sending the query isn't using the length prefix that the RFC now requires. Does dnslookup fail for you? I used the latest version of dnslookup to test.

liang-hiwin commented 2 years ago

The dnslookup test works fine, but the error appears when I use AdGuard for Android (https://adguard.com/en/adguard-android/overview.html).

folbricht commented 2 years ago

Is that an android client? Can you explain a bit how that's all linked together? Is it like this?

Adguard Client (android) ---> routedns ---> upstream dns

I wonder if it's got the same issue with not sending the prefix that's specified in the final DoQ RFC.

liang-hiwin commented 2 years ago

Yes, this is the Android version of the app. It supports configuring an encrypted DNS server. When I enter, for example, quic://domain.com (served by routedns), the app reports that it cannot connect, even though resolution through dnslookup works normally.

charlieporth1 commented 2 years ago

@frank QUIC was recently updated and has become a standard, so quic-go was upgraded to v0.28.1. I'm not sure whether that is the underlying cause of this problem.

charlieporth1 commented 2 years ago

@frank QUIC was recently updated and has become a standard, so quic-go was upgraded to v0.28.1. Also, if you understand C/C++, https://github.com/AdguardTeam/DnsLibs is what Adguard uses in their mobile apps.

folbricht commented 2 years ago

Yeah, we have that quic-go upgrade already, but I suspect the issue is a recent change to DoQ itself where the length prefix has to be sent with every query/response (implemented in this PR: https://github.com/folbricht/routedns/pull/258). It's quite likely there's a mismatch here. https://github.com/AdguardTeam/DnsLibs/commit/1fb704ec77782145c4b4df5f5be8a09e4edea7c9 updates the lib to use the prefix as well, but was only merged a couple months ago. Not sure it's been updated everywhere yet.

@thb007 I suspect that the library you're using on Android hasn't been updated. The latest I could find is from April, and the change above was made in June. There's a good chance you may be able to use the older version of routedns (on the master branch) with https://github.com/folbricht/routedns/blob/master/cmd/routedns/example-config/doq-listener.toml though this will of course break dnslookup.

@charlieporth1 https://github.com/folbricht/routedns/pull/258 updates routedns to follow the latest RFC changes by adding the prefix as required. This fixes dnslookup, but it is of course incompatible with anything that hasn't been updated yet. I think we should merge it anyway. Other servers and clients will have to be updated.
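
To make the mismatch concrete: the final DoQ specification (RFC 9250, section 4.2) requires every DNS message on a stream to be preceded by a 2-octet big-endian length field. The following is a minimal Python sketch of that framing, not the routedns implementation (which is written in Go). The names `frame`, `read_framed`, and `QUERY` are hypothetical, and the query bytes are an illustrative example.com lookup with an arbitrary ID.

```python
import io
import struct
from typing import BinaryIO


def frame(msg: bytes) -> bytes:
    """Prepend a 2-octet big-endian length to a DNS message."""
    if len(msg) > 0xFFFF:
        raise ValueError("DNS message does not fit a 2-octet length prefix")
    return struct.pack("!H", len(msg)) + msg


def read_framed(stream: BinaryIO) -> bytes:
    """Read one length-prefixed DNS message from a byte stream.

    Assumes read(n) returns n bytes unless the stream has ended, which
    holds for io.BytesIO; a real network reader would loop on short reads.
    """
    header = stream.read(2)
    if len(header) < 2:
        raise EOFError("stream ended before the length prefix")
    (length,) = struct.unpack("!H", header)
    body = stream.read(length)
    if len(body) < length:
        raise EOFError("unexpected EOF: stream ended before the full message")
    return body


# Hypothetical query for example.com (type A); the ID 0x1234 is arbitrary.
QUERY = (
    bytes.fromhex("123401000001000000000000")
    + b"\x07example\x03com\x00"
    + b"\x00\x01\x00\x01"
)

if __name__ == "__main__":
    # Updated client -> updated server: the message round-trips.
    print(read_framed(io.BytesIO(frame(QUERY))) == QUERY)
    # Old client -> updated server: the DNS ID is misread as a length.
    try:
        read_framed(io.BytesIO(QUERY))
    except EOFError as err:
        print("mismatch:", err)
```

Under these assumptions, a reader that expects the prefix interprets the first two bytes of an unprefixed message as a length and runs out of data, which would be consistent with the "unexpected EOF" in the log above. In the opposite direction, a server that does not expect the prefix sees every field shifted by two bytes, so decoding the query is likely to fail.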

charlieporth1 commented 2 years ago

@folbricht is it ready for a review? I'm good with merging if you are. Would it be possible to add an on-error fallback for backwards compatibility? The idea is to keep supporting the old format if anyone is really worried about it.
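
One way such a fallback could look, as a rough Python sketch rather than anything taken from routedns or PR #258: because a DoQ client sends one query per stream and then closes its sending side, a server could read the whole stream and check whether the first two bytes equal the length of the remainder. The function name `split_query` is hypothetical, and the heuristic can misclassify an unprefixed message whose DNS ID happens to equal its length minus two.

```python
import struct
from typing import Tuple


def split_query(data: bytes) -> Tuple[bytes, bool]:
    """Return (dns_message, was_prefixed) for the full contents of one stream.

    Heuristic: treat the data as length-prefixed only if the first two
    bytes, read as a big-endian integer, match the number of bytes that
    follow. Otherwise assume the older, unprefixed format.
    """
    if len(data) >= 2:
        (length,) = struct.unpack("!H", data[:2])
        if length == len(data) - 2:
            return data[2:], True
    return data, False


if __name__ == "__main__":
    # Hypothetical query for example.com (type A) with an arbitrary ID.
    query = (
        bytes.fromhex("123401000001000000000000")
        + b"\x07example\x03com\x00"
        + b"\x00\x01\x00\x01"
    )
    prefixed = struct.pack("!H", len(query)) + query
    print(split_query(prefixed)[1])  # prefixed input
    print(split_query(query)[1])     # unprefixed (legacy) input
```

The server would also need to answer in the same format the client used, so the `was_prefixed` flag is returned alongside the message. Whether the added ambiguity is worth it compared with simply requiring updated clients is a design decision for the maintainers.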

folbricht commented 2 years ago

@charlieporth1 It's complete, but I haven't had a chance to test against a DoQ server (that's not routedns). Do you happen to have one we can try it with?

charlieporth1 commented 2 years ago

I'm working on a very simple app that uses that lib, so I can do that. FYI, if you download Adguard on an Android phone, it would work for that without charge: you get 3GB free and you would be able to point it at your QUIC server. I don't know about iOS though.

liang-hiwin commented 2 years ago

The routedns build I am using now was compiled with https://github.com/folbricht/routedns/pull/258 merged, and my Adguard is the latest nightly version (https://adguard.com/en/beta.html).

charlieporth1 commented 2 years ago

@thb007 Were you able to test that branch with Adguard?

liang-hiwin commented 2 years ago

@charlieporth1 Yes, I purchased the official version of Adguard. If I set up the QUIC server with the dnsproxy from the Adguard project, there is no problem.

charlieporth1 commented 2 years ago

Perfect. I'll review the PR.

liang-hiwin commented 2 years ago

Setting up the QUIC server with this project works without any problems: https://github.com/AdguardTeam/dnsproxy

liang-hiwin commented 2 years ago

thanks

charlieporth1 commented 2 years ago

Left a comment. Waiting until further notice.