Closed rxwen closed 3 years ago
The problem is with dns resolution, I changed the sample to perform dns resolution as:
`
var sw = new Stopwatch()..start();
final ia = await InternetAddress.lookup("baidu.com");
sw.stop();
_counter = sw.elapsed.inMilliseconds;
` It costs like 5 seconds to run.
The automatic dns configuration is:
If I removed the IPv6 dns server manually, or add a custom dns server (8.8.8.8), the resolution speed is ok.
The dig command test showed the 192.168.1.1 server is ok, and the ipv6 server doesn't.
The router is a openwrt, it doesn't have upstream ipv6 connection.
Looks like when there are both ipv4 and ipv6 LAN connections on the iOS, the flutter performs dns resolution only with ipv6 (which should fail because there is no ipv6 upstream connection), but ignores the ipv4 dns server. Other applications still respect the ipv4 dns server setting, so they will not fail.
Capture network traffic on iPhone showed that whenever I asked the app to lookup ip address, it always started a AAAA type lookup and failed. The A type lookup only happened once with a success result, and the result should have been cached.
The InternetAddress.lookup method waited until the type AAAA resolution failed eventually after 5 seconds and then returned. Thought it should be able to return the cached result quicker.
And I'm not sure if the ipv6 ip address is the root cause. As a user of the app reported the slow behavior, while his iPhone didn't obtained a ipv6 address at all.
If https://github.com/dart-lang/sdk/blob/master/runtime/bin/socket_base_linux.cc#L202 is the code for iOS platform, looks like it's implemented on top of getaddrinfo function.
/cc @ZichangG
From your description, it is likely to be relative to our POSIX socket operations. In dart, we handle IOS and MacOS the same way with read
,write
and other POSIX sockets methods, which is discouraged by IOS man page.
@rxwen Can you verify that this happens only on IOS but not MacOS?
Some discussion: https://github.com/dart-lang/sdk/issues/41376
I'm not that familiar with macos app development. So a side question is, do I need to follow this guide (https://flutter.dev/desktop) to try a macos app?
We haven't received any report on android platform with this issue so far.
You'll need to run the same code on MacOS. Not so familiar with flutter setup either but your link looks good to me.
If you have dart
(which should be included in Flutter), you can just run dart path/to/your_program.dart
on Mac.
Android's implementation is quite similar but not enough to draw the conclusion. They use different eventhandlers and details are slightly different, which might contain a bug.
Sure. Will try when I'm in the reproducible environment and let you know.
My initial suspect is could it be possible that it's blocked here till timeout? If it's caused by the posix socket, does it mean that the flutter code is waiting for io_service's response on a socket without being able to return earlier?
My initial suspect is could it be possible that it's blocked here till timeout? If it's caused by the posix socket, does it mean that the flutter code is waiting for io_service's response on a socket without being able to return earlier?
It is possible. For lookup
, it is blocking. Connection will be issued after addresses
are resolved.
Checkout the implementation here: https://github.com/dart-lang/sdk/blob/5bf1223ea93138c55146840dc06277cf8068662f/sdk/lib/_internal/vm/bin/socket_patch.dart#L456
Is the original design of the lookup to be blocking in the case that the underlying dns has already got a ip for the domain name? Could the lookup be async and the slow response could simply update the dns cache when it eventually arrives?
Is the original design of the lookup to be blocking in the case that the underlying dns has already got a ip for the domain name?
As you pointed out, we wrapped info and passed into getaddrinfo
. The behavior is controlled by system call. You should check getaddrinfo
man page.
Could the lookup be async and the slow response could simply update the dns cache when it eventually arrives?
Apologies for using blocking
in previous comment. We do put it into a Future
so that it won't blocked. What I meant is following connections are concatenated to this Future
and will be executed afterwards. Once lookup
hangs, the following part will be delayed as well.
@rxwen Why is there an invalid IPv6 DNS server in your configuration? That seems like the problem. You're saying it's automatically configured? Maybe there's something wrong with your network and the addresses it's broadcasting? In this case, Dart is using the system getaddrinfo
function which is outside of our control and is supposed to do something reasonable. In that sense, this is not a problem with Dart, and we can't do anything.
Although, as @ZichangG mentions, there is another iOS networking API that we could use, and it might resolve addresses differently. It's still a problem that there's a broken server in the DNS configuration, but this other API may smooth over the problem with concurrent DNS lookup.
@sortie The reason I'm having a IPv6 DNS server is because my router supports ipv6, but my Internet port doesn't. So IPv6 works in LAN. I doubted its caused by my network configuration as well. But what I don't understand is, only the flutter apps (our pratical app or the simple demo I mentioned before) are affected, all other apps were working fine. And this is the only environment that I have access to that can reproduce the issue. As you may notice in my previous mentioned, some users of our app reported the same behavior (about 5 seconds delay for each network access), and the information I gathered from them are:
Since I can't do further investigation in users' environment. I have to use whatever I can gather so far to find out the cause. At least find out a explanation to these questions above. And is there possibility that the flutter app doesn't work well with some kind of iOS network configuration?
Also, I tried in a different network environment that is similar in that it also has a IPv6 Lan, but no IPv6 external ip address. I can confirm there is a IPv6 dns server configured in the same iPhone device. But in this environment, the flutter apps are not slow.
FWIW, Apple networking guidelines have the following section Avoid Resolving DNS Names Before Connecting to a Host, which rather aptly describes the problem we are hitting here.
Our HttpClient should probably perform two DNS resolutions in parallel (instead of doing a single getaddrinfo
hoping to resolve both IPv6 and IPv4) and then use whatever comes first (maybe with a some small timeout to accommodate for small delays).
@mraleph I thought the HttpClient might use InternessAddressType.Any internally. The type is passed to getaddrinfo, which was instructed to perform do two DNS resolutions in parallel.
@rxwen yeah, but I suspect that things hang because getaddrinfo
waits to get both IPv6 and IPv4 lookup results back. Could you try changing your sample test to do:
await InternetAddress.lookup("baidu.com", type: InternetAddressType.IPv4)
and see if it hangs just the same or returns result fast?
I already tried this before. It returned immediately.
@ZichangG I run the code below on mac:
import 'dart:io';
import 'dart:async';
import 'dart:core';
void main() async{
print("hello world!");
var ip = await InternetAddress.lookup("baidu.com");
print("$ip");
}
It returned immediately. But on a iPhone in the same wifi. It's still very slow. Does it give any hint to you?
I captured network package in the other network environment which also has a IPv6 LAN address. It looks like the AAAA dns query got a response soon, so the httpclient isn't blocked. It means the httpclient is indeed blocked due to the AAAA resolution timeout.
I'll try to get more information from our user in their environment and let you know.
Looks like getaddrinfo
is stuck.
There is probably no quick fixes except a shorter timeout.
Rewriting our IOS socket implementation with CFSocket
and CFHost
mechanism should probably work for this case. But as I mentioned in another case #41376, it also has some limitations on enabling VPN.
@ZichangG
There is probably no quick fixes except a shorter timeout.
There are two ways to look at this problem:
The bug is in the InternetAddress.lookup
with type: any
. It should have a shorter timeout.
The bug is in the _NativeSocket.startConnect
which issues InternetAddress.lookup
with type: any
. Remember that the original issue people are experiencing is that HttpClient
is very slow to connect when IPv6 network is somehow misconfigured - meaning that they don't really care about InternetAddress.lookup
which happens deep in the dart:io
implementation - they just want to pass hostname and let dart:io
handle the rest efficiently.
I suggest we look at the second interpretation. For that there is rather straightforward fix - we should slightly rework how _NativeSocket.startConnect
is implemented. Instead of issuing a single InternetAddress.lookup
with type: any
it should issue two separate lookups and use results of whatever comes first to start connecting. There is already logic inside that tries addresses one by one (if host has multiple addresses). The logic with multiple DNS lookups fits rather well into that.
From a user's perspective, the second suggested way sounds logic to me. As a user of HttpClient, we don't have to worry about the details. The interface of it is straight forward to us. I guess, if the change is made in underlying dart:io, other network applications like a plain tcp connection will be relieved too.
Would multiple DNS lookups actually solve the problem? We can ask for an IPv4/IPv6 address separately, yes, but that doesn't necessarily mean we use an a IPv4/IPv6 server unless iOS getaddrinfo
specifically does that. For instance, I imagine asking for an IPv4 could make iOS getaddrinfo
ask its IPv6 server first, and once it times out, then ask the IPv4 server and get a response.
Additionally if we did two concurrent lookups, which response do we prefer? The one that came first? The IPv6 one? The one the OS believes to have the best route? We'll be bypassing / reimplementing the system policy here.
In my opinion, this is a libc deficiency where getaddrinfo
doesn't cope with a broken IPv6 DNS server efffieciently. We can work around it, potentially, but we're not quite on the right level for this kind of policy, and it would be an imperfect workaround. We do want to have some sort of solution since this results in real user problems.
Since other iOS apps don't have this problem, it sounds like we're using the wrong API to do DNS lookups. If there's a higher level API we can use that takes care of abstracting these policies away, that would the best option for a clear solution.
Would multiple DNS lookups actually solve the problem?
Yes. I asked what happens if you specifically ask IPv4 address and I got the answer that it resolves quick.
Since other iOS apps don't have this problem, it sounds like we're using the wrong API to do DNS lookups.
My guess other apps are following the Apple guidelines to not resolve hostnames before connecting and letting library handle that. See link from this comment.
We can debate things back and forth - but I don't think it is a good spend of our time, when there is a rather small fix to apply now, and this fix would unbreak people.
After unbreaking people we can spend time contemplating perfect solution (e.g. large scale refactoring of dart:io to use non-POSIX API on iOS and Mac OS X).
Yes. I asked what happens if you specifically ask IPv4 address and I got the answer that it resolves quick.
Super. I missed that - just wanted to confirm that workaround worked. I'm on board with concurrent lookup then, and ideally switching to a higher level API.
@ZichangG do you think you could take a stub at changing _NativeSocket.startConnect
as described above?
Sure. But I don't know how to test on IOS.
While I'm working on the fix, I just realized InternetAddress
has a public lookup()
. If I do the hack for startConnect
, it will still hang if someone calls lookup
.
I think this concurrent lookup just makes the problem surface less if most users use HttpClient() or some other higher level apis.
I'm not sure whether we should proceed with this solution. Any idea? @sortie @mraleph
The workaround should be done as close as reasonably possible to the getaddrinfo call as it doesn't have anything to do with the socket connect code. I suggest making InternetAddress.lookup() on iOS do the lookup concurrently with two futures of the backend.
As I said before we have two ways to look at the problem (see https://github.com/dart-lang/sdk/issues/41451#issuecomment-613872570). I feel like just fixing Socket connections is going to cover a lot of issues. Most people don't really use InternetAddress.lookup
directly, instead they just use Socket.connect
or even more high level APIs like HttpClient
.
Futhermore, fixing InternetAddress.lookup
requires diverging from just using getaddrinfo
directly and making decisions around how long the timeout should be etc.
Compare this with fixing _NativeSocket.startConnect
- it already have logic inside which tries addresses one by one - adding logic there which would resolve hostnames concurrently seems totally natural.
Based on these two thoughts I think it is okay to fix _NativeSocket.startConnect
directly, rather than attempt to fix lookup
itself.
I have exactly the same issue... Post requests through http were taking from 7s to 20s+ That took a long way to debug and find out what's wrong, from glitchy circular progress indicator to this dns resolution thing... I hope you guys will be able to fix this soon!
In the meantime I've found kinda workaround that helped me to increase the http response speed. May be that would help someone who faced the same problem.
This workaround requires writing platform channel method and implementing http requests through native iOS URLSessionDataTask
. So now it takes 1-2s and that's the result I can afford! I can share the code if needed, it's pretty simple though.
@ivanryndyuk It's great if you can share the code.
I believe it should work around the issue, since the situation happens only within flutter app.
@rxwen OK, that's what I've got so far... just to get the idea should be enough.
In Flutter app Dart portion:
static const iosHttp = const MethodChannel('org.example/http');
Future<void> post(
String url,
Map<String, dynamic> body,
Map<String, String> headers,
) async {
if (Platform.isIOS) {
final result = await iosHttp.invokeMapMethod(
'post',
{
'url': url,
'body': body,
'headers': headers,
}
);
//todo: Do whatever you want with the result, it's a json response.
} else {
//todo: Send post request normally with `http` package.
}
}
and the iOS part in AppDelegate.swift
...
// Put this inside `application:didFinishLaunchingWithOptions:`
if let controller = window.rootViewController as? FlutterViewController {
let httpChannel = FlutterMethodChannel(
name: "org.example/http",
binaryMessenger: controller.binaryMessenger
)
httpChannel.setMethodCallHandler { [weak self] (call, result) in
if call.method == "post" {
self?.sendPostRequest(arguments: call.arguments, result: result)
} else {
result(FlutterMethodNotImplemented)
}
}
}
// Add this private method to AppDelegate
private func sendPostRequest(arguments: Any?, result: @escaping FlutterResult) {
do {
guard let params = arguments as? [String: Any] else {
throw FlutterError()
}
guard let urlString = params["url"] as? String else {
throw FlutterError()
}
guard let body = params["body"] else {
throw FlutterError()
}
guard let headers = params["headers"] as? [String: String] else {
throw FlutterError()
}
guard let url = URL(string: urlString) else {
throw FlutterError()
}
var request = URLRequest(url: url)
request.httpMethod = "POST"
request.allHTTPHeaderFields = headers
request.httpBody = try JSONSerialization.data(
withJSONObject: body,
options: .prettyPrinted
)
let task = URLSession.shared.dataTask(with: request) { data, _, err in
guard let data = data, err == nil else {
DispatchQueue.main.async {
result(err ?? FlutterError())
}
return
}
guard let json = try? JSONSerialization.jsonObject(
with: data,
options: .init(rawValue: 0)
) else {
DispatchQueue.main.async {
result(err ?? FlutterError())
}
return
}
DispatchQueue.main.async {
result(json)
}
}
task.resume()
} catch {
result(error)
}
}
I've omitted initialization parameters for FlutterError()
to make code more readable in the example. Also it's required to make FlutterError
conforming to Error
protocol in order to be able to throw
it or you can use custom Error
. Also example requires better exception handling but don't judge too harsh. This approach helped me, I hope will help someone else too.
Guys, we are talking about DNS resolution. shouldn't it be done once per URL? After doing one call that takes 10s , shouldn't the application takes the value from the DNS cache? Is that what happening? or for every call to same URL it takes 10s+ ?
@aviadavi This is what I suppose the DNS to work. Unfortunately, every call takes 10s+.
I guess is because the IPv6 resolution failed, thus subsequent call still needs to try resolving.
@rxwen Could you please try my fix patch(https://dart-review.googlesource.com/c/sdk/+/143883)? I don't have IOS device to test it. If this solves your problem, I can land it.
@zichangg I'd like to try it. But I'm not familiar with how to apply the patch on flutter environment, to build a ios app based on it.
I know how to apply patch for my flutter environment. It looks like flutter command will download the dart engine and sdk binaries automatically. So, I'm not sure how to instruct the flutter command to use a patch-ed sdk. Do you have any guide on this?
@rxwen Thanks for the help. ^_^ As far as I know, there is no easy way. You either have to have flutter engine or Dart sdk.
If you do have flutter engine repo locally, dart sdk lives in ${flutter_engine_repo}/third_party/dart
. You can apply the patch there(A simple copy-paste should be enough).
This page talks on how to build engine on IOS. You can substitute the step 1 & 2 with applying the patch.
@zanderso Hi Zach, do you know any easier way of rebuild dart from flutter?
I'll look into it and let you know if it works. @zichangg
@zichangg I cloned the engine repo. But there isn't a third_party/dart directory. Am I getting the correct repo?
@rxwen You'll have to follow https://github.com/flutter/flutter/wiki/Setting-up-the-Engine-development-environment to get everything in the right place to apply the patch.
@rxwen I think you should follow what Zach has posted. He is an expert!
any progress on this?
@100cm I have a fixing cl ready but I'm not able to test whether that fixes the problem. Thanks to @rxwen , he is trying to test my patch.
@zichangg There are so many tools new to me. It took time to download the whole repo and setup the build environment for it.
Looks like I'm stucked in the flutter/tools/gn --unoptomized
step. It complains about python3 compatibility.
Then I modify the gn tool source code to force it use python2, it failed with following error:
$ ./flutter/tools/gn --unoptimized
Generating GN files in: out/host_debug_unopt
ERROR at //build/config/ios/ios_sdk.gni:38:9: Script returned non-zero exit code.
exec_script("ios_sdk.py", [ "iphonesimulator" ], "list lines")
^----------
Current dir: /Users/lisa/projects/engine/src/out/host_debug_unopt/
Command: python /Users/lisa/projects/engine/src/build/config/ios/ios_sdk.py iphonesimulator
Returned 1.
stderr:
File "/Users/lisa/projects/engine/src/build/config/ios/ios_sdk.py", line 12
print "Takes one arg (SDK to find)"
^
SyntaxError: invalid syntax
See //build/toolchain/mac/BUILD.gn:9:1: whence it was imported.
import("//build/config/ios/ios_sdk.gni")
^--------------------------------------
See //BUILD.gn:6:1: which caused the file to be included.
group("default") {
^-----------------
Any idea what is the cause?
@rxwen Thanks for the effort. I'm not quite familiar with it either.
Did you pass --ios
as well. (See the instructions in the IOS section).
Looks like sys.argv
of running python /Users/lisa/projects/engine/src/build/config/ios/ios_sdk.py iphonesimulator
is not 2. Does this get changed by forcing python 2?
if len(sys.argv) != 2:
print "Takes two args (SDK to find and the version)"
@zichangg @zanderso maybe somebody with appropriate Mac hardware could produce a local engine build with a patch for @rxwen and package it? I have done this before with android builds - you just need to package a subset of output files to make --local-engine
work?
@cbracken mentioned to me that @gaaclarke can help.
@gaaclarke Could you do a local engine build with @zichangg's patch on the Dart SDK, and get it to @rxwen to try out? Thanks!
edit: @jmagman has done this already, sorry for the confusion!
I have it set up, I was waiting on Friday for @zichangg for details on which networking scenarios needed testing. (see internal Flutter & Dart chat)
I tried to reproduce this without the patch first. My first thought was to use the Network Link Conditioner to increase the IPv6 DNS delay to simulate a timeout. However, I don't see enough knobs in there to do that while also allowing through IPv4 without a delay, so with and without the patch every lookup took the full delay amount.
My macOS corp machine doesn't allow network sharing to set up anything more sophisticated there to then share with my iOS device. And if I misconfigure my network during the quarantine I don't think anyone in my house will be pleased.
@rxwen Any hints for how I can reproduce this? Alternatively, I can share the engine binaries I built in Google Drive, if you have a gmail address. If you don't want to share it here, you can give me a 👋 on Discord in the #hackers-dart channel. Or you can try to build the engine again with python3 (you didn't specify what error you were hitting).
The issue has been discussed in flutter, move it here since it's more relevant.
Our environment is iOS version: 13.3.1. iPhone 6S. (And some customers reported same problems on different devices, iPhone 7, iPhone X, latest iOS system as of now).
As a reference, I created a flutter sample project, with
flutter create --androidx -t app sample
Then, add some code in main.dart to perform a simple http get request, whenever user clicks the + button and display the time spent. (https://gist.github.com/rxwen/009c7ae4328ee799f71013f06e206b13#file-main-dart-L55) This simplest example, exhibit the same behavior.
I run sample on flutter v1.12.13+hotfix.8, v1.9.1 in debug and release mode. All got the similar results.
Further testing showed that it may be related to dns. I tried change the url to its ip address instead of domain name, the speed isn't slow at all. And if I set the iPhone's dns to manual mode, and remove the ipv6 dns server (which is configured automatically), the problem is gone too.
So, here is the question, how come the ipv6 dns server will affect the flutter app's dns resolution? While other apps are not affected. It's not a feasible solution if we have to ask our app user to configure the dns manually when they encounter this problem. There still should be something different in dart code base that handle the dns in a different way.