gql-dart / gql

Libraries supporting GraphQL in Dart
MIT License

[gql_websocket_link] Crash events on host lookup #423

Open bverhagen opened 10 months ago

bverhagen commented 10 months ago

I switched to gql_websocket_link to connect to my GraphQL backend (previously I used hasura_connect). Since this switch, Firebase Crashlytics reports:

WebSocketChannelException: WebSocketChannelException: SocketException: Failed host lookup: '<myhost>' (OS Error: No address associated with hostname, errno = 7)

I think this happens when the connection is interrupted because I switch off mobile data and Wi-Fi. The error itself makes sense. However, I don't think the app should crash, or that the error should be reported to Crashlytics, when that happens.

Currently, the connection is implemented as:

final link = WebSocketLink(
  "wss://<myhost>",
  autoReconnect: true,
  inactivityTimeout: const Duration(seconds: 5),
  initialPayload: () async {
    final authHeaders = await getAuthHeaders();
    return {
      'headers': {
        'x-hasura-role': role.id,
        ...authHeaders,
      }
    };
  },
);

I thought that by switching to the channelGenerator argument (with a null url), I would be able to catch the WebSocketChannelException in a try/catch clause. However, the SocketException somehow keeps escaping my try/catch and therefore keeps crashing the app:

final link = WebSocketLink(
  null,
  channelGenerator: () {
    while (true) {
      try {
        return WebSocketChannel.connect(Uri.parse("wss://<myhost>"));
      } catch (_) {}
    }
  },
  autoReconnect: true,
  inactivityTimeout: const Duration(seconds: 5),
  initialPayload: () async {
    final authHeaders = await getAuthHeaders();
    return {
      'headers': {
        'x-hasura-role': role.id,
        ...authHeaders,
      }
    };
  },
);
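
A likely reason the try/catch above has no effect is that the failure is delivered asynchronously, after the call has already returned. The following is a minimal sketch of that behaviour using only dart:async. failLater is a hypothetical stand-in for a call that returns immediately and fails later; it is not part of web_socket_channel or gql_websocket_link.

```dart
import 'dart:async';

/// Hypothetical stand-in for a call that returns at once and fails later.
Future<void> failLater() => Future<void>.delayed(
    Duration.zero, () => throw StateError('lookup failed'));

Future<void> main() async {
  // A synchronous try/catch around a future that is not awaited:
  // nothing is thrown inside the try block, so the catch never runs.
  // The error surfaces later as an uncaught async error in the zone.
  runZonedGuarded(() {
    try {
      failLater();
      print('try block finished without an exception');
    } catch (e) {
      print('not reached: $e');
    }
  }, (error, stack) {
    print('uncaught async error reached the zone handler: $error');
  });

  // Awaiting the future routes the same failure into the catch block.
  try {
    await failLater();
  } catch (e) {
    print('caught after await: $e');
  }
}
```

In a Flutter app without a guarded zone, the first kind of error would presumably end up in PlatformDispatcher.instance.onError.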

My main/AndroidManifest.xml contains the android.permission.INTERNET permission:

<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    package="eco.futureproof.app">
    <uses-permission android:name="android.permission.FOREGROUND_SERVICE" />
    <uses-permission android:name="android.permission.INTERNET" />
    <application ...>
      <my application stuff>
    </application>
</manifest>

Even though I really doubt I am the first one to see this, I could not find any related issues. Can you point me to an example or to the typical way to fix this?

bverhagen commented 10 months ago

I just realized that, around the same time, I switched on fatal and async error recording for Firebase Crashlytics (which may explain why I did not see this happen with hasura_connect):

FlutterError.onError = FirebaseCrashlytics.instance.recordFlutterFatalError;

PlatformDispatcher.instance.onError = (error, stack) {
  FirebaseCrashlytics.instance.recordError(error, stack, fatal: true);
  return true;
};

I assume it is the async error recording that is the culprit here. Still, is there a way to catch it?

bverhagen commented 10 months ago

In the meantime I found this: https://github.com/dart-lang/web_socket_channel/issues/38.

I am going to test whether the suggested solution of awaiting socket.ready will allow me to catch the error in the channelGenerator callback.

bverhagen commented 10 months ago

Preliminary testing seems to indicate this fixes it for initial connection errors:

channelGenerator: () async {
  // Retry until the websocket connection has been established.
  while (true) {
    final socket =
        WebSocketChannel.connect(Uri.parse("wss://$apiDomain/v1/graphql"));
    try {
      // Awaiting `ready` lets connection errors be caught here.
      await socket.ready;
      developer.log("Successfully connected with websocket to '$apiDomain'!");
      return socket;
    } catch (e, stacktrace) {
      developer.log("Error connecting to websocket: $e",
          error: e, stackTrace: stacktrace);
      // Wait before the next attempt.
      await Future.delayed(const Duration(seconds: 5));
    }
  }
}

Shouldn't the default channelGenerator be robust against this too?

Additionally, returning the errored socket also results in an uncaught async error later on, when it is used.
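
For reuse, the retry loop could be factored into a helper that only ever returns a channel whose ready future has completed successfully. This is only a sketch: connectWithBackoff and its parameters are hypothetical names, the capped exponential backoff replaces the fixed 5-second delay used above, and it assumes a web_socket_channel version that provides WebSocketChannel.ready.

```dart
import 'dart:async';
import 'dart:math' as math;

import 'package:web_socket_channel/web_socket_channel.dart';

/// Hypothetical helper (not part of gql_websocket_link): keeps trying to
/// open a channel, waiting longer after each failure, up to [maxDelay].
Future<WebSocketChannel> connectWithBackoff(
  Uri uri, {
  Duration initialDelay = const Duration(seconds: 1),
  Duration maxDelay = const Duration(seconds: 30),
}) async {
  var delay = initialDelay;
  while (true) {
    final channel = WebSocketChannel.connect(uri);
    try {
      // `ready` completes with an error if the connection cannot be opened.
      await channel.ready;
      return channel;
    } catch (_) {
      // Never return the failed channel; wait and try again.
      await Future<void>.delayed(delay);
      // Double the delay for the next attempt, capped at maxDelay.
      delay = Duration(
        milliseconds:
            math.min(delay.inMilliseconds * 2, maxDelay.inMilliseconds),
      );
    }
  }
}
```

It could then be passed as channelGenerator: () => connectWithBackoff(Uri.parse("wss://<myhost>")).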