gql-dart / gql

Libraries supporting GraphQL in Dart
MIT License

Random ServerException #465

Open Masadow opened 1 month ago

Masadow commented 1 month ago

In production, I receive random errors on both Android and web (I haven't tested other platforms).

On web, it looks like this:

ServerException(originalException: ClientException: XMLHttpRequest error., uri=https://api.v2.medami.fr, originalStackTrace: Error
    at Object.ccc (https://patient.v2.medami.fr/main.dart.js:8636:19)
    at bJ9.$1 (https://patient.v2.medami.fr/main.dart.js:124760:63)
    at Object.cPc (https://patient.v2.medami.fr/main.dart.js:7401:19)
    at b4v.<anonymous> (https://patient.v2.medami.fr/main.dart.js:318907:10)
    at bxv.q7 (https://patient.v2.medami.fr/main.dart.js:70756:12)
    at cGj.$0 (https://patient.v2.medami.fr/main.dart.js:70021:11)
    at Object.atE (https://patient.v2.medami.fr/main.dart.js:7263:40)
    at aS.ld (https://patient.v2.medami.fr/main.dart.js:69941:3)
    at cOb.$0 (https://patient.v2.medami.fr/main.dart.js:70546:20)
    at Object.cPa (https://patient.v2.medami.fr/main.dart.js:7395:19), parsedResponse: null)

On Android:

ServerException(originalException: ClientException with SocketException: Software caused connection abort (OS Error: Software caused connection abort, errno = 103), address = api.v2.medami.fr, port = 44844, uri=https://api.v2.medami.fr, originalStackTrace: *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
pid: 18642, tid: 18715, name 1.ui
os: android arch: arm64 comp: yes sim: no
build_id: 'b866ace19d6955e3f7dae5b7ee781c6c'
isolate_dso_base: 7036011000, vm_dso_base: 7036011000
isolate_instructions: 7036377580, vm_instructions: 7036361000
    #00 abs 000000703712b433 virt 000000000111a433 _kDartIsolateSnapshotInstructions+0xdb3eb3
<asynchronous suspension>
    #01 abs 00000070371179f3 virt 00000000011069f3 _kDartIsolateSnapshotInstructions+0xda0473
<asynchronous suspension>
    #02 abs 00000070371174cb virt 00000000011064cb _kDartIsolateSnapshotInstructions+0xd9ff4b
<asynchronous suspension>
    #03 abs 0000007037117133 virt 0000000001106133 _kDartIsolateSnapshotInstructions+0xd9fbb3
<asynchronous suspension>
, parsedResponse: null)

I've seen another issue (#358), but I doubt this is linked to CORS, since my error is random rather than systematic and it also occurs on Android.

Relevant code:

await gql.client.request(GGetStepReq((b) => b..vars.id = stepId)).first

where gql is an instance of the following GraphQLClient class:

import 'package:flutter/widgets.dart';
import 'package:ferry/ferry.dart';
import 'package:ferry_hive_store/ferry_hive_store.dart';
import 'package:hive/hive.dart';
import 'package:gql_exec/src/request.dart';
import 'package:medami_utils/services/auth.dart';
import 'package:medami_utils/services/graphql/log.dart';
import 'package:provider/provider.dart';

class GraphQLClient extends ChangeNotifier {
  static late final Cache cache;
  static late final String endpoint;
  static late final Function(Request request, LinkException e) logError;
  late Client client;
  String? token;

  static init(String endpoint, Function(Request request, LinkException e) logError) async {
    GraphQLClient.endpoint = endpoint;
    GraphQLClient.logError = logError;

    final box = await Hive.openBox("graphql");

    final store = HiveStore(box);

    cache = Cache(store: store);
  }

  void build(String? token) {
    this.token = token;

    client = Client(
      link: HttpLinkWithLog(endpoint, token, logError),
      cache: cache,
      defaultFetchPolicies: {
        OperationType.query: FetchPolicy.NetworkOnly,
      }
    );

    notifyListeners();
  }
}

class GraphQL extends StatelessWidget {
  GraphQL({super.key, required this.child});

  final GraphQLClient _graphql = GraphQLClient();
  final AuthToken _auth = AuthToken();

  final Widget Function(GraphQLClient) child;

  @override
  Widget build(BuildContext context) {
    return ChangeNotifierProvider(
      create: (context) => _auth,
      child: Consumer<AuthToken>(
        builder: (context, authToken, c) {
          print('rebuild graphql client with token: ${authToken.value}');
          _graphql.build(authToken.value);
          return c!;
        },
        child: ChangeNotifierProvider(
          create: (context) => _graphql,
          child: child(_graphql),
        ),
      ),
    );
  }
}

Custom HttpLink to log link errors globally:

import 'dart:async';

import 'package:ferry/ferry.dart';
import 'package:gql_exec/src/request.dart';
import 'package:gql_exec/src/response.dart';
import 'package:gql_http_link/gql_http_link.dart';

class HttpLinkWithLog extends HttpLink {
  HttpLinkWithLog(endpoint, token, this.logError) : super(
    endpoint,
    defaultHeaders: {
      if (token !=  null) 'Authorization': token,
    },
  );

  Function(Request request, LinkException e) logError;

  @override
  Stream<Response> request(
    Request request, [
    NextLink? forward,
  ]) async* {
    final controller = StreamController<Response>();

    Future<void>(() async {
      try {
        await for (final response in super.request(request)) {
          controller.add(response);
        }
      } on LinkException catch (e) {
        logError(request, e);
        controller.addError(e);
      } finally {
        await controller.close();
      }
    });

    yield* controller.stream;
  }
}

The GraphQL client object is not being rebuilt when the error occurs, so that can't be the cause of the issue either.

At this point, I have no idea where to look to understand and fix the issue.

Since it's very random (it happens in maybe 1 out of 50 requests), I also can't build a small repro.

knaeckeKami commented 1 month ago

"Software caused connection abort" on mobile devices typically happens when a request is in flight, the app is backgrounded, and the OS kills any open sockets to save resources.

That's nothing gql could help with; it's just how mobile operating systems work.

In my experience, though, the issue is less of a problem when using native HTTP implementations. See https://pub.dev/packages/http#2-configure-the-http-client or https://pub.dev/packages/native_dio_adapter, depending on which HTTP implementation you use.

Masadow commented 1 month ago

I'll look into both links for Android, thanks. However, the application runs in kiosk mode, so it is never put in the background. The error is raised while the user is interacting with the app, so I'm quite certain the app is in the foreground.

Would you suggest having a retry strategy for every request?

What about the error occurring on web?

knaeckeKami commented 1 month ago

Likely nothing related to gql, but rather issues with the underlying network, proxy, firewall, etc.

That said, you could try ErrorLink as a workaround.
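
On the retry question, here is a minimal sketch of a generic retry wrapper with exponential backoff, in plain Dart. It is not taken from gql or ferry: `retryWithBackoff` and its parameters (`maxAttempts`, `initialDelay`, `shouldRetry`) are hypothetical names chosen for illustration, and the sketch assumes the failure reaches the caller as a thrown error.

```dart
import 'dart:async';

/// Hypothetical helper (not part of gql or ferry): runs [action] up to
/// [maxAttempts] times, doubling the wait after each failed attempt.
Future<T> retryWithBackoff<T>(
  Future<T> Function() action, {
  int maxAttempts = 3,
  Duration initialDelay = const Duration(milliseconds: 200),
  bool Function(Object error)? shouldRetry,
}) async {
  var delay = initialDelay;
  var attempt = 0;
  while (true) {
    attempt++;
    try {
      return await action();
    } catch (error) {
      // Without a predicate, every error is treated as transient.
      final retryable = shouldRetry?.call(error) ?? true;
      // Give up once the attempt budget is spent or the error is not
      // considered transient; the original error propagates unchanged.
      if (attempt >= maxAttempts || !retryable) rethrow;
      await Future<void>.delayed(delay);
      delay *= 2; // exponential backoff
    }
  }
}
```

A call site could wrap the request from the original post in the closure passed as `action`. If the client reports link failures on the response object instead of throwing, the closure would need to throw that failure itself for the helper to see it. Retrying is generally only safe for idempotent operations such as queries, so a `shouldRetry` predicate that only accepts network-level errors is a reasonable default.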