square / retrofit

A type-safe HTTP client for Android and the JVM
https://square.github.io/retrofit/
Apache License 2.0
42.97k stars 7.3k forks source link

Throwing a non-IOException from an OkHttp interceptor cancels a call without an event on the Rx stream #3453

Open realdadfish opened 4 years ago

realdadfish commented 4 years ago

If a non-IOException-derived exception is thrown from an OkHttp interceptor, an RxJava3 call adapter stalls indefinitely and does not signal the exception downstream. (This used to work with the RxJava2 call adapter.)

This is a quick reproducer: https://gist.github.com/realdadfish/c7c0d8cb123c583acd3e683b68afbfa5

The current workaround is to simply make all custom exceptions thrown from within interceptors derive from IOException.

Retrofit: 2.9.0 OkHttp: 4.8.1 RxJava: 3.0.6

swankjesse commented 4 years ago

In OkHttp we consider exceptions other than IOException to be an interceptor crash. This is because our API is originally designed for Java where the only exceptions we deliver to callbacks are IOExceptions.

Your proposed fix is the right one. Subclass IOException if you want to throw from an interceptor without crashing the application.

realdadfish commented 4 years ago

I just got notice of https://github.com/square/okhttp/issues/5151 that explains the issue in more detail. Working around with IOExceptions is cool for me, what I dislike however was that I apparently saw the (wrong) exception occur nowhere else, it simply vanished, I had no crashing (test) process. Can you elaborate a bit more what one should have in place to see the programming mistake more directly?

realdadfish commented 4 years ago

Also, I was merely bumping dependencies (OkHttp from 4.6.0 to 4.8.1, Retrofit from 2.8.1 to 2.9.0 to get Rx3 support), so with the previous dependencies and Rx2 this issue did not pop up. This is weird because the aforementioned OkHttp issue was set to be included in OkHttp 4.3 already, so I wonder what else changed...

emartynov commented 4 years ago

@swankjesse should the non-IOException lead to crash in this case?

swankjesse commented 4 years ago

Added some docs with my thoughts: https://github.com/square/okhttp/pull/6235/files

YaowenGuo commented 4 years ago

If a non-IOException-derived exception is thrown from an OkHttp interceptor, an RxJava3 call adapter stalls indefinitely and does not signal the exception downstream. (This used to work with the RxJava2 call adapter.)

This is a quick reproducer: https://gist.github.com/realdadfish/c7c0d8cb123c583acd3e683b68afbfa5

The current workaround is to simply make all custom exceptions thrown from within interceptors derive from IOException.

Retrofit: 2.9.0 OkHttp: 4.8.1 RxJava: 3.0.6

@realdadfish

I think this error is not caused by retrofit or OkHttp. I have seen several version OkHttp and Retrofit. They all not catch non-IoException derived exception, Because OkHttp only throw IOException internal. It is cased by RxJava adapter. When use Retrofit.Builder().addCallAdapterFactory(RxJava3CallAdapterFactory.create()) to execute network call. it will call subscribeActual() function in CallEnqueueObservable. It not catch non-IoException derived exception thrown by OkHttp.

  @Override
  protected void subscribeActual(Observer<? super Response<T>> observer) {
    // Since Call is a one-shot type, clone it for each new observer.
    Call<T> call = originalCall.clone();
    CallCallback<T> callback = new CallCallback<>(call, observer);
    observer.onSubscribe(callback);
    if (!callback.isDisposed()) {
      call.enqueue(callback);  // <---------not catch other exception.
    }
  }

You can use RxJava3CallAdapterFactory.createWithScheduler() or RxJava3CallAdapterFactory.createSynchronous() like this.

val retrofit = Retrofit.Builder()
      .client(okhttp)
      .addCallAdapterFactory(RxJava3CallAdapterFactory.createSynchronous())
       // Or 
       // .addCallAdapterFactory(RxJava3CallAdapterFactory.createWithScheduler(Schedulers.io()))
      .baseUrl(server.url("/"))
      .build()

It will use CallExecuteObservable to execute http request. It will catch the non-IoException derived exception and call Rxjava downstream's onError function.

  protected void subscribeActual(Observer<? super Response<T>> observer) {
   ...
    try {
      Response<T> response = call.execute();
      ...
    } catch (Throwable t) {
      Exceptions.throwIfFatal(t);
      if (terminated) {
        ...
      } else if (!disposable.isDisposed()) {
        try {
          observer.onError(t);
        } catch (Throwable inner) {
          ...
        }
      }
    }
  }
realdadfish commented 4 years ago

@YaowenGuo Very good overview, indeed! Now the question is: If any user picks any of the factory methods (create(), createSynchronous(), createWithScheduler()) is he/she aware that this comes at a price that his error behaviour is different? So, shouldn't the error behaviour not be the same for all three?

realdadfish commented 4 years ago

Also, I think the real culprit is really okhttp, look at the implementation of RealCall.java:

  @Override protected void execute() {
      boolean signalledCallback = false;
      transmitter.timeoutEnter();
      try {
        Response response = getResponseWithInterceptorChain();
        signalledCallback = true;
        responseCallback.onResponse(RealCall.this, response);
      } catch (IOException e) {
        if (signalledCallback) {
          // Do not signal the callback twice!
          Platform.get().log(INFO, "Callback failure for " + toLoggableString(), e);
        } else {
          responseCallback.onFailure(RealCall.this, e);
        }
      } catch (Throwable t) {
        cancel();       // <-- this is IMHO too early
        if (!signalledCallback) {
          IOException canceledException = new IOException("canceled due to " + t);
          canceledException.addSuppressed(t);
          responseCallback.onFailure(RealCall.this, canceledException);
        }
        throw t;
      } finally {
        client.dispatcher().finished(this);
      }

and here of the CallEnqueueObservable.java where it exits after it realizes the call is already canceled:

 @Override
    public void onFailure(Call<T> call, Throwable t) {
      if (call.isCanceled()) return;          // <-- yeah, we're canceled and do not process the error, leading to an indefinite open stream

      try {
        observer.onError(t);
      } catch (Throwable inner) {
        Exceptions.throwIfFatal(inner);
        RxJavaPlugins.onError(new CompositeException(t, inner));
      }
    }
YaowenGuo commented 4 years ago

@YaowenGuo Very good overview, indeed! Now the question is: If any user picks any of the factory methods (create(), createSynchronous(), createWithScheduler()) is he/she aware that this comes at a price that his error behavior is different? So, shouldn't the error behavior not be the same for all three?

Yes, I also confused by this difference。I also its behavior will be consistent for downstream.

jonfhancock commented 3 years ago

I ran into this problem yesterday via a suspending function in a Retrofit interface.

I was extremely confused because I was able to try/catch the suspending function call, and catch the exception, but the app still crashed at ReallCall.AsyncCall.run.

It makes sense now why this happens, but it took a full day of digging and debugging to figure it out.

@swankjesse's updated docs help a lot. The current docs for Interceptor don't explain this at all.

I would like to suggest two more things to make it easier for developers to figure out that only descendants of IOException are expected.

First, It would be good to have a log statement in the catch(t: Throwable){} clause like:

Platform.get().log("Interceptor error for ${toLoggableString()}. Interceptors are only expected to throw subclasses of IOException.", Platform.INFO, t)

Second, it would be nice if it wrapped the unexpected error, and gave it more detail. e.g.

throw IllegalStateException("Unexpected error thrown by Interceptor.  Only expected IOException and descendants",t)

This would still allow the app to crash, but would at least tell developers exactly why they're seeing a stack trace that doesn't point to the call site of the Call.

My only concern with the second suggestion is whether changing the exception that is thrown is likely to break things for consumers of the API.