GoogleCloudPlatform / appengine-java-vm-runtime

API Timeouts are not respected #19

Open erugeri opened 9 years ago

erugeri commented 9 years ago

It seems to me that API (Datastore, Google Cloud Storage...) timeouts are not respected.

Some of our requests stay alive forever and never finish. This could be linked to https://github.com/GoogleCloudPlatform/appengine-java-vm-runtime/issues/17, but these requests don't use JSP; they make heavy use of GCS or Datastore.

erugeri commented 9 years ago

As a crude workaround, we wrapped all of our important API calls (Datastore, GCS) in a Future mechanism with a timeout. It is very ugly, but it was the only way we found to use MVM (which we need in order to use the Cassandra driver) and avoid having too many zombie requests that last forever.

    final Query<T> query = ...; // Objectify Query (construction omitted)
    CustomFuture<List<T>> future = new CustomFuture<>(new Callable<List<T>>() {
        @Override
        public List<T> call() throws Exception {
            return query.list();
        }
    });

    List<T> elements;
    try {
        elements = future.get(45_000); // 45 s timeout
    } catch (TimeoutException e) {
        // The query did not finish in time: fail the request instead of hanging.
        throw new RuntimeException(e);
    }

CustomFuture.java

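    import com.google.appengine.api.ThreadManager;
    import java.util.concurrent.Callable;
    import java.util.concurrent.ExecutionException;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.TimeoutException;

    public class CustomFuture<T> {

        // Threads come from the current request's factory so they may call App Engine APIs.
        private final ExecutorService executor =
                Executors.newCachedThreadPool(ThreadManager.currentRequestThreadFactory());
        private final Future<T> future;

        public CustomFuture(Callable<T> callable) {
            // The call starts running as soon as the wrapper is created.
            future = executor.submit(callable);
        }

        public T get(long timeoutInMs) throws TimeoutException {
            try {
                return future.get(timeoutInMs, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                future.cancel(true); // interrupt the worker thread
                throw e;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt status for the caller
                throw new RuntimeException(e);
            } catch (ExecutionException e) {
                throw new RuntimeException(e);
            } finally {
                executor.shutdownNow();
            }
        }
    }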

For Memcache calls, we used the built-in async Memcache service, which already returns futures that can be waited on with a timeout, so no wrapper is needed there.

It would be great if these timeouts were respected natively, as they are in the sandbox environment.

By the way, the GCS Java connector has had serious trouble lately (over the last few days): sending big files (~20 MB) fails 9 out of 10 times. We had to implement a heavy timeout + retry mechanism.
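
For anyone who wants to try the same idea, a timeout + retry wrapper could look roughly like the sketch below. This is not the code we actually run: `RetryingCall` and `callWithRetry` are hypothetical names, it uses only plain `java.util.concurrent` so it can be tested outside App Engine (on MVM the pool would presumably be created with `ThreadManager.currentRequestThreadFactory()` as in `CustomFuture`), and it retries immediately without any backoff.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper: runs a call with a per-attempt timeout and a bounded number of retries. */
final class RetryingCall {

    private RetryingCall() {}

    /**
     * Tries {@code call} up to {@code maxAttempts} times, waiting at most
     * {@code timeoutMs} for each attempt. Returns the first successful result,
     * or throws IllegalStateException wrapping the last failure.
     */
    static <T> T callWithRetry(Callable<T> call, long timeoutMs, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        // Assumption: a plain cached pool; on MVM a request thread factory would be passed in.
        ExecutorService executor = Executors.newCachedThreadPool();
        try {
            Exception last = null;
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                Future<T> future = executor.submit(call);
                try {
                    return future.get(timeoutMs, TimeUnit.MILLISECONDS);
                } catch (TimeoutException e) {
                    // Interrupt the stuck attempt; this only helps if the call honors interrupts.
                    future.cancel(true);
                    last = e;
                } catch (ExecutionException e) {
                    // The call itself failed (for example an I/O error): remember it and retry.
                    last = e;
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // keep the interrupt status for the caller
                    throw new IllegalStateException("interrupted while waiting", e);
                }
            }
            throw new IllegalStateException("gave up after " + maxAttempts + " attempts", last);
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Usage would be something like `RetryingCall.callWithRetry(() -> upload(file), 45_000, 3)`, where `upload` stands for whatever GCS call is failing. Retrying like this is only safe when the wrapped call can be repeated without side effects, for example re-sending a whole file under the same object name.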