googleapis / gax-java

This library has moved to https://github.com/googleapis/sdk-platform-java/tree/main/gax-java.
https://github.com/googleapis/gapic-generator-java/tree/main/gax-java
BSD 3-Clause "New" or "Revised" License

Parallel requests - unexpected DEADLINE_EXCEEDED errors #1132

Closed nwbirnie closed 4 years ago

nwbirnie commented 4 years ago

An Ads API customer reports that they are experiencing higher than expected DEADLINE_EXCEEDED errors after switching some calls to run in parallel.

They should have a separate InstantiatingGrpcTransportChannelProvider for each thread.

There is no sign of the DEADLINE_EXCEEDED error in our server logs (ESF proxy). They are sending deadline=60mins and the longest running request is ~ 2 seconds.

Thoughts/questions:

Here are our dependency versions.

    <netty.version>2.0.26.Final</netty.version>
    <gax.version>1.50.1</gax.version>
    <grpc.version>1.25.0</grpc.version>
    <protobuf.version>3.10.0</protobuf.version>

igorbernstein2 commented 4 years ago

Would the customer be willing to enable OpenCensus tracing export to Stackdriver? A while back I implemented tracing for Cloud Bigtable and pushed it down into gax. The traces should give you and the customer more insight into what's happening.

You can enable tracing in gax by:

Also looking at the sample code in the linked bug:

    try (CampaignBudgetServiceClient campaignBudgetServiceClient =
            googleAdsClient.getLatestVersion().createCampaignBudgetServiceClient())

It seems like they are creating a new client for each request. From my experience with Cloud Bigtable, this is very problematic. By default, each GAPIC client instance creates its own gRPC connection, which is a very heavyweight operation. A gRPC connection is meant to be long-lived, usually scoped to the lifetime of the application. Opening a connection for every request will probably be interpreted as a DoS attack by the GFEs.
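The reuse pattern described above can be sketched without any gRPC dependency. This is a minimal model, not the Ads client: `FakeServiceClient`, its `mutate` method, and the connection counter are hypothetical stand-ins invented for illustration. The point it shows is that one client is created once and shared by all parallel tasks, so only one "connection" is opened no matter how many requests run.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class SharedClientSketch {
    // Hypothetical stand-in for a generated service client; not a real gax or Ads class.
    static final class FakeServiceClient implements AutoCloseable {
        static final AtomicInteger CONNECTIONS_OPENED = new AtomicInteger();

        FakeServiceClient() {
            // A real client would perform its expensive channel setup here.
            CONNECTIONS_OPENED.incrementAndGet();
        }

        String mutate(String request) {
            return "ok:" + request;
        }

        @Override
        public void close() {
            // A real client would shut down its channel here.
        }
    }

    // Runs `requests` calls in parallel against ONE client and
    // returns how many connections were opened in total.
    static int runWithSharedClient(int requests) {
        FakeServiceClient.CONNECTIONS_OPENED.set(0);
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try (FakeServiceClient client = new FakeServiceClient()) {
            List<Future<String>> results = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                final String request = "budget-" + i;
                results.add(pool.submit(() -> client.mutate(request)));
            }
            for (Future<String> result : results) {
                result.get(); // wait for every call before the client is closed
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("parallel call failed", e);
        } finally {
            pool.shutdown();
        }
        return FakeServiceClient.CONNECTIONS_OPENED.get();
    }

    public static void main(String[] args) {
        System.out.println("connections opened: " + runWithSharedClient(20));
    }
}
```

In a real application the shared client would typically live for the lifetime of the process and be closed at shutdown, rather than being scoped to a single batch as in this sketch.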

A gRPC connection can have 100 outstanding requests. If the customer needs more outstanding requests, or if this is hot-spotting an AFE, then the channel pooling feature of the InstantiatingGrpcTransportChannelProvider should be used.
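To make the sizing argument concrete, here is a rough, dependency-free sketch of how a pool size might be derived from the 100-requests-per-connection figure quoted above, and how requests could be spread round-robin across pooled channels. `ChannelPoolSketch`, `channelsNeeded`, and `RoundRobinPool` are hypothetical names; this is not how gax implements its pool, only an illustration of the idea.

```java
import java.util.concurrent.atomic.AtomicInteger;

class ChannelPoolSketch {
    // Figure quoted in the thread: roughly 100 outstanding requests per connection.
    static final int MAX_OUTSTANDING_PER_CHANNEL = 100;

    // Smallest number of channels that keeps each one at or below the limit.
    static int channelsNeeded(int peakConcurrentRequests) {
        if (peakConcurrentRequests <= 0) {
            return 1; // always keep at least one channel
        }
        return (peakConcurrentRequests + MAX_OUTSTANDING_PER_CHANNEL - 1)
                / MAX_OUTSTANDING_PER_CHANNEL;
    }

    // Hypothetical pool that hands out channel indexes in round-robin order.
    static final class RoundRobinPool {
        private final int size;
        private final AtomicInteger next = new AtomicInteger();

        RoundRobinPool(int size) {
            if (size < 1) {
                throw new IllegalArgumentException("pool size must be at least 1");
            }
            this.size = size;
        }

        // Thread-safe; floorMod keeps the index valid even after integer overflow.
        int pickChannel() {
            return Math.floorMod(next.getAndIncrement(), size);
        }
    }

    public static void main(String[] args) {
        int size = channelsNeeded(250);
        RoundRobinPool pool = new RoundRobinPool(size);
        System.out.println("channels: " + size + ", first pick: " + pool.pickChannel());
    }
}
```

With the real library, the equivalent knob would be set on the InstantiatingGrpcTransportChannelProvider builder; check the gax version in use for the exact setter name.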

In any case, I don't think this is a gax issue. Most likely the issue is in the customer application or in how the Ads client is implemented.

nwbirnie commented 4 years ago

@igorbernstein2 thank you so much for monitoring and sharing your advice - this was super helpful!

nwbirnie commented 4 years ago

Another thought occurs - this customer has a different OAuth credential for each request. Am I right in saying that, currently, there is no good model to allow multiple credentials to use the same Channel?

igorbernstein2 commented 4 years ago

I believe you can override the call credentials using the ApiCallContext.
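The idea of per-call credentials on a shared channel can be modeled as follows. This is a simplified sketch with invented types: `CallContext`, `SharedChannel`, and the string tokens are hypothetical and only mirror the general shape of an immutable call context whose credentials are replaced per request. The real gax types and method signatures should be checked against the version in use.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class PerCallCredentialsSketch {
    // Hypothetical immutable call context; a plain string stands in for a credential.
    static final class CallContext {
        private final String token;

        private CallContext(String token) {
            this.token = token;
        }

        static CallContext createDefault() {
            return new CallContext("default-token");
        }

        // Returns a NEW context; the original is left untouched.
        CallContext withCredentials(String newToken) {
            return new CallContext(newToken);
        }

        String token() {
            return token;
        }
    }

    // Hypothetical long-lived channel shared by every request.
    static final class SharedChannel {
        private final List<String> sentAuthHeaders =
                Collections.synchronizedList(new ArrayList<>());

        // The credential travels with the call, not with the channel.
        String call(String request, CallContext context) {
            String header = "Bearer " + context.token();
            sentAuthHeaders.add(header);
            return request + " sent with " + header;
        }

        List<String> sentAuthHeaders() {
            return new ArrayList<>(sentAuthHeaders);
        }
    }

    public static void main(String[] args) {
        SharedChannel channel = new SharedChannel();
        CallContext base = CallContext.createDefault();
        System.out.println(channel.call("request-1", base.withCredentials("token-A")));
        System.out.println(channel.call("request-2", base.withCredentials("token-B")));
    }
}
```

Because each context is immutable, requests with different credentials can share one channel without interfering with each other.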

hkdevandla commented 4 years ago

@nwbirnie, I assume this issue is related to our recent discussions around retry options. I'm closing this issue for now since we have a path forward for the fix in both the short term and the long term (as part of micro-generator). Please re-open otherwise. Thanks!