googleapis / java-bigtable

Apache License 2.0

Bigtable API Client: 10 times more requests starting with release 2.10.0 resulting in 50% more cloud costs #1968

Closed julian-sotec closed 10 months ago

julian-sotec commented 11 months ago

Environment details

  1. Bigtable Client Library for Java (v 2.2.0 --> 2.10.0)
  2. OS type and version: Appengine 2nd Generation
  3. Java version: Java OpenJDK 11
  4. bigtable version(s): (v 2.2.0 --> 2.10.0)

Steps to reproduce

  1. Update the Bigtable Client Library from version 2.2.0 to 2.10.0 or higher.
  2. Deploy a code sample that uses the bulkMutateRows method of com.google.cloud.bigtable.data.v2.BigtableDataClient.
  3. Observe that Bigtable API requests, as reported under APIs & Services, increase by a factor of 10:
     - Version 2.2.0: 1 request per bulk mutate rows call
     - Version 2.10.0: ~10 requests per bulk mutate rows call

Code example

    BulkMutation bulkMutation = BulkMutation.create(tableId);
    for (int row = 0; row < data.getValues().size(); row++) {
        ...
        bulkMutation.add(rowKey, rowMutation);
    }
    dataClient.bulkMutateRows(bulkMutation);


Any additional information below

We use a BOM-based approach that keeps the Google Java client libraries in sync across all the services we use (Datastore, BigQuery, Cloud Storage, Bigtable, Cloud Tasks). The specific method we use to ingest data into Bigtable is bulkMutateRows.

After updating and deploying, we noticed that Bigtable API requests were 10 times higher than with the previously deployed version. We narrowed the problem down to version 2.10.0 of the Bigtable API client for Java: https://cloud.google.com/bigtable/docs/release-notes#August_01_2022. It seems to be related to PingAndWarm requests, which were introduced in that version and appear to be sent before each interaction with Bigtable. Impact: 300% latency increase on Bigtable operations --> 50% more App Engine instances spawned --> 40 - 50% more cloud costs.

We have now decided to downgrade to version 2.9.0, but this means we are no longer in sync with the other Google client libraries, since pinning the version is a manual maintenance step.

Because of this we want to understand:

Any insights on this topic are much appreciated.

igorbernstein2 commented 10 months ago

Hi,

PingAndWarm requests are sent once per channel during channel creation, which happens when the client is initialized and every 50 minutes thereafter. The purpose of PingAndWarm is to minimize the perceived latency of a cold connection. By sending an RPC on each channel, we ensure that the channel is established and that all of the caches along the way have been warmed.
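To make the schedule above concrete, here is a rough back-of-the-envelope sketch. It compares a client kept for the whole process lifetime with a client rebuilt for every bulk call. The class and method names are hypothetical, the channel count is a free parameter rather than a documented default, and each bulk call is assumed to be a single request, as reported for version 2.2.0.

```java
/** Hypothetical helper: a rough request-count estimate, not part of the Bigtable client. */
final class PingAndWarmEstimate {
    /** Minutes between channel refreshes, as described in the comment above. */
    static final long REFRESH_MINUTES = 50;

    /** Requests issued when one client lives for the whole observation window. */
    static long longLivedClient(long bulkCalls, long channels, long windowMinutes) {
        // One priming round at startup plus one per elapsed refresh interval.
        long primingRounds = 1 + windowMinutes / REFRESH_MINUTES;
        return bulkCalls + channels * primingRounds;
    }

    /** Requests issued when a new client is created for every bulk call. */
    static long clientPerCall(long bulkCalls, long channels) {
        // Every call pays for its own request plus one PingAndWarm per channel.
        return bulkCalls * (1 + channels);
    }
}
```

Under these assumptions the priming overhead of a long-lived client stays constant as the call volume grows, while a per-call client multiplies every call by the channel count.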

The lifecycle of the ChannelPool is tied to the client, which should in turn be tied to the life of your application process and should not be re-created per request. So to avoid the 10x blowup, please make sure you keep the client alive. 1 RPC per channel every 50 minutes should not have a meaningful impact on a service that's expected to handle 10k QPS per node.
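One way to keep the client alive is to create it lazily once and share it across requests. The holder below is a minimal, library-independent sketch: `SharedClient` is a hypothetical name, and in a real application the supplier would presumably wrap `BigtableDataClient.create(projectId, instanceId)`, with the checked `IOException` handled inside the supplier.

```java
import java.util.function.Supplier;

/** Hypothetical holder that creates one client per process and reuses it. */
final class SharedClient<T> {
    private final Supplier<T> factory;
    private volatile T instance;

    SharedClient(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Returns the shared instance, invoking the factory on first use only. */
    T get() {
        T local = instance;
        if (local == null) {
            synchronized (this) {
                local = instance;
                if (local == null) {
                    // Assumption: the factory would build the real client here,
                    // e.g. by calling BigtableDataClient.create(projectId, instanceId).
                    local = factory.get();
                    instance = local;
                }
            }
        }
        return local;
    }
}
```

Request handlers would then call `get()` on a single process-wide holder instead of constructing a client themselves, so channel priming happens once at startup rather than on every call. The shared client should be closed when the process shuts down.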

If there is something that I'm missing and there is a use case for short-lived clients or for disabling channel priming, please reopen the ticket.

Thanks