GoogleCloudPlatform / cloud-sql-jdbc-socket-factory

A collection of Java libraries for connecting securely to Cloud SQL
Apache License 2.0

Excessive dependencies of the java libraries are causing cascading failures when attempting to upgrade #1921

Open turneand opened 5 months ago

turneand commented 5 months ago

Bug Description

Not sure if this is the best place to raise this issue, as it's a general concern we keep facing with all of the Google-provided Java APIs. Specifically, we are trying to pick up the fix released in version 1.17.0 of this library for r2dbc health checks when using REMOTE validation, but are currently unable to.

Our web applications use dependency management frameworks (such as spring-boot or micronaut) that expose netty servers, and are complex applications in their own right that need version management and upgrades. We find ourselves having to manually tweak all of the Google client libraries every time an upgrade comes through, and hope there are no breaking changes. For the r2dbc issue, we have gax and netty incompatibilities with the versions provided by other Google libraries (pubsub, otel, etc.). We've got places where the guava version has been overridden from the "jre" variant to the "android" variant because cloud-sql-connector-r2dbc-postgres itself requires the android variant, while its own transitive dependencies require the jre variant (this causes compatibility issues with the Google otel libraries, which require the jre variant and crash at runtime if only the android variant is installed).

I understand the intention is that we should use the BOM, but this is incompatible with other management frameworks, as it overrides "standard" libraries (such as netty). We also cannot really use the uber-jars: if we did that for each library, our releases would end up being several hundred MB (even with a single Google client dependency, a trivial app is already around the 100 MB mark).

So, the ask is: is it possible to reduce the dependencies of these libraries, or to mark more parts as optional? For example, when deploying a simple web application on GKE using the r2dbc libraries, we hardly use any of them.

Example code (or command)

No response

Stacktrace

No response

Steps to reproduce?

Create a new project with just the "cloud-sql-connector-r2dbc-postgres" dependency added, and look at the dependency tree.
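A minimal reproduction, assuming Maven (the coordinates are the published ones; the version shown is illustrative):

```xml
<dependencies>
  <dependency>
    <groupId>com.google.cloud.sql</groupId>
    <artifactId>cloud-sql-connector-r2dbc-postgres</artifactId>
    <version>1.17.0</version>
  </dependency>
</dependencies>
```

Running `mvn dependency:tree` on such a project prints the full transitive graph being discussed here.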

Environment

all

Additional Details

No response

enocom commented 5 months ago

Thanks for the report, @turneand. Let me raise this with our Cloud Java team to see what we can do here.

enocom commented 5 months ago

@meltsufin Do you know of any related issues covering dependency sizes?

meltsufin commented 5 months ago

I think it would be helpful to get into the specifics. Yes, we hope that users can adopt the Libraries BOM, which ensures dependency compatibility. Which dependencies are preventing you from adopting the BOM?
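For reference, adopting the Libraries BOM in Maven looks like the following (the version shown is illustrative, not a recommendation):

```xml
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.google.cloud</groupId>
      <artifactId>libraries-bom</artifactId>
      <version>26.39.0</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>
```

With the BOM imported, individual Google Cloud dependencies are declared without versions and the BOM pins a mutually compatible set.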

turneand commented 5 months ago

So there's a few different examples here, but I'll try to cover them...

1) Neither com.google.cloud.sql#cloud-sql-connector-r2dbc-postgres nor com.google.cloud.sql#postgres-socket-factory is defined in the libraries-bom file, so we have to manually link up versions and find something compatible. I can't find any references indicating that the libraries-bom should be used when using these components directly, nor a compatibility matrix.

2) When using spring-boot and its spring-cloud-gcp-dependencies-5.1.0, the latter has a dependency management section exposing cloud-sql-socket-factory 1.16.0, which is used for the jdbc and r2dbc versions. If we specifically override just the cloud-sql-socket-factory version to pick up the one fix we need, this causes conflicts with the otherwise-tested transitive dependencies of other libraries (notably gax, see below).

3) When explicitly overriding cloud-sql-connector to the latest version, and if you have google-cloud-secretmanager (2.33.0 in our case) or google-cloud-pubsub (5.4.1 in our case), then cloud-sql-connector will explicitly define a dependency on a more recent version of gax, which breaks the API compatibility of the gax-grpc and gax-httpjson brought in by the pubsub/secretmanager libraries.

4) We use a mixture of java application frameworks in our teams, all of which contain some form of dependency management. For example, we have projects using micronaut which exposes its own dependency versions for its primary function as a netty server. However, when we start adding the google libraries then we get conflicts with those brought in by the framework (and when using maven it gets even trickier as to what "wins" when using a combination of parent-poms, and imported dependencies). As such, we are often having to completely forgo the usage of the google client libraries, and instead use the APIs directly. Note that micronaut does have a "gcp" dependency management section, but this also does not use the libraries-bom, so we have similar issues there if we start using multiple dependency management frameworks. This is primarily an issue when using tools like renovate/dependabot to upgrade one component at a time to reduce the risk, but that doesn't seem possible any more as we have to now upgrade numerous components in one go due to their coupling.

5) Unfortunately, like several other organisations I know of, we are constrained by final binary size (due to transfer/storage costs of delivered artifacts). This is not ideal, but it's also difficult to change the mindset. The simplest example I have: adding com.google.cloud#google-cloud-secretmanager#2.38.0 to an otherwise empty project brings in 42 MB of dependencies, including a shaded version of netty(?) for grpc and several other large binaries. The reason I raise this one is that we can easily replace it by using the REST API directly, and we are perfectly OK accepting any perceived performance hit from not using grpc, as the secrets are accessed infrequently. The cloudsql libraries have a similar or larger dependency hierarchy, but are proving more complicated to replace, specifically the IAM authentication part (connectivity/certificate management is all OK).
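As an illustration of the REST-only approach described above, here is a minimal sketch using just the JDK's java.net.http client. The class and method names are hypothetical; obtaining the bearer token (in practice via GoogleCredentials.getApplicationDefault() from google-auth-library, as in the data source example later in this thread) is left as a parameter so the snippet stays dependency-free:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SecretManagerRestSketch {

  // Builds the documented Secret Manager v1 REST URL for accessing a secret version.
  static String secretVersionUrl(String project, String secret, String version) {
    return String.format(
        "https://secretmanager.googleapis.com/v1/projects/%s/secrets/%s/versions/%s:access",
        project, secret, version);
  }

  // Fetches a secret's payload using only java.net.http and a caller-supplied
  // OAuth2 access token. The response JSON contains payload.data, base64-encoded.
  static String accessSecret(String project, String secret, String version, String accessToken)
      throws Exception {
    HttpRequest request = HttpRequest.newBuilder()
        .uri(URI.create(secretVersionUrl(project, secret, version)))
        .header("Authorization", "Bearer " + accessToken)
        .GET()
        .build();
    HttpResponse<String> response =
        HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    return response.body();
  }
}
```

This trades gRPC performance for a dependency footprint of zero extra jars beyond the auth library, which matches the infrequent-access pattern described above.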

meltsufin commented 5 months ago

@turneand Thanks for the explanation. It seems like the root issue is that the Cloud SQL connectors use GAX, but are not in the Libraries BOM. I believe you can just exclude GAX from Cloud SQL dependencies because it seems to be only there for GraalVM support.
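A sketch of that exclusion in Maven (the version is illustrative; verify nothing you use at runtime actually needs gax before shipping):

```xml
<dependency>
  <groupId>com.google.cloud.sql</groupId>
  <artifactId>cloud-sql-connector-r2dbc-postgres</artifactId>
  <version>1.17.0</version>
  <exclusions>
    <exclusion>
      <groupId>com.google.api</groupId>
      <artifactId>gax</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```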

cc: @suztomo @mpeddada1

ttosta-google commented 5 months ago

The GAX dependency scope has been changed to provided now (#1924).

turneand commented 4 months ago

So that is going to help on one of the issues, but the same underlying issue around dependencies still remains.

For example, we've now found some show-stopper bugs (for us) in the pubsub libraries, which means I think we are going to have to downgrade them. However, due to the complex dependencies between these cloudsql libraries and the pubsub libraries, we are going to have a problem finding something compatible. Even if we used the libraries-bom we'd have issues, as I still cannot find anything on the compatibility of these cloudsql drivers with the libraries-bom.

As far as I understand it, I think the only option we've really got now is to go back to using the cloud-sql-proxy? But even that seems overkill for a service running in GCE/GKE.

enocom commented 4 months ago

I agree -- it seems unnecessary to have to use the Proxy when the Java Connector would otherwise work just as well.

Is IAM authentication the primary motivation for using the Java Connector? I could show you how to do IAM authentication with a plain HikariCP data source if there's interest.

turneand commented 4 months ago

@enocom - we would definitely be interested in a more "native" option for IAM authentication for when we don't need the full capabilities of the proxy options. All examples I found regarding IAM explicitly stated to use these connectors, but a lighter option would be good.

enocom commented 4 months ago

Here's how you get IAM authentication with token refresh without the Connectors.

First, subclass HikariDataSource like so:

package dev.enocom.dbaccess;

import com.google.auth.oauth2.AccessToken;
import com.google.auth.oauth2.GoogleCredentials;
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;
import java.io.IOException;

public class CloudSqlAutoIamAuthnDataSource extends HikariDataSource {

  public CloudSqlAutoIamAuthnDataSource(HikariConfig configuration) {
    super(configuration);
  }

  @Override
  public String getPassword() {
    GoogleCredentials credentials;
    try {
      credentials = GoogleCredentials.getApplicationDefault();
    } catch (IOException err) {
      throw new RuntimeException(
          "Unable to obtain credentials to communicate with the Cloud SQL API", err);
    }

    // Scope the token so it can be used only for database logins.
    GoogleCredentials scoped = credentials.createScoped(
        "https://www.googleapis.com/auth/sqlservice.login");

    try {
      scoped.refresh();
    } catch (IOException e) {
      throw new RuntimeException(e);
    }
    AccessToken accessToken = scoped.getAccessToken();
    return accessToken.getTokenValue();
  }
}

Then use the data source like this:

package dev.enocom.dbaccess;

import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;
import javax.sql.DataSource;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Bean;

@SpringBootApplication
public class Application {

  public static void main(String[] args) {
    SpringApplication.run(Application.class, args);
  }

  @Bean
  DataSource getDataSource() {
    HikariConfig config = new HikariConfig();

    config.setJdbcUrl("jdbc:postgresql://10.0.0.2/postgres");
    config.setUsername("my-sa@my-project.iam");
    config.addDataSourceProperty("ssl", "true");
    // You can also enforce SSL on the server side (without needing client certs).
    config.addDataSourceProperty("sslmode", "require");

    return new CloudSqlAutoIamAuthnDataSource(config);
  }
}

turneand commented 4 months ago

Thanks @enocom, is there a recommended implementation for r2dbc? And is there any call for providing these as a lightweight library?

enocom commented 4 months ago

The R2DBC version would look like this:

import com.google.auth.oauth2.AccessToken;
import com.google.auth.oauth2.GoogleCredentials;
import io.r2dbc.pool.ConnectionPool;
import io.r2dbc.pool.ConnectionPoolConfiguration;
import io.r2dbc.spi.Connection;
import io.r2dbc.spi.ConnectionFactories;
import io.r2dbc.spi.ConnectionFactory;
import io.r2dbc.spi.ConnectionFactoryMetadata;
import io.r2dbc.spi.ConnectionFactoryOptions;
import java.io.IOException;
import org.reactivestreams.Publisher;
import reactor.core.publisher.Mono;

ConnectionFactoryOptions options = ConnectionFactoryOptions.parse("r2dbc:postgresql://host/database");

ConnectionFactory connectionFactoryStub = ConnectionFactories.get(options);

Mono<? extends Connection> connectionPublisher = Mono.defer(() -> {
    GoogleCredentials credentials;
    try {
      credentials = GoogleCredentials.getApplicationDefault();
    } catch (IOException err) {
      throw new RuntimeException(
          "Unable to obtain credentials to communicate with the Cloud SQL API", err);
    }

    // Scope the token so it can be used only for database logins.
    GoogleCredentials scoped = credentials.createScoped(
        "https://www.googleapis.com/auth/sqlservice.login");

    try {
      scoped.refresh();
    } catch (IOException e) {
      throw new RuntimeException(e);
    }
    AccessToken accessToken = scoped.getAccessToken();

    ConnectionFactoryOptions optionsToUse = options.mutate()
        // Provide a new password each time we see a connect request.
        .option(ConnectionFactoryOptions.PASSWORD, accessToken.getTokenValue())
        .build();

    return Mono.from(ConnectionFactories.get(optionsToUse).create());
});

ConnectionFactory myCustomConnectionFactory = new ConnectionFactory() {

    @Override
    public Publisher<? extends Connection> create() {
        return connectionPublisher;
    }

    @Override
    public ConnectionFactoryMetadata getMetadata() {
        return connectionFactoryStub.getMetadata();
    }
};

ConnectionPoolConfiguration poolConfiguration = ConnectionPoolConfiguration.builder()
    .connectionFactory(myCustomConnectionFactory)
    .build();
ConnectionPool pool = new ConnectionPool(poolConfiguration);

enocom commented 4 months ago

And as for providing these as a lightweight library, yes, we've been thinking about that but haven't made a decision.

cc @jackwotherspoon as FYI