goldmansachs / obevo

Obevo is a database deployment tool that handles enterprise scale schemas and complexity
Apache License 2.0
236 stars 56 forks source link

Potential of support for Redshift #134

Closed cgarlick closed 6 years ago

cgarlick commented 6 years ago

Hi Folks, We're really interested in trying this out, but against a Redshift database. We've been going through the setup but got stuck early on. The current implementation supports Postgresql, which would normally be fine against Redshift. But somewhere along the way it uses E' escape syntax when generating the sql it submits. Unfortunately that's not supported by Redshift.

So our question is either:

  1. Do you have any plans to support Redshift?
  2. Could you point us in the right direction of where the code generates the SQL for Postgres instances. We'd be happy to fork and see if we can work on a Redshift implementation for a future PR.

Thanks, Craig.

shantstepanian commented 6 years ago

Hi Craig,

1) We had no prior plans, but we are happy to start and to work with you on supporting Redshift.

To begin, can you please send:

If that alone doesn't help us in our diagnosis, we can look into setting up an environment ourselves to try this out. Besides the Getting Started doc, if you have any extra insights on setting up a basic cluster in your testing, please let us know.

2) In case you want to still look at the code:

Obevo code snippets (though I couldn't find any E' references in the code base)

We also leverage a library SchemaCrawler to access the DB metadata. Some of this relies on DB Metadata calls inside the JDBC driver; I wonder if your problematic SQL is coming from here. Here are code references:

Thanks, Shant

shantstepanian commented 6 years ago

Hi, I gave Redshift a try in my dev sandbox and:

  1. I think I found the issue
  2. And have a snapshot patch for you to try out.

Issue Diagnosis: Were you getting the following exception?

Caused by: org.postgresql.util.PSQLException: ERROR: type "e" does not exist

It looks like the reason is that Redshift isn't compatible with the PostgreSQL 9.x driver (see here). It suggests its own Redshift JDBC driver (preferred) or staying w/ the PostgreSQL 8.x driver.

Permanent Patch ETA

I will work on a permanent patch within the next couple weeks, but I’ll give you some snapshot binaries to try out in the meanwhile.

Snapshots to try out

Can you access my dropbox link?

There are two binaries:

The Redshift binary fixes the issue by taking in the Redshift jar and needing a couple minor changes in the postgres side (here, in case you are curious). You will also need to change your jdbc url from jdbc:postgres:// to jdbc:redshift:// I plan on going with this approach for the final fix, but I still need to productionize it (the commit I showed breaks the PostgreSQL support; I just committed it to give you an idea of the change needed).

The PostgreSQL binary fixes the issue by simply replacing the 9.x PostgreSQL driver with the 8.x driver; no Java code changes needed. I will not do this in the final approach as Amazon recommends moving to the Redshift driver, but I’ll send this over to you just in case.

Please give the binaries a try (preferably the Redshift binary) and let me know how it works for you.

cgarlick commented 6 years ago

Hi, thanks for the quick response. I can access the dropbox snapshots so will give them a try soon. To confirm your above question yes that was exception we had.

cgarlick commented 6 years ago

Hi, that seemed to get us a bit closer, Tried the redshift binary but hit a new error, I'm trying to run the deploy from kata lesson 1. here's the stack trace:

*** Exception stack trace ***
com.gs.obevo.db.impl.core.jdbc.DataAccessException: java.sql.SQLException: [Amazon](500310) Invalid operation: syntax error at or near "ROLE" 
Position: 8;
    at com.gs.obevo.db.impl.core.jdbc.JdbcHelper.updateInternal(JdbcHelper.java:107)
    at com.gs.obevo.db.impl.core.jdbc.JdbcHelper.update(JdbcHelper.java:68)
    at com.gs.obevo.db.impl.platforms.hsql.HsqlEnvironmentSetupInfra.setupEnvInfra(HsqlEnvironmentSetupInfra.java:72)
    at com.gs.obevo.db.impl.core.DbDeployerAppContextImpl.setupEnvInfra(DbDeployerAppContextImpl.java:405)
    at com.gs.obevo.db.impl.core.DbDeployerAppContextImpl.setupEnvInfra(DbDeployerAppContextImpl.java:71)
    at com.gs.obevo.db.cmdline.DbDeployerMain.start(DbDeployerMain.java:41)
    at com.gs.obevo.cmdline.AbstractMain.start(AbstractMain.java:124)
    at com.gs.obevo.dist.Main$3.value(Main.java:176)
    at com.gs.obevo.dist.Main$3.value(Main.java:173)
    at com.gs.obevo.dist.Main.execute(Main.java:115)
    at com.gs.obevo.dist.Main.execute(Main.java:83)
    at com.gs.obevo.dist.Main.main(Main.java:70)
Caused by: java.sql.SQLException: [Amazon](500310) Invalid operation: syntax error at or near "ROLE" 
Position: 8;
    at com.amazon.redshift.client.messages.inbound.ErrorResponse.toErrorException(Unknown Source)
    at com.amazon.redshift.client.PGMessagingContext.handleErrorResponse(Unknown Source)
    at com.amazon.redshift.client.PGMessagingContext.handleMessage(Unknown Source)
    at com.amazon.jdbc.communications.InboundMessagesPipeline.getNextMessageOfClass(Unknown Source)
    at com.amazon.redshift.client.PGMessagingContext.doMoveToNextClass(Unknown Source)
    at com.amazon.redshift.client.PGMessagingContext.getBindComplete(Unknown Source)
    at com.amazon.redshift.client.PGClient.handleErrorsScenario1(Unknown Source)
    at com.amazon.redshift.client.PGClient.handleErrors(Unknown Source)
    at com.amazon.redshift.client.PGClient.directExecuteExtraMetadata(Unknown Source)
    at com.amazon.redshift.dataengine.PGQueryExecutor.execute(Unknown Source)
    at com.amazon.jdbc.common.SStatement.executeNoParams(Unknown Source)
    at com.amazon.jdbc.common.SStatement.executeAnyUpdate(Unknown Source)
    at com.amazon.jdbc.common.SStatement.executeUpdate(Unknown Source)
    at org.apache.commons.dbcp.DelegatingStatement.executeUpdate(DelegatingStatement.java:228)
    at org.apache.commons.dbcp.DelegatingStatement.executeUpdate(DelegatingStatement.java:228)
    at com.gs.obevo.db.impl.core.jdbc.JdbcHelper.updateInternal(JdbcHelper.java:93)
    at com.gs.obevo.db.impl.core.jdbc.JdbcHelper.update(JdbcHelper.java:68)
    at com.gs.obevo.db.impl.platforms.hsql.HsqlEnvironmentSetupInfra.setupEnvInfra(HsqlEnvironmentSetupInfra.java:72)
    at com.gs.obevo.db.impl.core.DbDeployerAppContextImpl.setupEnvInfra(DbDeployerAppContextImpl.java:405)
    at com.gs.obevo.db.impl.core.DbDeployerAppContextImpl.setupEnvInfra(DbDeployerAppContextImpl.java:71)
    at com.gs.obevo.db.cmdline.DbDeployerMain.start(DbDeployerMain.java:41)
    at com.gs.obevo.cmdline.AbstractMain.start(AbstractMain.java:124)
    at com.gs.obevo.dist.Main$3.value(Main.java:176)
    at com.gs.obevo.dist.Main$3.value(Main.java:173)
    at com.gs.obevo.dist.Main.execute(Main.java:115)
    at com.gs.obevo.dist.Main.execute(Main.java:83)
Caused by: com.amazon.support.exceptions.ErrorException: [Amazon](500310) Invalid operation: syntax error at or near "ROLE"

I expect theres a sql statement being constructed somewhere that just isnt quite to Redshifts liking I'll try the postgres snapshot just incase thats any different

cgarlick commented 6 years ago

Sorry, you can completely ignore the previous comment. I had forgetten to set the config to Postgres, changed it and the kata code worked first time :smile: We'll try some more stuff with the snapshot later.

shantstepanian commented 6 years ago

No worries. As a side note, the error message should be clearer for you in case you had specified the dbType incorrectly in the config. I've created a ticket on our side (#137) to track this in the future

shantstepanian commented 6 years ago

I’ve merged the fix into the master branch. I’ve uploaded the -RC2 jar into the same dropbox link.

The main difference from the previous version is that you should specify the db type as REDSHIFT (instead of POSTGRESQL) and download the Amazon Redshift driver. We cannot include the JDBC driver into the Obevo package due to licensing reasons, so you will have to download and include it yourself. Instructions are here. (That link is not permanent; we will promote it to the main site once we do the release)

cgarlick commented 6 years ago

Brilliant, again thanks for the quick responses,we'll give this a proper try soon.

shantstepanian commented 6 years ago

Hi Craig, Fyi, I will cut a release sometime next week. Just want to see how your testing is going

shantstepanian commented 6 years ago

We have released 6.5.0 that has the Redshift changes.

Documentation link has been updated.

Redshift-specific docs are here.

I will close out this ticket, as we've committed the Redshift support; please reopen or create a new issue if you have any issues with usage.