goccy / bigquery-emulator

BigQuery emulator server implemented in Go
MIT License
799 stars 107 forks source link

Add compatibility with the BigQuery JDBC driver #7

Closed kaklakariada closed 2 years ago

kaklakariada commented 2 years ago

Hi @goccy,

We use the BigQuery JDBC driver (see documentation) in our project and would like to add integration tests using bigquery-emulator.

It's possible to specify a custom endpoint for the driver using the undocumented RootURL argument, e.g.:

jdbc:bigquery://http://localhost/:9050;RootURL=http://localhost:9050;ProjectId=test;OAuthType=2;OAuthAccessToken=<token>

During our tests we found the following issues caused by the JDBC driver's behavior. We would like to contribute to your project and fix these issues if that's ok for you.

Path prefix

The driver always uses path prefix /bigquery/v2, e.g. POST /bigquery/v2/projects/test/queries. This is not configurable. We propose to add a command line switch that allows overriding the default prefix "", e.g.:

--path-prefix="/bigquery/v2"

Content-Encoding gzip

The driver sends payload with Content-Encoding: gzip. This is not configurable. We propose to add a middleware function to server.go that unzips the request payload if necessary. This would require adding an additional dependency, e.g. github.com/klauspost/compress/gzip.

Job ID

When sending a SQL query, the JDBC driver sends request POST /bigquery/v2/projects/test/queries with the following payload:

{"dryRun":false,"maxResults":10000,"query":"select 2*5","timeoutMs":10000,"useLegacySql":false,"useQueryCache":true}

The emulator replies with this response:

{"jobComplete":true,"jobReference":{"projectId":"test"},"rows":[{"f":[{"v":"10"}]}],"schema":{"fields":[{"name":"$col1","type":"INTEGER"}]},"totalRows":"1"}

This causes the following exception in the JDBC driver which is caused by the missing jobId in jobReference.

java.sql.SQLException: [Simba][BigQueryJDBCDriver](100030) Error trying to obtain Google Bigquery object.
    at com.simba.googlebigquery.googlebigquery.dataengine.BQResultSet.<init>(Unknown Source)
    at com.simba.googlebigquery.googlebigquery.dataengine.BQSQLExecutor.execute(Unknown Source)
    at com.simba.googlebigquery.jdbc.common.SStatement.executeNoParams(Unknown Source)
    at com.simba.googlebigquery.jdbc.common.BaseStatement.executeQuery(Unknown Source)
    at com.exasol.adapter.dialects.bigquery.BigQueryJdbcTest.test(BigQueryJdbcTest.java:23)

The emulator uses r.queryRequest.RequestId as jobId (see handler.go) which is nil in this case.

We propose to generate a random ID in case r.queryRequest.RequestId is nil.

If these changes are OK for you, we would like create separate pull requests for each of these issues, or discuss how to solve them in a different way.

Thank you very much for your support!

goccy commented 2 years ago

Thank you for your request. First, path prefix will resolve with this fixes . https://github.com/goccy/bigquery-emulator/pull/8/files#diff-78f42ba40d0f10b08c73b7e6bb8376f398e249c963cf549e89591d6b6826b9a4R56

Second, Content-Encoding gzip will be supported next PR ! Thanks :)

kaklakariada commented 2 years ago

Thank you for your response! It's good to hear that you are already working on this. There are some more functions not implemented yet, e.g. listing tables. But I guess you are also working on this. So I will close this issue for now and try it again later :-)