airbytehq / airbyte

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
https://airbyte.com
Other
16.13k stars 4.12k forks source link

[source-mysql] deployment bug while setting up connection via mysql connector. Thanks #39332

Closed sluo2024 closed 4 months ago

sluo2024 commented 5 months ago

Connector Name

source-mysql

Connector Version

mysql 3.4.5 and 3.4.1

What step the error happened?

During the sync

Relevant information

we are setting up connection between mysql and snowflake. the airbyte server ip is 35.183.X.X and mysql source was setup properly. when we setup the connection and select the destination as snowflake. the following error message showed up during sync. Internal message: java.sql.SQLSyntaxErrorException: SELECT command denied to user ''@'10.24.1.50' for column 'name' in table 'all_trans' Failure origin: source Failure type: system_error first the username seems not tranfered and IP does not belong to any of our servers. so it is truely wield we recreated the source, destination and connection multiple times. and the error always shows up. other mysql DB sources we set up for comparison just worked fine. would you please help to advise your opinion? Thanks!

Here's some details for above from systems perspective: The airbyte server is located on AWS ec2 instance, the database in on prem. Airbyte accesses the server via NAT'ed port. I've manually connected using mysql client from airbyte instance and everything checks out - it lands on a proper DB server, the grants for the user allow it to SELECT against all tables. When we setup the source and click "test connection" I can see in 'show processlist' that airbyte is successfully connecting from airbyte server. When we setup snowflake as destination, it throws "SELECT command denied to user ''@'10.24.1.50' " which makes absolutely no sense. Why is the user empty ('') and why is the source host '10.24.1.50' when it should be 35.183.X.X (and I can literally see airbyte user connecting from that IP just before we get this error)

We were on 0.59.0 originally and patching to 0.61.0 didn't help

Relevant log output

Discovering schema failed
Something went wrong in the connector. See the logs for more details.

Learn more
Internal message: java.sql.SQLSyntaxErrorException: SELECT command denied to user ''@'10.24.1.50' for column 'name' in table 'all_trans'
Failure origin: source
Failure type: system_error

Stacktrace
    at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:122)
    at com.mysql.cj.jdbc.StatementImpl.executeQuery(StatementImpl.java:1200)
    at com.mysql.cj.jdbc.DatabaseMetaData$2.forEach(DatabaseMetaData.java:2167)
    at com.mysql.cj.jdbc.DatabaseMetaData$2.forEach(DatabaseMetaData.java:2096)
    at com.mysql.cj.jdbc.IterateBlock.doForAll(IterateBlock.java:56)
    at com.mysql.cj.jdbc.DatabaseMetaData.getColumns(DatabaseMetaData.java:2273)
    at com.zaxxer.hikari.pool.ProxyDatabaseMetaData.getColumns(ProxyDatabaseMetaData.java:107)
    at com.zaxxer.hikari.pool.HikariProxyDatabaseMetaData.getColumns(HikariProxyDatabaseMetaData.java)
    at io.airbyte.cdk.integrations.source.jdbc.AbstractJdbcSource.discoverInternal$lambda$5(AbstractJdbcSource.kt:275)
    at io.airbyte.cdk.db.jdbc.DefaultJdbcDatabase.bufferedResultSetQuery(DefaultJdbcDatabase.kt:39)
    at io.airbyte.cdk.integrations.source.jdbc.AbstractJdbcSource.discoverInternal(AbstractJdbcSource.kt:273)
    at io.airbyte.cdk.integrations.source.jdbc.AbstractJdbcSource.discoverInternal(AbstractJdbcSource.kt:405)
    at io.airbyte.cdk.integrations.source.jdbc.AbstractJdbcSource.discoverInternal(AbstractJdbcSource.kt:86)
    at io.airbyte.cdk.integrations.source.relationaldb.AbstractDbSource.discoverWithoutSystemTables(AbstractDbSource.kt:313)
    at io.airbyte.cdk.integrations.source.relationaldb.AbstractDbSource.discover(AbstractDbSource.kt:95)
    at io.airbyte.integrations.source.mysql.MySqlSource.discover(MySqlSource.java:305)
    at io.airbyte.cdk.integrations.base.ssh.SshWrappedSource.discover$lambda$1(SshWrappedSource.kt:57)
    at io.airbyte.cdk.integrations.base.ssh.SshTunnel$Companion.sshWrap(SshTunnel.kt:550)
    at io.airbyte.cdk.integrations.base.ssh.SshWrappedSource.discover(SshWrappedSource.kt:57)
    at io.airbyte.cdk.integrations.base.IntegrationRunner.runInternal(IntegrationRunner.kt:169)
    at io.airbyte.cdk.integrations.base.IntegrationRunner.run(IntegrationRunner.kt:116)
    at io.airbyte.integrations.source.mysql.MySqlSource.main(MySqlSource.java:648)

Logs

2024-05-27 20:45:11 platform > Docker volume job log path: /tmp/workspace/767795ad-9d9f-4328-b3c1-80bcb30f9808/0/logs.log
2024-05-27 20:45:11 platform > Executing worker wrapper. Airbyte version: 0.61.0
2024-05-27 20:45:11 platform > Attempt 0 to save workflow id for cancellation
2024-05-27 20:45:11 platform > Using default value for environment variable SIDECAR_KUBE_CPU_LIMIT: '2.0'
2024-05-27 20:45:11 platform > Using default value for environment variable SOCAT_KUBE_CPU_LIMIT: '2.0'
2024-05-27 20:45:11 platform > Using default value for environment variable SIDECAR_KUBE_CPU_REQUEST: '0.1'
2024-05-27 20:45:11 platform > Using default value for environment variable SOCAT_KUBE_CPU_REQUEST: '0.1'
2024-05-27 20:45:11 platform > Checking if airbyte/source-mysql:3.4.1 exists...
2024-05-27 20:45:11 platform > airbyte/source-mysql:3.4.1 was found locally.
2024-05-27 20:45:11 platform > Creating docker container = source-mysql-discover-767795ad-9d9f-4328-b3c1-80bcb30f9808-0-yajvb with resources io.airbyte.config.ResourceRequirements@7dde1f35[cpuRequest=,cpuLimit=,memoryRequest=,memoryLimit=,additionalProperties={}] and allowedHosts null
2024-05-27 20:45:11 platform > Preparing command: docker run --rm --init -i -w /data/767795ad-9d9f-4328-b3c1-80bcb30f9808/0 --log-driver none --name source-mysql-discover-767795ad-9d9f-4328-b3c1-80bcb30f9808-0-yajvb --network host -v airbyte_workspace:/data -v oss_local_root:/local -e DEPLOYMENT_MODE=OSS -e WORKER_CONNECTOR_IMAGE=airbyte/source-mysql:3.4.1 -e AUTO_DETECT_SCHEMA=true -e LAUNCHDARKLY_KEY= -e SOCAT_KUBE_CPU_REQUEST=0.1 -e SOCAT_KUBE_CPU_LIMIT=2.0 -e FIELD_SELECTION_WORKSPACES= -e USE_STREAM_CAPABLE_STATE=true -e WORKER_ENVIRONMENT=DOCKER -e AIRBYTE_ROLE=dev -e APPLY_FIELD_SELECTION=false -e WORKER_JOB_ATTEMPT=0 -e OTEL_COLLECTOR_ENDPOINT=http://host.docker.internal:4317 -e FEATURE_FLAG_CLIENT=config -e AIRBYTE_VERSION=0.61.0 -e WORKER_JOB_ID=767795ad-9d9f-4328-b3c1-80bcb30f9808 airbyte/source-mysql:3.4.1 discover --config source_config.json
2024-05-27 20:45:11 platform > Reading messages from protocol version 0.2.0
2024-05-27 20:45:12 platform > INFO main i.a.i.s.m.MySqlSource(main):647 starting source: class io.airbyte.integrations.source.mysql.MySqlSource
2024-05-27 20:45:12 platform > INFO main i.a.c.i.b.IntegrationCliParser$Companion(parseOptions):145 integration args: {discover=null, config=source_config.json}
2024-05-27 20:45:12 platform > INFO main i.a.c.i.b.IntegrationRunner(runInternal):124 Running integration: io.airbyte.cdk.integrations.base.ssh.SshWrappedSource
2024-05-27 20:45:12 platform > INFO main i.a.c.i.b.IntegrationRunner(runInternal):125 Command: DISCOVER
2024-05-27 20:45:12 platform > INFO main i.a.c.i.b.IntegrationRunner(runInternal):126 Integration config: IntegrationConfig{command=DISCOVER, configPath='source_config.json', catalogPath='null', statePath='null'}
2024-05-27 20:45:13 platform > WARN main c.n.s.JsonMetaSchema(newValidator):278 Unknown keyword order - you should define your own Meta Schema. If the keyword is irrelevant for validation, just use a NonValidationKeyword
2024-05-27 20:45:13 platform > WARN main c.n.s.JsonMetaSchema(newValidator):278 Unknown keyword airbyte_secret - you should define your own Meta Schema. If the keyword is irrelevant for validation, just use a NonValidationKeyword
2024-05-27 20:45:13 platform > WARN main c.n.s.JsonMetaSchema(newValidator):278 Unknown keyword always_show - you should define your own Meta Schema. If the keyword is irrelevant for validation, just use a NonValidationKeyword
2024-05-27 20:45:13 platform > WARN main c.n.s.JsonMetaSchema(newValidator):278 Unknown keyword display_type - you should define your own Meta Schema. If the keyword is irrelevant for validation, just use a NonValidationKeyword
2024-05-27 20:45:13 platform > WARN main c.n.s.JsonMetaSchema(newValidator):278 Unknown keyword min - you should define your own Meta Schema. If the keyword is irrelevant for validation, just use a NonValidationKeyword

Contribute

marcosmarxm commented 5 months ago

I added this to the connector team backlog. What version of MySQL database are you using?

kashak88 commented 5 months ago

@marcosmarxm , I'm the sys admin dealing with this issue. All our servers are on

percona-server-server-5.7             5.7.44-48-1.focal
xiaohansong commented 5 months ago

thanks for detailed info you provided. I have a few questions and clarifications:

  1. IP of airbyte-server does not indicate airbyte initialize the query from that IP address. Query was initialized from a dynamically spawned docker container (or k8s pod) and that IP address is the initialization.
  2. It sounds like airbyte-server fails to read back user name/secrets when intialize "discover" job. Can you make sure airbyte-server and the entire stack (docker or k8s) has read and write access to the secret vault you choose to use?
kashak88 commented 5 months ago

Hello, 1) I realize that the query is initialized by a random docker container that spawns. To the database server, the IP would always be the same Public IP however. Regardless, on the DB, we can see the user connecting from airbyte IP and we know for a fact that the user has SELECT for basically whole server. 2) I am not aware of any secrets engine in use as the installation was basically: curl -o install script and run it. We have adjusted a single variable related to logging I believe and changed the default admin password.

i've inspected logs from all the pods and don't see anything of interest.

sluo2024 commented 5 months ago

@xiaohansong I am the original reporter, we encounter another similar issue for another connection (mysql to snowflake) the error msg as below: retested the source and the setup work. no idea where this username came from and why that speciifc IP. thanks

image

kashak88 commented 5 months ago

To add to above, it's one of our users. But we are using a different user, why is trying to do this with web_P2M?

sluo2024 commented 4 months ago

a third connection had the similar error

error message: Internal message: java.sql.SQLSyntaxErrorException: SELECT command denied to user ''@'10.34.1.106' for column 'pin' in table 'P2M_trans'

log:

2024-06-18 02:33:52 platform > Docker volume job log path: /tmp/workspace/a1bed750-27ab-4278-8036-d38425ca95c1/0/logs.log 2024-06-18 02:33:52 platform > Executing worker wrapper. Airbyte version: 0.61.0 2024-06-18 02:33:52 platform > Attempt 0 to save workflow id for cancellation 2024-06-18 02:33:52 platform > Using default value for environment variable SIDECAR_KUBE_CPU_LIMIT: '2.0' 2024-06-18 02:33:52 platform > Using default value for environment variable SOCAT_KUBE_CPU_LIMIT: '2.0' 2024-06-18 02:33:52 platform > Using default value for environment variable SIDECAR_KUBE_CPU_REQUEST: '0.1' 2024-06-18 02:33:52 platform > Using default value for environment variable SOCAT_KUBE_CPU_REQUEST: '0.1' 2024-06-18 02:33:52 platform > Checking if airbyte/source-mysql:3.4.5 exists... 2024-06-18 02:33:52 platform > airbyte/source-mysql:3.4.5 was found locally. 2024-06-18 02:33:52 platform > Creating docker container = source-mysql-discover-a1bed750-27ab-4278-8036-d38425ca95c1-0-qwfsf with resources io.airbyte.config.ResourceRequirements@4471f87a[cpuRequest=,cpuLimit=,memoryRequest=,memoryLimit=,additionalProperties={}] and allowedHosts null 2024-06-18 02:33:52 platform > Preparing command: docker run --rm --init -i -w /data/a1bed750-27ab-4278-8036-d38425ca95c1/0 --log-driver none --name source-mysql-discover-a1bed750-27ab-4278-8036-d38425ca95c1-0-qwfsf --network host -v airbyte_workspace:/data -v oss_local_root:/local -e DEPLOYMENT_MODE=OSS -e WORKER_CONNECTOR_IMAGE=airbyte/source-mysql:3.4.5 -e AUTO_DETECT_SCHEMA=true -e LAUNCHDARKLY_KEY= -e SOCAT_KUBE_CPU_REQUEST=0.1 -e SOCAT_KUBE_CPU_LIMIT=2.0 -e FIELD_SELECTION_WORKSPACES= -e USE_STREAM_CAPABLE_STATE=true -e WORKER_ENVIRONMENT=DOCKER -e AIRBYTE_ROLE=dev -e APPLY_FIELD_SELECTION=false -e WORKER_JOB_ATTEMPT=0 -e OTEL_COLLECTOR_ENDPOINT=http://host.docker.internal:4317 -e FEATURE_FLAG_CLIENT=config -e AIRBYTE_VERSION=0.61.0 -e WORKER_JOB_ID=a1bed750-27ab-4278-8036-d38425ca95c1 airbyte/source-mysql:3.4.5 discover --config source_config.json 2024-06-18 02:33:52 platform > Reading messages from protocol version 0.2.0 2024-06-18 02:33:54 platform > INFO main i.a.i.s.m.MySqlSource(main):638 starting source: class io.airbyte.integrations.source.mysql.MySqlSource 2024-06-18 02:33:54 platform > INFO main i.a.c.i.b.IntegrationCliParser$Companion(parseOptions):144 integration args: {discover=null, config=source_config.json} 2024-06-18 02:33:54 platform > INFO main i.a.c.i.b.IntegrationRunner(runInternal):124 Running integration: io.airbyte.cdk.integrations.base.ssh.SshWrappedSource 2024-06-18 02:33:54 platform > INFO main i.a.c.i.b.IntegrationRunner(runInternal):125 Command: DISCOVER 2024-06-18 02:33:54 platform > INFO main i.a.c.i.b.IntegrationRunner(runInternal):126 Integration config: IntegrationConfig{command=DISCOVER, configPath='source_config.json', catalogPath='null', statePath='null'} 2024-06-18 02:33:54 platform > WARN main c.n.s.JsonMetaSchema(newValidator):278 Unknown keyword order - you should define your own Meta Schema. If the keyword is irrelevant for validation, just use a NonValidationKeyword 2024-06-18 02:33:54 platform > WARN main c.n.s.JsonMetaSchema(newValidator):278 Unknown keyword airbyte_secret - you should define your own Meta Schema. If the keyword is irrelevant for validation, just use a NonValidationKeyword 2024-06-18 02:33:54 platform > WARN main c.n.s.JsonMetaSchema(newValidator):278 Unknown keyword always_show - you should define your own Meta Schema. If the keyword is irrelevant for validation, just use a NonValidationKeyword 2024-06-18 02:33:54 platform > WARN main c.n.s.JsonMetaSchema(newValidator):278 Unknown keyword display_type - you should define your own Meta Schema. If the keyword is irrelevant for validation, just use a NonValidationKeyword 2024-06-18 02:33:54 platform > WARN main c.n.s.JsonMetaSchema(newValidator):278 Unknown keyword min - you should define your own Meta Schema. If the keyword is irrelevant for validation, just use a NonValidationKeyword

sluo2024 commented 4 months ago

4th connection erro:

Internal message: java.sql.SQLSyntaxErrorException: SELECT command denied to user ''@'10.34.1.106' for column 'pin' in table 'P2M_trans' Failure origin: source Failure type: system_error

sluo2024 commented 4 months ago

new type of error for similar connection, can you help to clarify what we can change? @xiaohansong Internal message: java.sql.SQLException: Illegal mix of collations (utf8mb4_unicode_ci,IMPLICIT) and (utf8mb4_general_ci,IMPLICIT) for operation '=' Failure origin: source Failure type: system_error

error log:

at com.mysql.cj.jdbc.DatabaseMetaData$2.forEach(DatabaseMetaData.java:2167)
at com.mysql.cj.jdbc.DatabaseMetaData$2.forEach(DatabaseMetaData.java:2096)
at com.mysql.cj.jdbc.IterateBlock.doForAll(IterateBlock.java:56)
at com.mysql.cj.jdbc.DatabaseMetaData.getColumns(DatabaseMetaData.java:2273)
at com.zaxxer.hikari.pool.ProxyDatabaseMetaData.getColumns(ProxyDatabaseMetaData.java:107)
at com.zaxxer.hikari.pool.HikariProxyDatabaseMetaData.getColumns(HikariProxyDatabaseMetaData.java)
at io.airbyte.cdk.integrations.source.jdbc.AbstractJdbcSource.discoverInternal$lambda$5(AbstractJdbcSource.kt:279)
at io.airbyte.cdk.db.jdbc.DefaultJdbcDatabase.bufferedResultSetQuery(DefaultJdbcDatabase.kt:39)
at io.airbyte.cdk.integrations.source.jdbc.AbstractJdbcSource.discoverInternal(AbstractJdbcSource.kt:277)
at io.airbyte.cdk.integrations.source.jdbc.AbstractJdbcSource.discoverInternal(AbstractJdbcSource.kt:391)
at io.airbyte.cdk.integrations.source.jdbc.AbstractJdbcSource.discoverInternal(AbstractJdbcSource.kt:87)
at io.airbyte.cdk.integrations.source.relationaldb.AbstractDbSource.discoverWithoutSystemTables(AbstractDbSource.kt:302)
at io.airbyte.cdk.integrations.source.relationaldb.AbstractDbSource.discover(AbstractDbSource.kt:93)
at io.airbyte.integrations.source.mysql.MySqlSource.discover(MySqlSource.java:295)
at io.airbyte.cdk.integrations.base.ssh.SshWrappedSource.discover$lambda$1(SshWrappedSource.kt:58)
at io.airbyte.cdk.integrations.base.ssh.SshTunnel$Companion.sshWrap(SshTunnel.kt:545)
at io.airbyte.cdk.integrations.base.ssh.SshWrappedSource.discover(SshWrappedSource.kt:58)
at io.airbyte.cdk.integrations.base.IntegrationRunner.runInternal(IntegrationRunner.kt:169)
at io.airbyte.cdk.integrations.base.IntegrationRunner.run(IntegrationRunner.kt:116)
at io.airbyte.integrations.source.mysql.MySqlSource.main(MySqlSource.java:639)

2024-06-18 02:47:04 platform > Discover job subprocess finished with exit codee 1

sluo2024 commented 4 months ago

old issue 6th case:

Internal message: java.sql.SQLSyntaxErrorException: SELECT command denied to user ''@'10.34.1.102' for column 'pin' in table 'P2M_trans' Failure origin: source Failure type: system_error

xiaohansong commented 4 months ago

Some tips to debug:

  1. Can you connect to mysql and try run SHOW GRANTS for <airbyte_user>?
  2. Were you following https://docs.airbyte.com/integrations/sources/mysql#step-1-create-a-dedicated-read-only-mysql-user when creating such as user?

Also are these errors transient or it's consistent preventing you from establishing a connection?

sluo2024 commented 4 months ago

Hi Xiaohan,

The command output is :

"GRANT REPLICATION SLAVE, REPLICATION CLIENT ON . TO 'airbyte'@'%'" "GRANT SELECT, CREATE TEMPORARY TABLES ON manager.* TO 'airbyte'@'%'"

This is consistent error happening during the process to fetch the table metadata. and it happens a lot of times for 5 out of our 35+ connections. Thanks

kashak88 commented 4 months ago

This can be closed. It was a bad definer on a couple of VIEWs that was throwing a warning

theyueli commented 4 months ago

Closing this issue as it has been resolved (user does not have sufficient permission on views)