snowflakedb / snowflake-connector-python

Snowflake Connector for Python
https://pypi.python.org/pypi/snowflake-connector-python/
Apache License 2.0

SNOW-1410719: AuthByKeyPair.prepare() takes 1 positional argument but 3 were given #1945

Closed by mmyers5 1 month ago

mmyers5 commented 1 month ago

Python version

3.10.7

Operating system and processor architecture

Ubuntu 22.04.4

Installed packages

asn1crypto==1.5.1
attrs==23.2.0
boto3==1.26.44
botocore==1.29.165
certifi==2024.2.2
cffi==1.16.0
charset-normalizer==2.1.1
cryptography==40.0.2
filelock==3.14.0
hypothesis==6.98.17
idna==3.7
iniconfig==2.0.0
jmespath==1.0.1
oscrypto==1.3.0
packaging==24.0
pluggy==1.5.0
pycparser==2.22
pycryptodomex==3.20.0
PyJWT==2.8.0
pyOpenSSL==23.2.0
pytest==8.0.2
python-dateutil==2.9.0.post0
pytz==2024.1
requests==2.31.0
s3transfer==0.6.2
six==1.16.0
snowflake-connector-python==3.0.0
sortedcontainers==2.4.0
typing_extensions==4.11.0
urllib3==1.26.18

What did you do?

I originally created this issue in the dbt-snowflake repo. The maintainers there recommended I open it here instead.

We are using Docker Compose in GitHub Actions to generate dbt docs and publish them to an external storage location. When the CI/CD job runs docker compose run dbt docs generate --target nondev, it intermittently fails with the error: AuthByKeyPair.prepare() takes 1 positional argument but 3 were given. Sometimes the job succeeds, and I haven't been able to trigger the failure locally.

It is worth noting that a CI/CD job running dbt --warn-error debug --target nondev always succeeds.
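For context on the error itself: this TypeError is what Python raises when a method whose parameters are keyword-only is called with positional arguments, which typically means the call site and the method definition come from two different versions of a package. Below is a minimal sketch of that mechanism; the parameter names (conn, authenticator) are illustrative assumptions, not the connector's real signature.

    # Sketch of the failure mechanism, not the connector's actual code:
    # a newer class makes its parameters keyword-only, while an older
    # call site still passes them positionally.

    class AuthByKeyPair:
        def prepare(self, *, conn=None, authenticator=None, **kwargs):
            # Newer-style signature: everything after self is keyword-only.
            return "prepared"

    auth = AuthByKeyPair()
    auth.prepare(conn=None, authenticator="snowflake_jwt")  # fine

    # An older call site passing the same values positionally raises the
    # exact error from the logs:
    try:
        auth.prepare(None, "snowflake_jwt")
    except TypeError as exc:
        print(exc)
        # AuthByKeyPair.prepare() takes 1 positional argument but 3 were given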

  1. Here are the environment variables that exist:
    • SNOWFLAKE_ACCOUNT
    • SNOWFLAKE_USER
    • SNOWFLAKE_AUTHENTICATOR=snowflake_jwt
    • SNOWFLAKE_RSA_KEY_SECRET
    • SNOWFLAKE_RSA_KEY_PASSWORD
    • SNOWFLAKE_ROLE
    • SNOWFLAKE_DATABASE
    • SNOWFLAKE_WAREHOUSE
    • SNOWFLAKE_TARGET_SCHEMA
  2. This is the profile that we are using (a standalone connection check using the same variables is sketched after this list):
    nondev:
      type: snowflake
      account: "{{ env_var('SNOWFLAKE_ACCOUNT') }}"
      user: "{{ env_var('SNOWFLAKE_USER') }}"
      authenticator: "{{ env_var('SNOWFLAKE_AUTHENTICATOR', 'snowflake') }}"
      password: "{{ env_var('SNOWFLAKE_PASSWORD', None) }}"
      private_key: "{{ env_var('SNOWFLAKE_RSA_KEY_SECRET', '') }}"
      private_key_passphrase: "{{ env_var('SNOWFLAKE_RSA_KEY_PASSWORD', '') }}"
      role: "{{ env_var('SNOWFLAKE_ROLE') }}"
      database: "{{ env_var('SNOWFLAKE_DATABASE') }}"
      warehouse: "{{ env_var('SNOWFLAKE_WAREHOUSE') }}"
      schema: "{{ env_var('SNOWFLAKE_TARGET_SCHEMA', 'dev') }}"
      threads: 8 # The number of models that can run concurrently in production
      client_session_keep_alive: False
      # optional
      connect_retries: 0 # default: 0
      connect_timeout: 10 # default: 10
      retry_on_database_errors: False # default: false
      retry_all: False # default: false
  3. Here is the Docker image that we are using: ghcr.io/dbt-labs/dbt-snowflake:1.7.3. Our entrypoint prefixes any command passed into the container with dbt, so docker compose run dbt docs runs dbt docs inside the container (the first dbt is the Compose service name).
  4. Upstream job that is being run in CI/CD successfully:
    • docker compose build dbt
    • docker compose run dbt --warn-error parse --target nondev
    • docker compose run dbt --warn-error debug --target nondev
  5. Steps of the job that is failing:
    • docker compose build dbt
    • docker compose run dbt docs generate --target nondev
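To rule out the key material itself, key-pair auth can be exercised directly with snowflake-connector-python, outside dbt, using the same environment variables. This is a sketch under the assumption that SNOWFLAKE_RSA_KEY_SECRET holds the PEM-encoded key text (which is what the profile's private_key field suggests); it follows the connector's documented pattern of passing the key as DER bytes.

    import os

    import snowflake.connector
    from cryptography.hazmat.primitives import serialization

    # Assumes SNOWFLAKE_RSA_KEY_SECRET contains the encrypted key's PEM text.
    # If the env var stores escaped newlines, normalize them first.
    p_key = serialization.load_pem_private_key(
        os.environ["SNOWFLAKE_RSA_KEY_SECRET"].encode(),
        password=os.environ["SNOWFLAKE_RSA_KEY_PASSWORD"].encode(),
    )

    # The connector expects the key as unencrypted DER (PKCS#8) bytes.
    pkb = p_key.private_bytes(
        encoding=serialization.Encoding.DER,
        format=serialization.PrivateFormat.PKCS8,
        encryption_algorithm=serialization.NoEncryption(),
    )

    conn = snowflake.connector.connect(
        account=os.environ["SNOWFLAKE_ACCOUNT"],
        user=os.environ["SNOWFLAKE_USER"],
        private_key=pkb,
        role=os.environ["SNOWFLAKE_ROLE"],
        database=os.environ["SNOWFLAKE_DATABASE"],
        warehouse=os.environ["SNOWFLAKE_WAREHOUSE"],
    )
    print(conn.cursor().execute("select current_version()").fetchone())
    conn.close()

If this succeeds consistently while dbt docs generate flakes, the key pair and account settings are fine and the problem lies in the environment running dbt.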

Relevant log output

On a failed run:

18:41:20  Running with dbt=1.7.11
18:41:21  Registered adapter: snowflake=1.7.3
18:41:21  Unable to do partial parsing because saved manifest not found. Starting full parse.
18:41:32  Found 420 models, 2 analyses, 154 tests, 19 seeds, 351 sources, 33 exposures, 0 metrics, 1035 macros, 0 groups, 0 semantic models
18:41:32  
18:41:43  Encountered an error:
Database Error
  AuthByKeyPair.prepare() takes 1 positional argument but 3 were given

On a successful run:

18:15:02  Running with dbt=1.7.11
18:15:03  Registered adapter: snowflake=1.7.3
18:15:03  Unable to do partial parsing because saved manifest not found. Starting full parse.
18:15:14  Found 420 models, 2 analyses, 154 tests, 19 seeds, 351 sources, 33 exposures, 0 metrics, 1035 macros, 0 groups, 0 semantic models
18:15:14  
18:16:15  Concurrency: 8 threads (target='nondev')
18:16:15  
18:16:39  Building catalog
18:16:55  Catalog written to /usr/app/dbt/target/catalog.json

What did you expect to see?

No flaky job results

Can you set logging to DEBUG and collect the logs?

Added the --debug flag to the job:

17:39:18  On list_my_db_dsg: /* {"app": "dbt", "dbt_version": "1.7.11", "profile_name": "gc_snowflake", "target_name": "nondev", "connection_name": "list_my_db_dsg"} */
show terse objects in my_db.dsg limit 10000
17:39:18  Acquiring new snowflake connection 'list_my_db_partnerships'
17:39:18  Opening a new connection, currently in state init
17:39:18  Using snowflake connection "list_my_db_partnerships"
17:39:18  Using snowflake connection "list_my_db_metadata"
17:39:18  On list_my_db_metadata: /* {"app": "dbt", "dbt_version": "1.7.11", "profile_name": "gc_snowflake", "target_name": "nondev", "connection_name": "list_my_db_metadata"} */
show terse objects in my_db.metadata limit 10000
17:39:18  Acquiring new snowflake connection 'list_my_db_external'
17:39:18  Opening a new connection, currently in state init
17:39:18  Using snowflake connection "list_my_db_external"
17:39:18  On list_my_db_partnerships: /* {"app": "dbt", "dbt_version": "1.7.11", "profile_name": "gc_snowflake", "target_name": "nondev", "connection_name": "list_my_db_partnerships"} */
show terse objects in my_db.partnerships limit 10000
17:39:18  On list_my_db_external: /* {"app": "dbt", "dbt_version": "1.7.11", "profile_name": "gc_snowflake", "target_name": "nondev", "connection_name": "list_my_db_external"} */
show terse objects in my_db.external limit 10000
17:39:18  Opening a new connection, currently in state init
17:39:18  Opening a new connection, currently in state init
17:39:18  Acquiring new snowflake connection 'list_my_db_ltv'
17:39:18  Acquiring new snowflake connection 'list_my_db_gc'
17:39:19  Using snowflake connection "list_my_db_ltv"
17:39:19  Using snowflake connection "list_my_db_gc"
17:39:19  On list_my_db_ltv: /* {"app": "dbt", "dbt_version": "1.7.11", "profile_name": "gc_snowflake", "target_name": "nondev", "connection_name": "list_my_db_ltv"} */
show terse objects in my_db.ltv limit 10000
17:39:19  Opening a new connection, currently in state init
17:39:19  On list_my_db_gc: /* {"app": "dbt", "dbt_version": "1.7.11", "profile_name": "gc_snowflake", "target_name": "nondev", "connection_name": "list_my_db_gc"} */
show terse objects in my_db.gc limit 10000
17:39:19  Opening a new connection, currently in state init
17:39:19  Snowflake adapter: Error running SQL: /* {"app": "dbt", "dbt_version": "1.7.11", "profile_name": "gc_snowflake", "target_name": "nondev", "connection_name": "list_my_db_metadata"} */
show terse objects in my_db.metadata limit 10000
17:39:19  Snowflake adapter: Rolling back transaction.
17:39:19  Snowflake adapter: Error running SQL: macro list_relations_without_caching
17:39:19  Snowflake adapter: Rolling back transaction.
17:39:19  On list_my_db_metadata: No close available on handle
17:39:19  Re-using an available connection from the pool (formerly list_my_db_metadata, now list_my_db_base)
17:39:19  Using snowflake connection "list_my_db_base"
17:39:19  On list_my_db_base: /* {"app": "dbt", "dbt_version": "1.7.11", "profile_name": "gc_snowflake", "target_name": "nondev", "connection_name": "list_my_db_base"} */
show terse objects in my_db.base limit 10000
17:39:19  Opening a new connection, currently in state closed
17:39:19  SQL status: SUCCESS 19 in 1.0 seconds
17:39:19  On list_my_db_external: Close
17:39:19  SQL status: SUCCESS 30 in 1.0 seconds
17:39:19  On list_my_db_partnerships: Close
17:39:19  SQL status: SUCCESS 71 in 1.0 seconds
17:39:19  On list_my_db_intermediate: Close
17:39:19  SQL status: SUCCESS 7 in 1.0 seconds
17:39:19  On list_my_db_ltv: Close
17:39:19  SQL status: SUCCESS 153 in 1.0 seconds
17:39:19  On list_my_db_raw_events: Close
17:39:19  Re-using an available connection from the pool (formerly list_my_db_external, now list_my_db_tm)
17:39:19  Using snowflake connection "list_my_db_tm"
17:39:19  On list_my_db_tm: /* {"app": "dbt", "dbt_version": "1.7.11", "profile_name": "gc_snowflake", "target_name": "nondev", "connection_name": "list_my_db_tm"} */
show terse objects in my_db.tm limit 10000
17:39:19  Opening a new connection, currently in state closed
17:39:20  SQL status: SUCCESS 106 in 1.0 seconds
17:39:20  On list_my_db_gc: Close
17:39:20  SQL status: SUCCESS 129 in 0.0 seconds
17:39:20  On list_my_db_base: Close
17:39:20  SQL status: SUCCESS 44 in 0.0 seconds
17:39:20  On list_my_db_tm: Close
17:39:20  SQL status: SUCCESS 5 in 2.0 seconds
17:39:20  On list_my_db_dsg: Close
17:39:20  Connection 'master' was properly closed.
17:39:20  Connection 'list_my_db_raw_events' was properly closed.
17:39:20  Connection 'list_my_db_intermediate' was properly closed.
17:39:20  Connection 'list_my_db_dsg' was properly closed.
17:39:20  Connection 'list_my_db_partnerships' was properly closed.
17:39:20  Connection 'list_my_db_base' was properly closed.
17:39:20  Connection 'list_my_db_tm' was properly closed.
17:39:20  Connection 'list_my_db_ltv' was properly closed.
17:39:20  Connection 'list_my_db_gc' was properly closed.
17:39:20  Encountered an error:
Database Error
  AuthByKeyPair.prepare() takes 1 positional argument but 3 were given
17:39:20  Resource report: {"command_name": "generate", "command_wall_clock_time": 13.463965, "process_user_time": 14.95133, "process_kernel_time": 0.287372, "process_mem_max_rss": "196324", "process_out_blocks": "13528", "command_success": false, "process_in_blocks": "0"}
17:39:20  Command `dbt docs generate` failed at 17:39:20.917144 after 13.46 seconds
17:39:20  Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'end', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x7ff218d9b340>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x7ff210b64d00>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x7ff2109a2830>]}
17:39:20  Flushing usage events
mmyers5 commented 1 month ago

We found the issue! In the Docker image that containerized the dbt CLI, we were installing our own version of snowflake-connector-python. I'm not sure how that led to flaky behavior within the same container, but 🤷‍♀️ resolving it seems to have fixed things. Thanks, y'all!
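For anyone hitting the same symptom: one way to spot this kind of conflict is to print, inside the container, which distribution is installed and where the imported package actually lives. A quick diagnostic sketch, nothing dbt-specific:

    # Run inside the container to see which connector is actually imported.
    from importlib.metadata import version

    import snowflake.connector

    print("installed dist:", version("snowflake-connector-python"))
    print("imported from: ", snowflake.connector.__file__)

If the reported version doesn't match what dbt-snowflake 1.7.3 pins, or the import path points somewhere unexpected, two installs are likely fighting over the same package.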