aws-samples / dbt-glue

This repository contains the dbt-glue adapter
Apache License 2.0
101 stars, 69 forks

dbt seed failing when loading big seed files #446

Closed jausanca closed 1 month ago

jausanca commented 1 month ago

Describe the bug

`dbt seed` fails when trying to load a large seed file.

Steps To Reproduce

Try to load a seed file with more than about 500 rows (the exact threshold could easily be higher or lower, depending on the number of columns). The failure appears once the file's contents come to around 68000 characters when serialized to a JSON array.
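To check whether a given seed file is likely to hit the limit, its serialized size can be estimated locally. This is a minimal sketch: the helper name `serialized_length` and the sample columns are hypothetical, and it assumes the rows are serialized with Python's default `json.dumps` settings, which may not match exactly what the adapter does.

```python
import csv
import io
import json

# Limit reported in the error message for the 'code' member.
MAX_CODE_LENGTH = 68000


def serialized_length(csv_text: str) -> int:
    """Return the character count of the CSV rows serialized as a JSON array.

    Assumption: rows are read as dicts and dumped with default json settings.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return len(json.dumps(rows))


def make_sample_csv(n_rows: int) -> str:
    """Build an in-memory CSV with three hypothetical columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["id", "name", "description"])
    for i in range(n_rows):
        writer.writerow([i, f"name_{i}", "d" * 100])
    return buffer.getvalue()


small = serialized_length(make_sample_csv(10))
large = serialized_length(make_sample_csv(600))
print(small <= MAX_CODE_LENGTH, large <= MAX_CODE_LENGTH)
```

With these sample columns, 600 rows already exceed the limit, while 10 rows stay well below it.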

Expected behavior

The content of the seed file should be loaded into a Glue table.

Screenshots and log output

Value '
...
<long statement code>
...
' at 'code' failed to satisfy constraint: Member must have length less than or equal to 68000

System information

The output of dbt --version:

Core:
  - installed: 1.8.6
  - latest:    1.8.6 - Up to date!

Plugins:
  - spark: 1.8.0 - Up to date!
  - glue:  1.8.1 - Up to date!

The output of python --version:

Python 3.10.12

Additional context

The error occurs when uploading large seed files because the adapter tries to send the whole content of the CSV in a single statement, which exceeds the maximum number of characters allowed by the boto3 run_statement method. It could be solved by appending the CSV data in chunks across multiple statement executions.
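The chunking idea could be sketched roughly as follows. This is not the adapter's actual implementation: `chunk_rows` and the `overhead` parameter (room reserved for whatever code surrounds the JSON payload in each statement) are hypothetical names, and the sketch assumes default `json.dumps` separators. Each yielded chunk would then be sent in its own statement, appending to the target table.

```python
import json
from typing import Any, Dict, Iterator, List

# Limit reported in the error message for the 'code' member.
MAX_CODE_LENGTH = 68000

Row = Dict[str, Any]


def chunk_rows(rows: List[Row], max_chars: int = MAX_CODE_LENGTH,
               overhead: int = 0) -> Iterator[List[Row]]:
    """Yield consecutive groups of rows whose JSON array fits the budget.

    `overhead` (hypothetical) reserves characters for the statement code
    wrapped around the JSON payload.
    """
    budget = max_chars - overhead
    chunk: List[Row] = []
    size = 2  # the enclosing "[" and "]"
    for row in rows:
        row_len = len(json.dumps(row))
        # json.dumps joins list items with ", " (2 characters) by default.
        added = row_len + (2 if chunk else 0)
        if chunk and size + added > budget:
            yield chunk
            chunk, size, added = [], 2, row_len
        if size + added > budget:
            raise ValueError("a single row exceeds the statement budget")
        chunk.append(row)
        size += added
    if chunk:
        yield chunk


rows = [{"id": i, "name": "x" * 50} for i in range(2000)]
chunks = list(chunk_rows(rows, overhead=2000))
print(len(chunks), max(len(json.dumps(c)) for c in chunks))
```

Measuring the serialized length instead of using a fixed row count keeps each statement under the limit regardless of how wide the rows are, at the cost of serializing each row once for sizing.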