MeltanoLabs / tap-bigquery


Support configuring a destination for large result sets #18

Open edgarrmondragon opened 8 months ago

edgarrmondragon commented 8 months ago

BigQuery queries have a limited response size^1, so syncs may fail when a query produces a large result set.

The python-bigquery-sqlalchemy^2 library supports passing a destination query parameter, so the fix probably involves adding a new setting (e.g. destination_table) and passing it to the SQLAlchemy URL construction in

https://github.com/MeltanoLabs/tap-bigquery/blob/0e37a0fe3e28d85628e60d8e8928e76eff862e1b/tap_bigquery/connector.py#L45-L47

The string in question is a fully qualified table, e.g. different-project.different-dataset.table.
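For illustration, here is a minimal sketch of how the URL construction might pick up such a setting. The destination_table, project_id, and dataset config keys and the build_sqlalchemy_url helper are assumptions for the example; the destination query parameter is the one described in the python-bigquery-sqlalchemy docs:

```python
from sqlalchemy.engine import URL


def build_sqlalchemy_url(config: dict) -> str:
    """Sketch: build a BigQuery SQLAlchemy URL that routes query results
    to a configured destination table when one is provided."""
    query = {}
    if config.get("destination_table"):
        # Fully qualified table, e.g. "different-project.different-dataset.table"
        query["destination"] = config["destination_table"]

    return str(
        URL.create(
            drivername="bigquery",
            host=config["project_id"],       # billing project
            database=config.get("dataset"),  # optional default dataset
            query=query,
        )
    )
```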

AlejandroUPC commented 4 months ago

Hello @edgarrmondragon, I will happily work on this. I am just a bit lost and do not really understand the issue: how would a destination solve this? Also tagging @pnadolny13 to understand why it was decided to go with SQLAlchemy instead of the standard google.cloud BigQuery client.

edgarrmondragon commented 4 months ago

> Hello @edgarrmondragon, I will happily work on this. I am just a bit lost and do not really understand the issue.

@AlejandroUPC You might wanna read through https://cloud.google.com/bigquery/docs/writing-results#large-results.

Have you run into this issue?
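For context, this is roughly the mechanism that page describes when using the standard client: specifying a destination table means results are written to that table instead of being returned directly, which is what lifts the response-size limit. A minimal sketch (the table names are hypothetical):

```python
from google.cloud import bigquery

client = bigquery.Client()

# Hypothetical fully qualified destination table.
destination = "different-project.different-dataset.table"

# Results are written to the destination table rather than returned
# directly, so the maximum response size limit no longer applies.
job_config = bigquery.QueryJobConfig(destination=destination)
job = client.query(
    "SELECT * FROM `some-project.some-dataset.large_table`",
    job_config=job_config,
)
job.result()  # wait for the query to finish
```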

ReubenFrankel commented 4 weeks ago

Would #24 cover this?