Closed cbini closed 3 months ago
Thanks, that's interesting. It should detect that the schema already exists (logic here). Can you share the replication config you're using?
Here you go! I should add that I'm running these via Dagster, if that makes any difference. All that does is call sling run via the Python package.
source: ORACLE_POWERSCHOOL
target: BIGQUERY_TEAMSTER

defaults:
  mode: full-refresh
  object: "staging_{source_name}.{stream_table}"

streams:
  students:
    select:
      - -custom
Ah, I see: stream students doesn't have a schema. So you're probably defining that in your connection creds, correct?
Will test that on my side.
Yeah, that's right. I have a default schema set on the ORACLE_POWERSCHOOL connection. Also, BIGQUERY_TEAMSTER is required to have a dataset, but I'm overriding that in defaults.object.
I'm not able to reproduce. Can you try running the replication with the CLI (not Dagster)?
sling run -r /path/to/replication.yaml -d
Actually, never mind. Just reproduced.
It's the staging_{source_name} part in the object.
Will fix, thanks!
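For readers following along, here is a minimal sketch of how the defaults.object template in this config could expand into a dataset and table name. It is not Sling's actual code: render_object is a hypothetical helper, and lowercasing the connection name is an assumption inferred from the dataset named in the reported error (staging_oracle_powerschool).

```python
def render_object(template: str, source_name: str, stream_table: str) -> tuple[str, str]:
    """Expand the placeholders, then split the result into (schema, table).

    Assumption: placeholder values are lowercased before substitution.
    """
    rendered = template.format(
        source_name=source_name.lower(),
        stream_table=stream_table.lower(),
    )
    # Everything before the first dot is treated as the target schema (dataset).
    schema, _, table = rendered.partition(".")
    return schema, table


schema, table = render_object(
    "staging_{source_name}.{stream_table}", "ORACLE_POWERSCHOOL", "students"
)
print(schema, table)
```

The point of the sketch is only that the target schema is not known until the template is rendered, which is the part of the object the maintainer singled out.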
Trying Sling out for the first time, and I'm noticing that the first time I try to replicate a table, it attempts to create the schema twice. This results in BigQuery throwing this error:
googleapi: Error 409: Already Exists: Dataset teamster-332318:staging_oracle_powerschool, duplicate
However, it executes successfully on subsequent runs. It looks like it's trying to create the schema twice on the initial run.
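To illustrate the behavior described, here is a small self-contained sketch of why a second create attempt fails and how an idempotent "ensure" step would tolerate it. FakeWarehouse, AlreadyExists, and ensure_schema are all hypothetical stand-ins; this is not Sling's implementation or its eventual fix.

```python
class AlreadyExists(Exception):
    """Stand-in for a warehouse's 409 'Already Exists' error."""


class FakeWarehouse:
    """In-memory stand-in for a target database that rejects duplicate schemas."""

    def __init__(self):
        self.schemas = set()

    def create_schema(self, name):
        if name in self.schemas:
            raise AlreadyExists(name)
        self.schemas.add(name)


def ensure_schema(warehouse, name):
    """Create the schema if missing; return True only if this call created it."""
    try:
        warehouse.create_schema(name)
        return True
    except AlreadyExists:
        # A duplicate create is treated as success rather than a failure.
        return False


wh = FakeWarehouse()
first = ensure_schema(wh, "staging_oracle_powerschool")   # creates the schema
second = ensure_schema(wh, "staging_oracle_powerschool")  # duplicate, tolerated
```

With a plain create_schema call in place of ensure_schema, the second call would raise, which matches the error seen only on the initial run.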
Here are the relevant logs: