z3z1ma / target-bigquery

target-bigquery is a Singer target for BigQuery. It supports storage write, GCS, streaming, and batch load methods. Built with the Meltano SDK.
MIT License

overwrite=true causes errors #87

Closed: srpwnd closed this issue 3 months ago

srpwnd commented 4 months ago

When I set the overwrite param to true in my setup, I get an AttributeError: 'str' object has no attribute 'get' error. The full traceback log is here:

2024-05-27T12:55:19.468681Z [info     ] Traceback (most recent call last): cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2024-05-27T12:55:19.468832Z [info     ]   File "/project/project/.meltano/loaders/target-bigquery/venv/bin/target-bigquery", line 8, in <module> cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2024-05-27T12:55:19.468892Z [info     ]     sys.exit(TargetBigQuery.cli()) cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2024-05-27T12:55:19.468945Z [info     ]   File "/project/project/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/click/core.py", line 1157, in __call__ cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2024-05-27T12:55:19.469042Z [info     ]     return self.main(*args, **kwargs) cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2024-05-27T12:55:19.469108Z [info     ]   File "/project/project/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/click/core.py", line 1078, in main cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2024-05-27T12:55:19.469161Z [info     ]     rv = self.invoke(ctx)      cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2024-05-27T12:55:19.469215Z [info     ]   File "/project/project/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/click/core.py", line 1434, in invoke cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2024-05-27T12:55:19.469270Z [info     ]     return ctx.invoke(self.callback, **ctx.params) cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2024-05-27T12:55:19.469322Z [info     ]   File "/project/project/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/click/core.py", line 783, in invoke cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2024-05-27T12:55:19.469373Z [info     ]     return __callback(*args, **kwargs) cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2024-05-27T12:55:19.469424Z [info     ]   File "/project/project/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/singer_sdk/target_base.py", line 572, in cli cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2024-05-27T12:55:19.469497Z [info     ]     target = cls(  # type: ignore[operator] cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2024-05-27T12:55:19.469552Z [info     ]   File "/project/project/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/target_bigquery/target.py", line 321, in __init__ cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2024-05-27T12:55:19.469615Z [info     ]     super().__init__(*args, **kwargs) cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2024-05-27T12:55:19.469665Z [info     ]   File "/project/project/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/singer_sdk/target_base.py", line 71, in __init__ cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2024-05-27T12:55:19.469709Z [info     ]     super().__init__(          cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2024-05-27T12:55:19.469764Z [info     ]   File "/project/project/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/singer_sdk/plugin_base.py", line 113, in __init__ cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2024-05-27T12:55:19.469813Z [info     ]     self._validate_config(raise_errors=validate_config) cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2024-05-27T12:55:19.469861Z [info     ]   File "/project/project/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/singer_sdk/plugin_base.py", line 242, in _validate_config cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2024-05-27T12:55:19.469914Z [info     ]     errors = [e.message for e in validator.iter_errors(self._config)] cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2024-05-27T12:55:19.469961Z [info     ]   File "/project/project/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/singer_sdk/plugin_base.py", line 242, in <listcomp> cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2024-05-27T12:55:19.470114Z [info     ]     errors = [e.message for e in validator.iter_errors(self._config)] cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2024-05-27T12:55:19.470232Z [info     ]   File "/project/project/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/jsonschema/validators.py", line 384, in iter_errors cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2024-05-27T12:55:19.470296Z [info     ]     for error in errors:       cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2024-05-27T12:55:19.470351Z [info     ]   File "/project/project/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/singer_sdk/typing.py", line 149, in set_defaults cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2024-05-27T12:55:19.470408Z [info     ]     yield from validate_properties( cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2024-05-27T12:55:19.470467Z [info     ]   File "/project/project/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/jsonschema/_keywords.py", line 296, in properties cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2024-05-27T12:55:19.470511Z [info     ]     yield from validator.descend( cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2024-05-27T12:55:19.470565Z [info     ]   File "/project/project/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/jsonschema/validators.py", line 432, in descend cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2024-05-27T12:55:19.470618Z [info     ]     for error in errors:       cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2024-05-27T12:55:19.470754Z [info     ]   File "/project/project/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/jsonschema/_keywords.py", line 340, in anyOf cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2024-05-27T12:55:19.470859Z [info     ]     errs = list(validator.descend(instance, subschema, schema_path=index)) cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2024-05-27T12:55:19.470919Z [info     ]   File "/project/project/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/jsonschema/validators.py", line 421, in descend cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2024-05-27T12:55:19.470976Z [info     ]     resolver = self._resolver.in_subresource( cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2024-05-27T12:55:19.471047Z [info     ]   File "/project/project/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/referencing/_core.py", line 694, in in_subresource cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2024-05-27T12:55:19.471100Z [info     ]     id = subresource.id()      cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2024-05-27T12:55:19.471153Z [info     ]   File "/project/project/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/referencing/_core.py", line 226, in id cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2024-05-27T12:55:19.471205Z [info     ]     id = self._specification.id_of(self.contents) cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2024-05-27T12:55:19.471259Z [info     ]   File "/project/project/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/referencing/jsonschema.py", line 56, in _legacy_dollar_id cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2024-05-27T12:55:19.471311Z [info     ]     id = contents.get("$id")   cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2024-05-27T12:55:19.471361Z [info     ] AttributeError: 'str' object has no attribute 'get' cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery

My meltano.yml contains this:

    - name: target-bigquery
      variant: z3z1ma
      config:
        add_metadata_columns: true

with the rest of the config being passed as environment variables:

TARGET_BIGQUERY_FLATTENING_ENABLED=true
TARGET_BIGQUERY_FLATTENING_MAX_DEPTH=15
TARGET_BIGQUERY_DATASET=prod
TARGET_BIGQUERY_PROJECT=<redacted>
TARGET_BIGQUERY_OVERWRITE=true

The error seems to bubble up from the Meltano SDK's schema validation, which might be related to how the overwrite parameter is defined with th.CustomType, but I couldn't find enough information in the docs to properly understand it. I noticed the upsert parameter is implemented the same way; when I set it to true instead of overwrite, I get the same error.

Is there something I'm missing in my configuration or environment that could cause this issue?
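
For readers following the traceback, here is a minimal sketch of the failure mode in its last frames. It does not use jsonschema or the SDK. The functions `legacy_dollar_id` and `ids_of_any_of` and both example schemas are hypothetical stand-ins, and the bare-string branch is an assumption about what could trigger the error, not the target's actual schema. The traceback only suggests that a str, rather than a dict, reached the `$id` lookup while the validator descended into an anyOf branch.

```python
def legacy_dollar_id(contents):
    """Simplified stand-in for the final traceback frame.

    Like that frame, it assumes `contents` is a dict-like subschema.
    """
    return contents.get("$id")


def ids_of_any_of(schema):
    """Look up the `$id` of every `anyOf` branch, recording failures as text."""
    results = []
    for branch in schema.get("anyOf", []):
        try:
            results.append(legacy_dollar_id(branch))
        except AttributeError as exc:
            results.append(f"AttributeError: {exc}")
    return results


# Every branch is a dict, so the lookup succeeds (no `$id` present).
well_formed = {"anyOf": [{"type": "boolean"}, {"type": "string"}]}

# Hypothetical malformed schema: the first branch is a bare string.
malformed = {"anyOf": ["boolean", {"type": "string"}]}

if __name__ == "__main__":
    print(ids_of_any_of(well_formed))
    print(ids_of_any_of(malformed))
```

If this reading is right, the error depends on the shape of the schema rather than on the configured value, which would be consistent with upsert failing in the same way.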

z3z1ma commented 3 months ago

I disabled the Meltano jsonschema validation in main; this should be bundled in a release. It should still log warnings, but it won't make the target unusable anymore. I'm not sure why the above issue was happening, but I think the validation is unnecessary: any incorrect config will throw sooner rather than later, and we don't subject ourselves to bugs in jsonschema validation outside our control.
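
The "warn instead of fail" behaviour described above can be sketched in plain Python. This is an illustration of the general pattern only, not the change made in main: `check_config`, the logger name, and the required-key check are all hypothetical, and the `raise_errors` flag merely borrows its name from the `_validate_config` call visible in the traceback.

```python
import logging

logger = logging.getLogger("target-config-sketch")  # hypothetical logger name


def check_config(config, required_keys, raise_errors=False):
    """Collect config problems; warn about them unless raise_errors is set."""
    problems = [
        f"missing required setting: {key}"
        for key in required_keys
        if key not in config
    ]
    if problems and raise_errors:
        # Strict mode: a bad config stops the run immediately.
        raise ValueError("; ".join(problems))
    for problem in problems:
        # Lenient mode: report the problem and let the run continue.
        logger.warning(problem)
    return problems


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    check_config({"project": "my-project"}, ["project", "dataset"])
```

In lenient mode a genuinely broken config still fails later, when the missing value is first used, which matches the reasoning that incorrect config "will throw sooner rather than later".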