This ended up being a bit more complicated than I wanted, but since all the record types were already defined in the dataclass, I didn't want to hard-code them a second time just for a BigQuery schema. Instead, this uses the `generate_schema` function to derive the table schema automatically from the `Record` classes.
Was it worth doing this way? Probably not. But it works well, so I think it's worth keeping. This code can be re-used for future tables as well.
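For reviewers who want a feel for the approach, here is a minimal sketch of deriving a BigQuery-style schema from dataclass annotations. It is not the code in this PR: `generate_schema_sketch`, `ExampleRecord`, and the type mapping are hypothetical stand-ins, and the real `generate_schema` may handle more types or use a different output format.

```python
import dataclasses
import datetime
import typing

# Assumed mapping from Python annotations to BigQuery column types.
_BQ_TYPES = {
    str: "STRING",
    int: "INTEGER",
    float: "FLOAT",
    bool: "BOOLEAN",
    datetime.datetime: "TIMESTAMP",
    datetime.date: "DATE",
}


def generate_schema_sketch(record_cls):
    """Return a list of {"name", "type", "mode"} dicts for a dataclass."""
    hints = typing.get_type_hints(record_cls)
    schema = []
    for field in dataclasses.fields(record_cls):
        annotation = hints[field.name]
        mode = "REQUIRED"
        # typing.Optional[X] is Union[X, None]; treat it as a NULLABLE X column.
        if typing.get_origin(annotation) is typing.Union:
            args = [a for a in typing.get_args(annotation) if a is not type(None)]
            if len(args) != 1:
                raise TypeError(f"unsupported union on field {field.name!r}")
            annotation, mode = args[0], "NULLABLE"
        if annotation not in _BQ_TYPES:
            raise TypeError(f"no BigQuery type for field {field.name!r}")
        schema.append(
            {"name": field.name, "type": _BQ_TYPES[annotation], "mode": mode}
        )
    return schema


@dataclasses.dataclass
class ExampleRecord:  # hypothetical record, not one of the PR's Record classes
    id: int
    name: str
    created_at: datetime.datetime
    score: typing.Optional[float] = None


schema = generate_schema_sketch(ExampleRecord)
```

The sketch returns plain dicts so it runs without the BigQuery client installed; with `google-cloud-bigquery` available, each dict could presumably be turned into a schema field via `bigquery.SchemaField.from_api_repr`.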
Checklist for reviewer:
[ ] Commits should reference a bug or github issue, if relevant (if a bug is
referenced, the pull request should include the bug number in the title)
[ ] Scan the PR and verify that no changes (particularly to
.circleci/config.yml) will cause environment variables (particularly
credentials) to be exposed in test logs
[ ] Ensure the container image will be using permissions granted to
telemetry-airflow responsibly.
…emas if they don't exist