dbt-labs / dbt-external-tables

dbt macros to stage external sources
https://hub.getdbt.com/dbt-labs/dbt_external_tables/latest/
Apache License 2.0

Allowing Prefixes in the partitions section #175

Closed prachishah48 closed 1 year ago

prachishah48 commented 1 year ago

Describe the feature

We currently store our files in Google Cloud Storage, under paths that carry multiple dates and file names (gs://myBucket/myTable/dt=YYYY-MM-DD/*.txt). We would like to partition our files by _file_name (ideally by the date inside the file name). Our GCS bucket uses the default Hive-partitioned layout, so the file path contains a dt key that the package can pick up, but dt does not always match the date inside the file name. For example, a file named name_type_20220701.txt can sit under dt=2021-09-02 because we are moving tables around for easier use. When I try:

partitions:
  - name: _file_name
    data_type: string

I get this error:

Encountered an error while running operation: Database Error Invalid field name "_file_name". Field names are not allowed to start with the (case-insensitive) prefixes _PARTITION, _TABLE_, _FILE_, _ROW_TIMESTAMP, __ROOT__ and _COLIDENTIFIER
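To make the error concrete, here is a minimal sketch of the check BigQuery appears to apply, written in Python for illustration. The prefix list is copied from the error message, assuming the underscore-delimited forms (`_TABLE_`, `_FILE_`, `__ROOT__`). The helper name `is_reserved_field_name` is hypothetical and is not part of BigQuery or dbt-external-tables.

```python
# Reserved prefixes as listed in the BigQuery error message.
# Assumption: the underscore-delimited spellings are the intended ones.
RESERVED_PREFIXES = (
    "_PARTITION",
    "_TABLE_",
    "_FILE_",
    "_ROW_TIMESTAMP",
    "__ROOT__",
    "_COLIDENTIFIER",
)


def is_reserved_field_name(name: str) -> bool:
    """Return True if `name` starts with a reserved prefix (case-insensitive)."""
    upper = name.upper()
    return any(upper.startswith(prefix) for prefix in RESERVED_PREFIXES)


# The partition name from this issue collides with the _FILE_ prefix,
# while the Hive-style key does not.
rejected = is_reserved_field_name("_file_name")
accepted = not is_reserved_field_name("dt")
```

Under this reading, `_file_name` is rejected because it collides with the `_FILE_` prefix. BigQuery already exposes a `_FILE_NAME` pseudo-column on external tables, so a user-defined partition column cannot reuse that name.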

Describe alternatives you've considered

We can ask our team to match the dt to the file name, which would be ideal, so that we can take in the dt.
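As an illustration of the mismatch described above, the following Python sketch pulls both dates out of a single object path and compares them. It assumes the file name ends in `_YYYYMMDD.txt` and that the path contains a `dt=YYYY-MM-DD` segment, as in the example in this issue. The function names are hypothetical and nothing here is part of the package.

```python
import re
from datetime import date, datetime
from typing import Optional

# Assumed patterns, based on the example path in this issue.
FILE_DATE_RE = re.compile(r"_(\d{8})\.txt$")
DT_PARTITION_RE = re.compile(r"/dt=(\d{4}-\d{2}-\d{2})/")


def file_date(path: str) -> Optional[date]:
    """Date embedded in the file name, or None if the pattern is absent."""
    match = FILE_DATE_RE.search(path)
    if match is None:
        return None
    # Raises ValueError if the eight digits are not a valid calendar date.
    return datetime.strptime(match.group(1), "%Y%m%d").date()


def dt_partition(path: str) -> Optional[date]:
    """Date from the Hive-style dt= path segment, or None if absent."""
    match = DT_PARTITION_RE.search(path)
    if match is None:
        return None
    return date.fromisoformat(match.group(1))


path = "gs://myBucket/myTable/dt=2021-09-02/name_type_20220701.txt"
mismatch = file_date(path) != dt_partition(path)
```

For the example path, the file name carries 2022-07-01 while the partition key carries 2021-09-02, so `mismatch` is true. A check like this could be used to flag files whose dt would need to be realigned.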

Additional context

This is a connection to BigQuery. Please let me know if this is a BigQuery issue or something you can solve for.

Who will this benefit?

Anyone with multiple external tables with dates in the filename

github-actions[bot] commented 1 year ago

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

github-actions[bot] commented 1 year ago

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.