dbt-labs / dbt-athena

The athena adapter plugin for dbt (https://getdbt.com)
https://dbt-athena.github.io
Apache License 2.0
225 stars 97 forks source link

Dash in schema name #162

Closed chrischin478 closed 1 year ago

chrischin478 commented 1 year ago

When a schema name has a dash in it, e.g. my-schema, the following error is thrown when running a model:

An error occurred (InvalidRequestException) when calling the StartQueryExecution operation: line 9:8: mismatched input '-'. Expecting: 'OR', 'SCHEMA', 'TABLE', 'VIEW'

When creating a table in the Athena UI the schema must be surrounded in back ticks to avoid this error, e.g. CREATE TABLE `my-schema`.table_name. I tried setting my profiles.yml like so:

schema: "`my-schema`"

but I then received a permission error:

[ErrorCategory:USER_ERROR, ErrorCode:PERMISSION_ERROR], Detail:Amazon Athena experienced a permission error. Please provide proper permission and submitting the query again...

Is there any way around this or this just isn't supported at all?

chrischin478 commented 1 year ago

This should be resolved by #152. I will confirm once the next release is deployed.

nicor88 commented 1 year ago

Not sure exactly when we will release, alternatively you can try out to install the main branch.

chrischin478 commented 1 year ago

Do you have a rough idea? Like less or more than a month? I'll try main until then.

On Mon, Feb 27, 2023, 4:01 PM nicor88 @.***> wrote:

Not sure exactly when we will release, alternatively you can try out to install the main branch.

— Reply to this email directly, view it on GitHub https://github.com/dbt-athena/dbt-athena/issues/162#issuecomment-1447082556, or unsubscribe https://github.com/notifications/unsubscribe-auth/ALIDJQVIWDGL4YCHWLA4FTLWZUI3RANCNFSM6AAAAAAVHPXRWM . You are receiving this because you authored the thread.Message ID: @.***>

chrischin478 commented 1 year ago

I installed dbt from main and when executing dbt --version I see 1.4.3, yet I am still not able to create models with a schema name containing dashes. The same errors are encountered. I tried defining the profiles.yml file in three ways as follows:

schema: my-schema schema: "my-schema" schema: "`my-schema`"

Detailed output is:

23:38:45  Running with dbt=1.4.3
23:38:45  Unable to do partial parsing because profile has changed
23:38:47  Found 2 models, 4 tests, 0 snapshots, 0 analyses, 304 macros, 0 operations, 0 seed files, 0 sources, 0 exposures, 0 metrics
23:38:47  
Failed to execute query.
Traceback (most recent call last):
  File "/home/phelanc/python/dbt-venv/lib/python3.8/site-packages/pyathena/common.py", line 520, in _execute
    query_id = retry_api_call(
  File "/home/phelanc/python/dbt-venv/lib/python3.8/site-packages/pyathena/util.py", line 68, in retry_api_call
    return retry(func, *args, **kwargs)
  File "/home/phelanc/python/dbt-venv/lib/python3.8/site-packages/tenacity/__init__.py", line 379, in __call__
    do = self.iter(retry_state=retry_state)
  File "/home/phelanc/python/dbt-venv/lib/python3.8/site-packages/tenacity/__init__.py", line 314, in iter
    return fut.result()
  File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 437, in result
    return self.__get_result()
  File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
    raise self._exception
  File "/home/phelanc/python/dbt-venv/lib/python3.8/site-packages/tenacity/__init__.py", line 382, in __call__
    result = fn(*args, **kwargs)
  File "/home/phelanc/python/dbt-venv/lib/python3.8/site-packages/botocore/client.py", line 530, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/home/phelanc/python/dbt-venv/lib/python3.8/site-packages/botocore/client.py", line 960, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.errorfactory.InvalidRequestException: An error occurred (InvalidRequestException) when calling the StartQueryExecution operation: line 2:31: mismatched input '-'. Expecting: '.', 'WITH', <EOF>
23:38:50  
23:38:50  Finished running  in 0 hours 0 minutes and 3.31 seconds (3.31s).
23:38:50  Encountered an error:
Runtime Error
  Runtime Error
    An error occurred (InvalidRequestException) when calling the StartQueryExecution operation: line 2:31: mismatched input '-'. Expecting: '.', 'WITH', <EOF>
nicor88 commented 1 year ago

@chrischin478 I think that we might release withing the next weeks, there are enough contribution fixes.

Regarding this issue, have a read here

The only acceptable characters for database names, table names, and column names are lowercase letters, numbers, and the underscore character.

So for sure we could find some workarounds, but not sure that we should do it.

Couldn't you use the schema name directly in your model? the right working solution is escaping them with backticks.

chrischin478 commented 1 year ago

@nicor88 I agree that the recommended naming does not include dashes but I've inherited this and am trying to work around it. New schema names are not an option either. Although, there is at least one example of backticks being used in the AWS documentation here:

When using the Athena console query editor to drop a table that has special characters other than the underscore (_), use backticks, as in the following example. DROP TABLE `my-athena-database-01.my-athena-table`

Couldn't you use the schema name directly in your model?

I'm new to dbt so I'm trying to understand this recommendation. It's my understanding that the schema name for a model always comes from the profiles.yml file. Are you suggesting there's alternative way of passing the schema name for a new model?

To create a table where the schema has dashes the schema name must be double quoted, it cannot use backticks. The opposite is true when dropping a table, back ticks required and double quotes disallowed. Would implementing a solution for this be considered? Or maybe there's a macro I could set up to handle this?

nicor88 commented 1 year ago

Schema name doesn't necessary need to be set in the profiles.yml, it can be setup per model level using schema. Have a read here: https://docs.getdbt.com/docs/build/custom-schemas

The fact that creates statements and drop statements are not having consistent behavior on quoting makes implementing this a bit tricky. I'm not sure if a custom macro will help you here, as we are dealing with some adapters' internals, and the relations are escaped accordingly.

@Jrmyy do you think that we can implement a workaround to support this?

chrischin478 commented 1 year ago

I tried adding the schema config to the model directly, using the custom macro to prevent prefixing. As expected the same errors are encountered.

If it makes things simpler, when dropping a table the backticks don't need to be wrapped around both schema and table like in AWS's documentation, just the schema is necessary.

nicor88 commented 1 year ago

In your custom schema macro try to call render_hive() to your schema. That should help to escape the table properly at least on the ctas.

chrischin478 commented 1 year ago

I'm not sure exactly how to call render_hive(). In the macro below I tried replacing line {{ custom_schema_name | trim }} with the following but had no luck:

{{ custom_schema_name | trim }}.render_hive() render_hive({{ custom_schema_name | trim }})

{% macro generate_schema_name(custom_schema_name, node) -%}

    {%- set default_schema = target.schema -%}
    {%- if custom_schema_name is none -%}

        {{ default_schema }}

    {%- else -%}

        {{ custom_schema_name | trim }}

    {%- endif -%}

{%- endmacro %}
chrischin478 commented 1 year ago

Actually, success! I initialized a fresh dbt project and, with only setting the schema name in profiles.yaml, with no quotes or backticks, was able to create the two example models in a schema containing dashes in the name. I also confirmed that the generate_schema_name.sql macro works when defining a custom schema name config in a model. Thanks so much @nicor88!

nicor88 commented 1 year ago

@chrischin478 that's nice to hear, how did you fix that? Maybe we add a section in the Readme about this

chrischin478 commented 1 year ago

@nicor88 The latest version of dbt-athena-community adapter fixes the issue, so there shouldn't be a need to add any special instructions.

nicor88 commented 1 year ago

ah I thought that wasn't the case, so yeah, the rendering feature added fix that, let's close this then.