MeltanoLabs / target-athena

Singer.io Target for AWS Athena.

Hyphen is not allowed in Athena table name #33

Closed yummydum closed 2 years ago

yummydum commented 2 years ago

When the input stream comes from a database, Meltano's naming convention for entities and attributes commonly takes the form <DB name>-<table name>.<column name>, which includes a hyphen. The current implementation tries to create an Athena table using this name, but hyphens are not allowed in Athena table names.

https://docs.aws.amazon.com/athena/latest/ug/glue-best-practices.html#schema-names

The only acceptable characters for database names, table names, and column names are lowercase letters, numbers, and the underscore character.

A quick fix may be converting hyphens to underscores when creating the DDL. What do you think?
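To make the proposal concrete, here is a minimal sketch of the conversion, assuming the rule quoted above (lowercase letters, numbers, and underscores only). The helper names `sanitize_athena_name` and `is_valid_athena_name` and the stream name `mydb-users` are hypothetical and are not taken from target-athena.

```python
import re

# Hypothetical helpers for illustration; not part of target-athena.
# Matches any character outside the set allowed by the rule quoted above.
_INVALID_CHARS = re.compile(r"[^a-z0-9_]")


def sanitize_athena_name(name: str) -> str:
    """Lowercase the name and replace each disallowed character with an underscore."""
    return _INVALID_CHARS.sub("_", name.lower())


def is_valid_athena_name(name: str) -> bool:
    """Return True if the name uses only lowercase letters, digits, and underscores."""
    return re.fullmatch(r"[a-z0-9_]+", name) is not None


# Example stream name in the <DB name>-<table name> form (made up for this sketch).
stream_name = "mydb-users"
table_name = sanitize_athena_name(stream_name)
print(table_name)
```

One caveat with this approach: distinct stream names such as `a-b` and `a_b` would map to the same table name, so a real fix may need to decide how to handle collisions.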

yummydum commented 2 years ago

Changing this line https://github.com/MeltanoLabs/target-athena/blob/ea2698028a68a5ecbbf7dc261deec5f472100356/target_athena/athena.py#L203

to

database=database.replace("-", "_")

would probably suffice.

pnadolny13 commented 2 years ago

Someone else mentioned this problem in Slack today: https://meltano.slack.com/archives/C01TCRBBJD7/p1646661317815669. I think this is a bug, and changing the naming convention shouldn't affect anyone who already has functioning jobs running. For comparison, here is how target-snowflake handles it: https://github.com/transferwise/pipelinewise-target-snowflake/blob/83761af19d57b09786114abab4b3afa6c15ef735/target_snowflake/db_sync.py#L362.

Seems like a small fix to get added. @aaronsteers @andrewcstewart what are your thoughts on this? I opened a draft PR which should fix this https://github.com/MeltanoLabs/target-athena/pull/42.

pnadolny13 commented 2 years ago

Closed by: https://github.com/MeltanoLabs/target-athena/pull/42