Closed: courentin closed this 2 years ago
@Tomme This PR is ready for review (you can review it as a whole or commit by commit). It can be tested by installing dbt-athena from my branch: pip install git+https://github.com/courentin/dbt-athena.git@upgrade-dbt-1.0.0
and by following the 1.0.0 migration guide.
Hi! I installed this and dbt test ran flawlessly. When I tried dbt run I got an error:
An error occurred (InvalidRequestException) when calling the StartQueryExecution operation: line 2:74: mismatched input '"my_model_name__dbt_backup"' expecting {'SELECT', 'FROM', etc.
I looked at the target/run code and saw that it was creating a my_model_name__dbt_tmp table but there was no mention of a __dbt_backup table. This is a regular table model.
@BeantownData thank you for testing it!
Concerning the __dbt_backup table, it seems dbt is creating it from the current table; see https://github.com/dbt-labs/dbt-core/issues/3453#issuecomment-861525096
On my side I can run dbt run flawlessly. Could you share what the generated code for my_model_name looks like?
[UPDATE] I have been able to reproduce it
I was able to run this version of dbt-athena successfully on a pretty big project!
@BeantownData could you try installing the latest version with pip install --force-reinstall git+https://github.com/courentin/dbt-athena.git@618d8d8e829d6ad7323f72e85b23bc1ba3f32a4b and try again?
Sadly nope.
I created a new test model for this:
select 'foo' as bar
I get this output:
dbt run -m test_model
13:16:17 [WARNING]: Deprecated functionality
The source-paths config has been renamed to model-paths. Please update your dbt_project.yml configuration to reflect this change.
13:16:17 [WARNING]: Deprecated functionality
The data-paths config has been renamed to seed-paths. Please update your dbt_project.yml configuration to reflect this change.
13:16:17 Running with dbt=1.0.0
13:16:18 Found 219 models, 823 tests, 0 snapshots, 0 analyses, 357 macros, 0 operations, 0 seed files, 212 sources, 0 exposures, 0 metrics
13:16:18
13:16:45 Concurrency: 20 threads (target='jalbro')
13:16:45
13:16:45 1 of 1 START table model jalbro_stage.test_model................................ [RUN]
Failed to execute query.
Traceback (most recent call last):
  File "c:\python38\lib\site-packages\pyathena\common.py", line 307, in _execute
    query_id = retry_api_call(
  File "c:\python38\lib\site-packages\pyathena\util.py", line 84, in retry_api_call
    return retry(func, *args, **kwargs)
  File "c:\python38\lib\site-packages\tenacity\__init__.py", line 423, in __call__
    do = self.iter(retry_state=retry_state)
  File "c:\python38\lib\site-packages\tenacity\__init__.py", line 360, in iter
    return fut.result()
  File "c:\python38\lib\concurrent\futures\_base.py", line 432, in result
    return self.__get_result()
  File "c:\python38\lib\concurrent\futures\_base.py", line 388, in __get_result
    raise self._exception
  File "c:\python38\lib\site-packages\tenacity\__init__.py", line 426, in __call__
    result = fn(*args, **kwargs)
  File "c:\python38\lib\site-packages\botocore\client.py", line 276, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "c:\python38\lib\site-packages\botocore\client.py", line 586, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.errorfactory.InvalidRequestException: An error occurred (InvalidRequestException) when calling the StartQueryExecution operation: line 2:56: mismatched input '"test_model"' expecting {'SELECT', 'FROM', 'ADD', 'AS', 'ALL', 'DISTINCT', 'WHERE', 'GROUP', 'BY', 'GROUPING', 'SETS', 'CUBE', 'ROLLUP', 'ORDER', 'HAVING', 'LIMIT', 'AT', 'OR', 'AND', 'IN', NOT, 'NO', 'EXISTS', 'BETWEEN', 'LIKE', RLIKE, 'IS', 'NULL', 'TRUE', 'FALSE', 'NULLS', 'ASC', 'DESC', 'FOR', 'INTERVAL', 'CASE', 'WHEN', 'THEN', 'ELSE', 'END', 'JOIN', 'CROSS', 'OUTER', 'INNER', 'LEFT', 'SEMI', 'RIGHT', 'FULL', 'NATURAL', 'ON', 'LATERAL', 'WINDOW', 'OVER', 'PARTITION', 'RANGE', 'ROWS', 'UNBOUNDED', 'PRECEDING', 'FOLLOWING', 'CURRENT', 'ROW', 'WITH', 'VALUES', 'CREATE', 'TABLE', 'VIEW', 'REPLACE', 'INSERT', 'DELETE', 'INTO', 'DESCRIBE', 'EXPLAIN', 'FORMAT', 'LOGICAL', 'CODEGEN', 'CAST', 'SHOW', 'TABLES', 'COLUMNS', 'COLUMN', 'USE', 'PARTITIONS', 'FUNCTIONS', 'DROP', 'UNION', 'EXCEPT', 'INTERSECT', 'TO', 'TABLESAMPLE', 'STRATIFY', 'ALTER', 'RENAME', 'ARRAY', 'MAP', 'STRUCT', 'COMMENT', 'SET', 'RESET', 'DATA', 'START', 'TRANSACTION', 'COMMIT', 'ROLLBACK', 'MACRO', 'IF', 'DIV', 'PERCENT', 'BUCKET', 'OUT', 'OF', 'SORT', 'CLUSTER', 'DISTRIBUTE', 'OVERWRITE', 'TRANSFORM', 'REDUCE', 'USING', 'SERDE', 'SERDEPROPERTIES', 'RECORDREADER', 'RECORDWRITER', 'DELIMITED', 'FIELDS', 'TERMINATED', 'COLLECTION', 'ITEMS', 'KEYS', 'ESCAPED', 'LINES', 'SEPARATED', 'FUNCTION', 'EXTENDED', 'REFRESH', 'CLEAR', 'CACHE', 'UNCACHE', 'LAZY', 'FORMATTED', TEMPORARY, 'OPTIONS', 'UNSET', 'TBLPROPERTIES', 'DBPROPERTIES', 'BUCKETS', 'SKEWED', 'STORED', 'DIRECTORIES', 'LOCATION', 'EXCHANGE', 'ARCHIVE', 'UNARCHIVE', 'FILEFORMAT', 'TOUCH', 'COMPACT', 'CONCATENATE', 'CHANGE', 'CASCADE', 'RESTRICT', 'CLUSTERED', 'SORTED', 'PURGE', 'INPUTFORMAT', 'OUTPUTFORMAT', DATABASE, DATABASES, 'DFS', 'TRUNCATE', 'ANALYZE', 'COMPUTE', 'LIST', 'STATISTICS', 'PARTITIONED', 'EXTERNAL', 'DEFINED', 'REVOKE', 'GRANT', 'LOCK', 'UNLOCK', 'MSCK', 'REPAIR', 
'EXPORT', 'IMPORT', 'LOAD', 'ROLE', 'ROLES', 'COMPACTIONS', 'PRINCIPALS', 'TRANSACTIONS', 'INDEX', 'INDEXES', 'LOCKS', 'OPTION', 'ANTI', 'LOCAL', 'INPATH', IDENTIFIER, BACKQUOTED_IDENTIFIER}
Failed to execute query.
The run version of the query is:
create table awsdatacatalog.jalbro_stage.test_model__dbt_tmp as ( select 'foo' as bar );
When I tried to run the command manually in the athena console I found that the test_model__dbt_tmp table had been created by dbt.
Hope this helps! Thanks!
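As a side note, the two deprecation warnings in the log above are separate from the failure itself: they only ask for two keys in dbt_project.yml to be renamed. A minimal sketch of that rename in Python, assuming the keys appear unquoted at the top level of the file; the function name rename_deprecated_keys is made up for this example and is not part of dbt:

```python
import re

# Renames announced by the two deprecation warnings in the log above.
RENAMED_KEYS = {
    "source-paths": "model-paths",
    "data-paths": "seed-paths",
}


def rename_deprecated_keys(project_yaml: str) -> str:
    """Rename deprecated top-level keys in the text of a dbt_project.yml.

    Hypothetical helper: it works on plain text and only touches keys that
    start at column 0, so nested or quoted keys are left unchanged.
    """
    for old, new in RENAMED_KEYS.items():
        # Match the old key at the start of a line, only when followed by ':'.
        pattern = re.compile(rf"^{re.escape(old)}(?=\s*:)", flags=re.MULTILINE)
        project_yaml = pattern.sub(new, project_yaml)
    return project_yaml


if __name__ == "__main__":
    sample = 'name: demo\nsource-paths: ["models"]\ndata-paths: ["data"]\n'
    print(rename_deprecated_keys(sample))
```

Working on the raw text rather than parsing the YAML keeps comments and formatting in the file intact, at the cost of not handling unusual layouts.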
Is there a way I can find the transaction wrapper code for this? I would have thought it would be part of the run code.
I'm wondering if this could possibly be a permissions issue? I could verify if I had the code that was actually run to swap the temp table, backup table, and real table.
Actually, this error comes from the fact that dbt macros were moved to some subdirectories. When installing this adapter, these macros were not copied to the dbt package properly; https://github.com/Tomme/dbt-athena/pull/52/commits/097cb655dfff4589850102ec656b464cb79597f6 fixes that.
I think the command I gave you to reinstall dbt-athena is not enough.
Can you tell me if the result of:
ls -l $(pip show dbt-core | grep Location | sed 's/Location: //g')/dbt/include/athena/macros
shows two directories, adapters and materializations, and no yaml files?
If you have anything other than these two folders, it probably means that you need to uninstall both dbt-core and dbt-athena before reinstalling.
I couldn't run that snippet as I'm on a Windows box.
But I uninstalled dbt-core and dbt-athena and it worked! 218 tables created and 823 tests passed. Thanks!
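For anyone else who cannot run the shell one-liner above (for example on Windows), the same check can be sketched in Python. This is only an illustration under the assumptions stated in the thread: the macros live under dbt/include/athena/macros next to the installed dbt-core package, and a clean install contains only the adapters and materializations directories. The helper names are invented for this example.

```python
from importlib import metadata
from pathlib import Path
from typing import List, Tuple

# Layout described in the thread: only these two sub-directories, no yaml files.
EXPECTED_DIRS = {"adapters", "materializations"}


def check_macros_layout(macros_dir: Path) -> Tuple[List[str], List[str]]:
    """Return (missing expected directories, unexpected entries) for macros_dir."""
    entries = list(macros_dir.iterdir())
    present_dirs = {e.name for e in entries if e.is_dir()}
    missing = sorted(EXPECTED_DIRS - present_dirs)
    unexpected = sorted(
        e.name for e in entries if not (e.is_dir() and e.name in EXPECTED_DIRS)
    )
    return missing, unexpected


def installed_macros_dir() -> Path:
    """Locate dbt/include/athena/macros relative to the installed dbt-core package."""
    dist = metadata.distribution("dbt-core")
    return Path(dist.locate_file("dbt/include/athena/macros"))


if __name__ == "__main__":
    try:
        macros_dir = installed_macros_dir()
    except metadata.PackageNotFoundError:
        print("dbt-core is not installed in this environment")
    else:
        if not macros_dir.is_dir():
            print(f"{macros_dir} does not exist")
        else:
            missing, unexpected = check_macros_layout(macros_dir)
            if missing or unexpected:
                # Leftovers suggest uninstalling dbt-core and dbt-athena first.
                print("missing:", missing, "unexpected:", unexpected)
            else:
                print("macros directory looks clean")
```

If the script reports unexpected entries such as leftover yaml files, that matches the situation described above where both packages need to be uninstalled before reinstalling.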
Do we need more testers for this? Should we post on the dbt slack athena channel to get more?
Hello @Tomme I hope you're doing well!
Would you have the bandwidth to review this PR in the near future? If not, can someone else do it on your behalf?
Thank you :)
@courentin I tested this and it worked great! No problems upgrading from a project previously using dbt 0.20 🎉🎉🎉🚀🚀🚀
I'm wondering about schema evolution, BTW. Is this supported? For example, when I do a full-refresh, instead of deleting the Glue table and creating a new one, would it create a new schema version? 🤔
I might be wrong as I don't use this feature, but it does not seem to be supported by the adapter; see https://github.com/Tomme/dbt-athena/issues/47
Great work guys! Can this become a release soon? The reason we're choosing dbt, beyond its other features, is the Athena plugin, which plays a central role in our ecosystem. It would be lovely to be aligned with the latest dbt 1.x when possible.
:partying_face: :+1: :100: yeah!
PR to fix #51
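For beta testers, you can test it out by installing dbt-athena from this branch (pip install git+https://github.com/courentin/dbt-athena.git@upgrade-dbt-1.0.0) and following the 1.0.0 migration guide.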
If you encounter an issue like this one: https://github.com/Tomme/dbt-athena/pull/52#issuecomment-1003027561, try uninstalling dbt and dbt-athena first, then reinstall dbt-athena.
Five users have been able to install and use this version successfully.
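The clean reinstall suggested for that issue can be sketched as two pip invocations. The snippet below only builds the command lines so they can be inspected before running; clean_reinstall_commands is a hypothetical helper written for this example, and the branch URL is the one given earlier in this thread.

```python
import sys
from typing import List

# Branch named in this PR's first comment.
BRANCH_URL = "git+https://github.com/courentin/dbt-athena.git@upgrade-dbt-1.0.0"


def clean_reinstall_commands(python: str = sys.executable) -> List[List[str]]:
    """Build the pip commands for a clean reinstall (hypothetical helper).

    Step 1 removes both packages so no stale macro files remain;
    step 2 installs dbt-athena from the PR branch.
    """
    pip = [python, "-m", "pip"]
    return [
        pip + ["uninstall", "-y", "dbt-core", "dbt-athena"],
        pip + ["install", BRANCH_URL],
    ]


if __name__ == "__main__":
    # Print the commands instead of running them; to execute, pass each
    # list to subprocess.run(cmd, check=True).
    for cmd in clean_reinstall_commands():
        print(" ".join(cmd))
```

Going through `python -m pip` with the current interpreter makes it more likely that the packages are removed from the same environment that dbt actually runs in.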