laughingman7743 / PyAthena

PyAthena is a Python DB API 2.0 (PEP 249) client for Amazon Athena.
MIT License
456 stars 102 forks source link

Issue 517: Concept for Timetravel/Temporal Queries #518

Closed jasonamyers closed 3 months ago

jasonamyers commented 3 months ago

Resolves: https://github.com/laughingman7743/PyAthena/issues/517 Using approach called out a SQLAlchemy discussion: https://github.com/sqlalchemy/sqlalchemy/discussions/11161

For Version with Alias example

engine = get_engine('athena')
table = build_table(engine, 'my_table', 'my_schema')
ta = table.alias()
version = 8860342768238431084
query = select(func.count(ta.c.record_id)).with_hint(ta, f'FOR VERSION AS OF {version}')
result = str(query.compile(compile_kwargs={"literal_binds": True}, dialect=AthenaDialect()))

Output:

SELECT count(my_table.record_id) AS count_1 
FROM my_schema.my_table FOR VERSION AS OF 8860342768238431084 AS my_table_1
laughingman7743 commented 3 months ago

You can format the code with make fmt.

jasonamyers commented 3 months ago

@laughingman7743 Thank you so much for looking at this!

jasonamyers commented 3 months ago

@laughingman7743 Thanks for the additional feedback updated and formatted.

laughingman7743 commented 3 months ago

Thank you. I will check to see if all tests pass.

jasonamyers commented 3 months ago

Thank you. I will check to see if all tests pass.

Awesome thanks, I didn’t have a good environment to run the tests.

laughingman7743 commented 3 months ago

If you fix what I commented, all tests will pass.

============================================================================================================================================================================================================================== test session starts ===============================================================================================================================================================================================================================
platform darwin -- Python 3.11.7, pytest-8.0.0, pluggy-1.4.0
rootdir: /Users/foobar/github/PyAthena
configfile: pyproject.toml
plugins: cov-4.1.0, dependency-0.6.0, xdist-3.5.0, anyio-4.2.0
8 workers [502 items]   
.......................................................................................................................................................................................................................................................................................................................................................................................................................................................................... [ 91%]
......................................s.....                                                                                                                                                                                                                                                                                                                                                                                                                               [100%]
================================================================================================================================================================================================================================ warnings summary ================================================================================================================================================================================================================================
tests/pyathena/arrow/test_async_cursor.py: 10 warnings
tests/pyathena/arrow/test_cursor.py: 11 warnings
  /Users/foobar/github/PyAthena/pyathena/arrow/result_set.py:265: FutureWarning: Passing 'use_legacy_dataset' is deprecated as of pyarrow 15.0.0 and will be removed in a future version.
    dataset = parquet.ParquetDataset(

tests/pyathena/pandas/test_async_cursor.py: 40 warnings
tests/pyathena/pandas/test_cursor.py: 39 warnings
  /Users/foobar/Library/Application Support/hatch/env/pip-compile/pyathena/DhQ5A1u5/pyathena/lib/python3.11/site-packages/pandas/io/parquet.py:274: FutureWarning: Passing 'use_legacy_dataset' is deprecated as of pyarrow 15.0.0 and will be removed in a future version.
    pa_table = self.api.parquet.read_table(

tests/pyathena/pandas/test_async_cursor.py: 40 warnings
tests/pyathena/pandas/test_cursor.py: 39 warnings
  /Users/foobar/github/PyAthena/pyathena/pandas/result_set.py:355: FutureWarning: Passing 'use_legacy_dataset' is deprecated as of pyarrow 15.0.0 and will be removed in a future version.
    dataset = parquet.ParquetDataset(

tests/pyathena/pandas/test_async_cursor.py::TestAsyncPandasCursor::test_float_na_values[async_pandas_cursor1-auto]
tests/pyathena/pandas/test_async_cursor.py::TestAsyncPandasCursor::test_float_na_values[async_pandas_cursor2-pyarrow]
tests/pyathena/pandas/test_async_cursor.py::TestAsyncPandasCursor::test_float_na_values[async_pandas_cursor3-fastparquet]
tests/pyathena/pandas/test_async_cursor.py::TestAsyncPandasCursor::test_float_na_values[async_pandas_cursor0-auto]
  /Users/foobar/github/PyAthena/tests/pyathena/pandas/test_async_cursor.py:532: FutureWarning: Series.__getitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To access a value by position, use `ser.iloc[pos]`
    rows = [tuple([row[0]]) for _, row in df.iterrows()]

tests/pyathena/pandas/test_cursor.py::TestPandasCursor::test_complex[pandas_cursor2-1000000]
tests/pyathena/pandas/test_cursor.py::TestPandasCursor::test_complex[pandas_cursor0-None]
tests/pyathena/pandas/test_cursor.py::TestPandasCursor::test_complex_as_pandas[pandas_cursor0-None]
tests/pyathena/pandas/test_cursor.py::TestPandasCursor::test_complex[pandas_cursor1-1000]
tests/pyathena/pandas/test_cursor.py::TestPandasCursor::test_complex_as_pandas[pandas_cursor1-1000]
tests/pyathena/pandas/test_cursor.py::TestPandasCursor::test_complex_as_pandas[pandas_cursor2-1000000]
  /Users/foobar/github/PyAthena/pyathena/pandas/result_set.py:214: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value '[datetime.time(0, 0)]' has dtype incompatible with datetime64[ns], please explicitly cast to a compatible dtype first.
    df.loc[:, times] = df.loc[:, times].apply(lambda r: r.dt.time)

tests/pyathena/pandas/test_cursor.py::TestPandasCursor::test_float_na_values[pandas_cursor0-auto]
tests/pyathena/pandas/test_cursor.py::TestPandasCursor::test_float_na_values[pandas_cursor1-auto]
tests/pyathena/pandas/test_cursor.py::TestPandasCursor::test_float_na_values[pandas_cursor2-pyarrow]
tests/pyathena/pandas/test_cursor.py::TestPandasCursor::test_float_na_values[pandas_cursor3-fastparquet]
  /Users/foobar/github/PyAthena/tests/pyathena/pandas/test_cursor.py:1050: FutureWarning: Series.__getitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To access a value by position, use `ser.iloc[pos]`
    rows = [tuple([row[0]]) for _, row in df.iterrows()]

tests/pyathena/sqlalchemy/test_base.py: 65 warnings
  /Users/foobar/github/PyAthena/tests/pyathena/conftest.py:96: SADeprecationWarning: The dbapi() classmethod on dialect classes has been renamed to import_dbapi().  Implement an import_dbapi() classmethod directly on class <class 'pyathena.sqlalchemy.rest.AthenaRestDialect'> to remove this warning; the old .dbapi() classmethod may be maintained for backwards compatibility.
    return sqlalchemy.engine.create_engine(

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html

---------- coverage: platform darwin, python 3.11.7-final-0 ----------
Name                                  Stmts   Miss  Cover
---------------------------------------------------------
pyathena/__init__.py                     43      9    79%
pyathena/arrow/__init__.py                0      0   100%
pyathena/arrow/async_cursor.py           42      0   100%
pyathena/arrow/converter.py              32      2    94%
pyathena/arrow/cursor.py                 90      1    99%
pyathena/arrow/result_set.py            148     14    91%
pyathena/arrow/util.py                   45      3    93%
pyathena/async_cursor.py                 54      1    98%
pyathena/common.py                      291     47    84%
pyathena/connection.py                  127     32    75%
pyathena/converter.py                    81      9    89%
pyathena/cursor.py                       72      1    99%
pyathena/error.py                        21      0   100%
pyathena/fastparquet/__init__.py          0      0   100%
pyathena/fastparquet/util.py             44      3    93%
pyathena/filesystem/__init__.py           0      0   100%
pyathena/filesystem/s3.py               279     74    73%
pyathena/filesystem/s3_object.py         34      0   100%
pyathena/formatter.py                    99      5    95%
pyathena/model.py                       490     13    97%
pyathena/pandas/__init__.py               3      0   100%
pyathena/pandas/async_cursor.py          44      0   100%
pyathena/pandas/converter.py             23      0   100%
pyathena/pandas/cursor.py                97      1    99%
pyathena/pandas/result_set.py           230     26    89%
pyathena/pandas/util.py                 155      6    96%
pyathena/result_set.py                  523     97    81%
pyathena/spark/__init__.py                0      0   100%
pyathena/spark/async_cursor.py           34      5    85%
pyathena/spark/common.py                188     48    74%
pyathena/spark/cursor.py                 32      2    94%
pyathena/sqlalchemy/__init__.py           0      0   100%
pyathena/sqlalchemy/arrow.py             15     15     0%
pyathena/sqlalchemy/base.py             526     71    87%
pyathena/sqlalchemy/pandas.py            19     19     0%
pyathena/sqlalchemy/requirements.py      98     98     0%
pyathena/sqlalchemy/rest.py               4      0   100%
pyathena/sqlalchemy/types.py             31      9    71%
pyathena/sqlalchemy/util.py               3      1    67%
pyathena/util.py                         31      1    97%
---------------------------------------------------------
TOTAL                                  4048    613    85%
Coverage HTML written to dir htmlcov

============================================================================================================================================================================================================ 501 passed, 1 skipped, 258 warnings in 333.78s (0:05:33) ============================================================================================================================================================================================================
jasonamyers commented 3 months ago

@laughingman7743 thank you so much for all your help with this, also the new sphinx docs are awesome.

laughingman7743 commented 3 months ago

I will squash and merge, is that ok? Or do you squash the commits yourself? You don't have to squash to one commit. Just merge it into the commit you need.

jasonamyers commented 3 months ago

@laughingman7743 feel free to squash and merge as you see fit.

laughingman7743 commented 3 months ago

============================================================================================================================================================================================================================== test session starts ===============================================================================================================================================================================================================================
platform darwin -- Python 3.11.7, pytest-8.0.0, pluggy-1.4.0
rootdir: /Users/foobar/github/PyAthena
configfile: pyproject.toml
plugins: cov-4.1.0, dependency-0.6.0, xdist-3.5.0, anyio-4.2.0
8 workers [502 items]   
.......................................................................................................................................................................................................................................................................................................................................................................................................................................................................... [ 91%]
......................................s.....                                                                                                                                                                                                                                                                                                                                                                                                                               [100%]
================================================================================================================================================================================================================================ warnings summary ================================================================================================================================================================================================================================
tests/pyathena/arrow/test_async_cursor.py: 10 warnings
tests/pyathena/arrow/test_cursor.py: 11 warnings
  /Users/foobar/github/PyAthena/pyathena/arrow/result_set.py:265: FutureWarning: Passing 'use_legacy_dataset' is deprecated as of pyarrow 15.0.0 and will be removed in a future version.
    dataset = parquet.ParquetDataset(

tests/pyathena/pandas/test_async_cursor.py: 40 warnings
tests/pyathena/pandas/test_cursor.py: 39 warnings
  /Users/foobar/github/PyAthena/pyathena/pandas/result_set.py:354: FutureWarning: Passing 'use_legacy_dataset' is deprecated as of pyarrow 15.0.0 and will be removed in a future version.
    dataset = parquet.ParquetDataset(

tests/pyathena/pandas/test_cursor.py::TestPandasCursor::test_complex[pandas_cursor2-1000000]
tests/pyathena/pandas/test_cursor.py::TestPandasCursor::test_complex_as_pandas[pandas_cursor0-None]
tests/pyathena/pandas/test_cursor.py::TestPandasCursor::test_complex[pandas_cursor0-None]
tests/pyathena/pandas/test_cursor.py::TestPandasCursor::test_complex_as_pandas[pandas_cursor1-1000]
tests/pyathena/pandas/test_cursor.py::TestPandasCursor::test_complex[pandas_cursor1-1000]
tests/pyathena/pandas/test_cursor.py::TestPandasCursor::test_complex_as_pandas[pandas_cursor2-1000000]
  /Users/foobar/github/PyAthena/pyathena/pandas/result_set.py:214: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value '[datetime.time(0, 0)]' has dtype incompatible with datetime64[ns], please explicitly cast to a compatible dtype first.
    df.loc[:, times] = df.loc[:, times].apply(lambda r: r.dt.time)

tests/pyathena/sqlalchemy/test_base.py: 65 warnings
  /Users/foobar/github/PyAthena/tests/pyathena/conftest.py:96: SADeprecationWarning: The dbapi() classmethod on dialect classes has been renamed to import_dbapi().  Implement an import_dbapi() classmethod directly on class <class 'pyathena.sqlalchemy.rest.AthenaRestDialect'> to remove this warning; the old .dbapi() classmethod may be maintained for backwards compatibility.
    return sqlalchemy.engine.create_engine(

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html

---------- coverage: platform darwin, python 3.11.7-final-0 ----------
Name                                  Stmts   Miss  Cover
---------------------------------------------------------
pyathena/__init__.py                     43      9    79%
pyathena/arrow/__init__.py                0      0   100%
pyathena/arrow/async_cursor.py           42      0   100%
pyathena/arrow/converter.py              32      2    94%
pyathena/arrow/cursor.py                 90      1    99%
pyathena/arrow/result_set.py            148     14    91%
pyathena/arrow/util.py                   45      3    93%
pyathena/async_cursor.py                 54      1    98%
pyathena/common.py                      291     47    84%
pyathena/connection.py                  127     32    75%
pyathena/converter.py                    81      9    89%
pyathena/cursor.py                       72      1    99%
pyathena/error.py                        21      0   100%
pyathena/fastparquet/__init__.py          0      0   100%
pyathena/fastparquet/util.py             44      3    93%
pyathena/filesystem/__init__.py           0      0   100%
pyathena/filesystem/s3.py               279     74    73%
pyathena/filesystem/s3_object.py         34      0   100%
pyathena/formatter.py                    99      5    95%
pyathena/model.py                       490     13    97%
pyathena/pandas/__init__.py               3      0   100%
pyathena/pandas/async_cursor.py          44      0   100%
pyathena/pandas/converter.py             23      0   100%
pyathena/pandas/cursor.py                97      1    99%
pyathena/pandas/result_set.py           230     26    89%
pyathena/pandas/util.py                 155      6    96%
pyathena/result_set.py                  523     97    81%
pyathena/spark/__init__.py                0      0   100%
pyathena/spark/async_cursor.py           34      5    85%
pyathena/spark/common.py                188     48    74%
pyathena/spark/cursor.py                 32      2    94%
pyathena/sqlalchemy/__init__.py           0      0   100%
pyathena/sqlalchemy/arrow.py             15     15     0%
pyathena/sqlalchemy/base.py             526     71    87%
pyathena/sqlalchemy/pandas.py            19     19     0%
pyathena/sqlalchemy/requirements.py      98     98     0%
pyathena/sqlalchemy/rest.py               4      0   100%
pyathena/sqlalchemy/types.py             31      9    71%
pyathena/sqlalchemy/util.py               3      1    67%
pyathena/util.py                         31      1    97%
---------------------------------------------------------
TOTAL                                  4048    613    85%
Coverage HTML written to dir htmlcov

============================================================================================================================================================================================================ 501 passed, 1 skipped, 171 warnings in 353.93s (0:05:53) ============================================================================================================================================================================================================