laughingman7743 / PyAthena

PyAthena is a Python DB API 2.0 (PEP 249) client for Amazon Athena.
MIT License

Python 3.10: No module named 'distutils.util' #463

Closed alecswan closed 1 year ago

alecswan commented 1 year ago

With Python 3.10, the following code raises "ModuleNotFoundError: No module named 'distutils.util'" at pyathena/connection.py:13, on the line "from distutils.util import strtobool".

It appears that distutils is deprecated in Python 3.10 and will be permanently removed in Python 3.12.

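import boto3
import pandas as pd
from pyathena import connect

# Placeholders: replace with your own S3 staging location and Athena workgroup.
athena_output_location = "s3://YOUR-BUCKET/YOUR-PREFIX/"
athena_workgroup = "YOUR-WORKGROUP"

session = boto3.session.Session(profile_name='YOUR-AWS-PROFILE')
conn = connect(
    s3_staging_dir=athena_output_location,
    work_group=athena_workgroup,
    session=session
)

df = pd.read_sql_query("SELECT schema_name FROM information_schema.schemata limit 10;", conn)
print(df.head())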

laughingman7743 commented 1 year ago

What OS are you using? If it is Ubuntu or Debian, you can install python3-distutils with apt.

$ apt-get install python3-distutils

laughingman7743 commented 1 year ago

https://github.com/pypa/distutils/blob/main/distutils/util.py#L340-L353 Since distutils is deprecated, it is better to port this function into PyAthena than to keep importing it from distutils.
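
A minimal sketch of what such a port could look like, assuming the linked function is `strtobool` (the name imported in pyathena/connection.py). It follows the accepted values of the distutils function but returns a real `bool` rather than 1/0, and it is not necessarily the code that shipped in PyAthena.

```python
def strtobool(val: str) -> bool:
    """Convert a string representation of truth to a bool.

    Sketch of a local replacement for distutils.util.strtobool.
    True values:  y, yes, t, true, on, 1
    False values: n, no, f, false, off, 0
    Anything else raises ValueError.
    """
    normalized = val.lower()
    if normalized in ("y", "yes", "t", "true", "on", "1"):
        return True
    if normalized in ("n", "no", "f", "false", "off", "0"):
        return False
    raise ValueError(f"invalid truth value {val!r}")
```

With a local helper like this, the import of `distutils.util` can be dropped entirely, so the code no longer depends on whether the interpreter ships distutils.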

alecswan commented 1 year ago

@laughingman7743, thank you for the quick fix! FYI, I am using Ubuntu, and distutils had already been installed before I filed the ticket. Before your fix was in, I simply switched to a Python 3.9 environment as a workaround.

However, I am curious why the Python 3.10 tests have been passing in the PyAthena GitHub CI/CD all along.

laughingman7743 commented 1 year ago

I use Ubuntu for testing with GitHub Actions, and Python 3.10 seems to work fine. It's curious. 🤔

alecswan commented 1 year ago

I believe Python 3.10.12 is where distutils is being deprecated. Which minor version is being used in the builds?

laughingman7743 commented 1 year ago

https://github.com/laughingman7743/PyAthena/actions/runs/5565185738/jobs/10165349235#step:4:39

alecswan commented 1 year ago

Thanks. I wonder if this could be due to me using Ubuntu on WSL2, and somehow that makes distutils work from Python 3.9 but not from Python 3.10.12. Is it worth looking into the following, i.e. that poetry is installed with Python 3.10.6? https://github.com/laughingman7743/PyAthena/actions/runs/5565185738/jobs/10165349235#step:3:13
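
One way to compare environments like these is to run a small diagnostic in each interpreter. The sketch below uses only the standard library; the helper name `distutils_util_available` is made up for illustration. It reports the interpreter version and path, and whether `distutils.util` can be imported there.

```python
import importlib
import sys
import warnings


def distutils_util_available() -> bool:
    """Return True if `distutils.util` can be imported by this interpreter."""
    try:
        with warnings.catch_warnings():
            # Importing distutils emits a DeprecationWarning on Python 3.10 and 3.11.
            warnings.simplefilter("ignore", DeprecationWarning)
            importlib.import_module("distutils.util")
    except ImportError:  # ModuleNotFoundError is a subclass of ImportError
        return False
    return True


print("Python:", sys.version.split()[0])
print("Executable:", sys.executable)
print("distutils.util importable:", distutils_util_available())
```

Running this in both the WSL2 environment and the CI job would show whether the two differ in Python version, in interpreter path, or only in whether distutils is present.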

laughingman7743 commented 1 year ago

It has been fixed in 3.0.6, so you don't have to worry about it anymore. https://pypi.org/project/pyathena/3.0.6/ As a result, I could also stop using deprecated modules. Thank you.