aws / aws-sdk-pandas

pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
https://aws-sdk-pandas.readthedocs.io
Apache License 2.0

aws wrangler preprocessing regex strings in query causing query to error #1141

Open jconwell opened 2 years ago

jconwell commented 2 years ago

Describe the bug

A query containing a regex with a double backslash errors when executed through awswrangler.athena.read_sql_query(), but the same query runs successfully in the Athena console.

The regex in question is: ((?:[a-zA-Z0-9\-_\\]+\.)+[a-zA-Z0-9\-_\\]+)

I changed the double backslash to triple backslash and it did not cause the query to error in aws wrangler.
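One thing worth checking alongside the triple-backslash observation is what Python itself does with backslashes in a non-raw string literal, before any library sees the query. The sketch below is only an illustration of standard Python literal rules on a shortened fragment of the character class; it does not inspect awswrangler, and the helper name `literal_value` is hypothetical.

```python
import ast
import warnings


def literal_value(source_text: str) -> str:
    """Return the runtime string that a Python string literal (given as source text) produces."""
    with warnings.catch_warnings():
        # "\]" is an unrecognized escape: Python keeps the backslash but may warn about it.
        warnings.simplefilter("ignore")
        return ast.literal_eval(source_text)


# Shortened fragment of the character class from the report, written three ways.
double = literal_value(r'"[a-z_\\]+"')    # non-raw literal, two backslashes in the source
triple = literal_value(r'"[a-z_\\\]+"')   # non-raw literal, three backslashes in the source
raw = literal_value(r'r"[a-z_\\]+"')      # raw literal, two backslashes in the source

print(double)  # [a-z_\]+   (one backslash left, which now escapes the closing bracket)
print(triple)  # [a-z_\\]+  (two backslashes left)
print(raw)     # [a-z_\\]+  (raw literals keep backslashes as written)
```

Under these rules, a non-raw literal written with three backslashes carries the same text as a raw literal written with two, which is at least consistent with the workaround described above. Whether this accounts for the whole behaviour is an open question.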

I then generated the query plan in both aws wrangler and the Athena console, and there was one difference.

The query plan from the Athena console:
regexp_extract := "regexp_extract"("end_point", CAST('((?:[a-zA-Z0-9\-_\\\]+\.)+[a-zA-Z0-9\-_\\\]+)' AS joniregexp), BIGINT '1')

The query plan from aws wrangler:
regexp_extract := "regexp_extract"("end_point", CAST('((?:[a-zA-Z0-9\-_\\]+\.)+[a-zA-Z0-9\-_\\]+)' AS joniregexp), BIGINT '1')

It looks like the aws wrangler execution path processes escape characters in the regex string once before sending it to Athena, and Athena then processes escape characters again. The result is a single backslash followed by a closing bracket, which is not a valid regex.
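The double-processing hypothesis can be modelled in a few lines. This is a minimal sketch, not a trace of what awswrangler or Athena actually do: `unescape_once` is a hypothetical stand-in for one extra escape-processing pass, and Python's `re` module is used only as an approximate validity check, since Athena uses a different regex engine.

```python
import re

# Regex as quoted in the report; a raw string keeps every backslash as written.
INTENDED = r"((?:[a-zA-Z0-9\-_\\]+\.)+[a-zA-Z0-9\-_\\]+)"


def unescape_once(text: str) -> str:
    """Hypothetical extra pass: each escaped backslash (two characters) collapses to one."""
    return text.replace("\\\\", "\\")


def compiles(pattern: str) -> bool:
    """Approximate validity check using Python's re module (not Athena's engine)."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


after_extra_pass = unescape_once(INTENDED)

# Triple-backslash variant, mirroring the workaround described in the report.
tripled = INTENDED.replace("\\\\", "\\\\\\")

print(compiles(INTENDED))                   # the regex as written is accepted
print(after_extra_pass)                     # "\]" now escapes the closing bracket
print(compiles(after_extra_pass))           # the character class never terminates
print(unescape_once(tripled) == INTENDED)   # the tripled form survives one extra pass
```

In this model, one extra pass turns the escaped backslash into a backslash that escapes the closing bracket, so the character class is left open, while the tripled form comes out as the originally intended regex.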

Environment

Name Version Build Channel

_anaconda_depends 2021.11 py38_0 _ipyw_jlab_nb_ext_conf 0.1.0 py38_0 _libgcc_mutex 0.1 main _openmp_mutex 4.5 1_gnu abseil-cpp 20210324.2 h9c3ff4c_0 conda-forge alabaster 0.7.12 pyhd3eb1b0_0 anaconda custom py38_1 anaconda-client 1.9.0 py38h06a4308_0 anaconda-navigator 2.1.1 py38_0 anaconda-project 0.10.2 pyhd3eb1b0_0 ansi2html 1.5.2 py38h06a4308_0 anyio 2.2.0 py38h06a4308_1 appdirs 1.4.4 pyhd3eb1b0_0 argh 0.26.2 py38_0 argon2-cffi 20.1.0 py38h27cfd23_1 arrow 0.13.1 py38_0 arrow-cpp 3.0.0 py38h6b21186_4 asn1crypto 1.4.0 py_0 astroid 2.6.6 py38h06a4308_0 astropy 5.0 py38h09021b7_0 async_generator 1.10 pyhd3eb1b0_0 atomicwrites 1.4.0 py_0 attrs 21.2.0 pyhd3eb1b0_0 autopep8 1.5.7 pyhd3eb1b0_0 aws-c-common 0.4.57 he6710b0_1 aws-c-event-stream 0.1.6 h2531618_5 aws-checksums 0.1.9 he6710b0_0 aws-sdk-cpp 1.8.185 hce553d0_0 awswrangler 2.11.0 pyhd8ed1ab_0 conda-forge babel 2.9.1 pyhd3eb1b0_0 backcall 0.2.0 pyhd3eb1b0_0 backports 1.0 pyhd3eb1b0_2 backports.functools_lru_cache 1.6.4 pyhd3eb1b0_0 backports.shutil_get_terminal_size 1.0.0 pyhd3eb1b0_3 backports.tempfile 1.0 pyhd3eb1b0_1 backports.weakref 1.0.post1 py_1 beautifulsoup4 4.10.0 pyh06a4308_0 binaryornot 0.4.4 pyhd3eb1b0_1 bitarray 2.3.0 py38h7f8727e_1 bkcharts 0.2 py38_0 black 19.10b0 py_0 blas 1.0 mkl bleach 4.0.0 pyhd3eb1b0_0 blosc 1.21.0 h8c45485_0 bokeh 2.4.2 py38h06a4308_0 boost-cpp 1.69.0 h11c811c_1000 conda-forge boto 2.49.0 py38_0 boto3 1.20.24 pyhd8ed1ab_0 conda-forge botocore 1.23.24 pyhd8ed1ab_0 conda-forge bottleneck 1.3.2 py38heb32a55_1 brotli 1.0.9 he6710b0_2 brotli-python 1.0.9 py38heb0550a_2 brotlipy 0.7.0 py38h27cfd23_1003 brunsli 0.1 h2531618_0 bzip2 1.0.8 h7b6447c_0 c-ares 1.17.1 h27cfd23_0 ca-certificates 2021.10.26 h06a4308_2 cairo 1.14.12 h8948797_3 anaconda certifi 2021.10.8 py38h06a4308_0 cffi 1.14.6 py38h400218f_0 cfitsio 3.470 hf0d0db6_6 chardet 4.0.0 py38h06a4308_1003 charls 2.2.0 h2531618_0 charset-normalizer 2.0.4 pyhd3eb1b0_0 click 8.0.3 pyhd3eb1b0_0 cloudpickle 2.0.0 pyhd3eb1b0_0 
clyent 1.2.2 py38_1 colorama 0.4.4 pyhd3eb1b0_0 conda 4.11.0 py38h06a4308_0 conda-build 3.21.7 py38h06a4308_0 conda-content-trust 0.1.1 pyhd3eb1b0_0 conda-env 2.6.0 1 conda-pack 0.6.0 pyhd3eb1b0_0 conda-package-handling 1.7.3 py38h27cfd23_1 conda-repo-cli 1.0.4 pyhd3eb1b0_0 conda-token 0.3.0 pyhd3eb1b0_0 conda-verify 3.4.2 py_1 contextlib2 0.6.0.post1 pyhd3eb1b0_0 cookiecutter 1.7.2 pyhd3eb1b0_0 cryptography 3.4.8 py38hd23ed53_0 curl 7.78.0 h1ccaba5_0 cycler 0.11.0 pyhd3eb1b0_0 cython 0.29.24 py38hdbfa776_0 cytoolz 0.11.0 py38h7b6447c_0 daal4py 2021.4.0 py38h78b71dc_0 dal 2021.4.0 h06a4308_729 dash 1.19.0 pyhd3eb1b0_0 dash-core-components 1.3.1 py_0 dash-html-components 1.0.1 py_0 dash-renderer 1.1.2 py_0 dash-table 4.4.1 pyhd3eb1b0_0 dask 2021.10.0 pyhd3eb1b0_0 dask-core 2021.10.0 pyhd3eb1b0_0 dataclasses 0.8 pyh6d0b6a4_7 dbus 1.13.18 hb2f20db_0 debugpy 1.5.1 py38h295c915_0 decorator 5.1.0 pyhd3eb1b0_0 defusedxml 0.7.1 pyhd3eb1b0_0 diff-match-patch 20200713 pyhd3eb1b0_0 distributed 2021.10.0 py38h06a4308_0 docutils 0.17.1 py38h06a4308_1 double-conversion 3.1.5 he6710b0_1 entrypoints 0.3 py38_0 et_xmlfile 1.1.0 py38h06a4308_0 expat 2.4.1 h2531618_2 fastcache 1.1.0 py38h7b6447c_0 filelock 3.4.0 pyhd3eb1b0_0 flake8 3.9.2 pyhd3eb1b0_0 flask 1.1.2 pyhd3eb1b0_0 flask-compress 1.10.1 pyhd3eb1b0_0 fontconfig 2.13.1 h6c09931_0 fonttools 4.25.0 pyhd3eb1b0_0 freetype 2.11.0 h70c0345_0 fribidi 1.0.10 h7b6447c_0 fsspec 2021.10.1 pyhd3eb1b0_0 future 0.18.2 py38_1 geoip2 4.0.2 py_0 conda-forge get_terminal_size 1.0.0 haa9412d_0 gevent 21.8.0 py38h7f8727e_1 gflags 2.2.2 he1b5a44_1004 conda-forge giflib 5.2.1 h7b6447c_0 glib 2.56.2 hd408876_0 anaconda glob2 0.7 pyhd3eb1b0_0 glog 0.5.0 h48cff8f_0 conda-forge gmp 6.2.1 h2531618_2 gmpy2 2.0.8 py38hd5f6e3b_3 graphite2 1.3.14 h23475e2_0 graphviz 2.40.1 h21bd128_2 anaconda greenlet 1.1.1 py38h295c915_0 grpc-cpp 1.39.0 hae934f6_5 gst-plugins-base 1.14.0 hbbd80ab_1 anaconda gstreamer 1.14.0 hb453b48_1 anaconda h5py 2.10.0 py38h7918eee_0 
harfbuzz 1.8.8 hffaf4a1_0 anaconda hdf5 1.10.4 hb1b8bf9_0 heapdict 1.0.1 pyhd3eb1b0_0 html5lib 1.1 pyhd3eb1b0_0 icu 58.2 he6710b0_3 idna 3.3 pyhd3eb1b0_0 imagecodecs 2021.8.26 py38h4cda21f_0 imageio 2.9.0 pyhd3eb1b0_0 imagesize 1.3.0 pyhd3eb1b0_0 importlib-metadata 4.8.2 py38h06a4308_0 importlib_metadata 4.8.2 hd3eb1b0_0 inflection 0.5.1 py38h06a4308_0 iniconfig 1.1.1 pyhd3eb1b0_0 intel-openmp 2021.4.0 h06a4308_3561 intervaltree 3.1.0 pyhd3eb1b0_0 ipykernel 6.4.1 py38h06a4308_1 ipython 7.29.0 py38hb070fc8_0 ipython_genutils 0.2.0 pyhd3eb1b0_1 ipywidgets 7.6.5 pyhd3eb1b0_1 isort 5.9.3 pyhd3eb1b0_0 itsdangerous 2.0.1 pyhd3eb1b0_0 jbig 2.1 hdba287a_0 jdcal 1.4.1 pyhd3eb1b0_0 jedi 0.18.0 py38h06a4308_1 jeepney 0.7.1 pyhd3eb1b0_0 jinja2 2.11.3 pyhd3eb1b0_0 jinja2-time 0.2.0 pyhd3eb1b0_2 jmespath 0.10.0 pyh9f0ad1d_0 conda-forge joblib 1.1.0 pyhd3eb1b0_0 jpeg 9d h7f8727e_0 json5 0.9.6 pyhd3eb1b0_0 jsonschema 3.2.0 pyhd3eb1b0_2 jupyter 1.0.0 py38_7 jupyter-dash 0.4.0 pyhd8ed1ab_0 conda-forge jupyter_client 6.1.12 pyhd3eb1b0_0 jupyter_console 6.4.0 pyhd3eb1b0_0 jupyter_core 4.9.1 py38h06a4308_0 jupyter_server 1.4.1 py38h06a4308_0 jupyterlab 3.2.1 pyhd3eb1b0_1 jupyterlab_pygments 0.1.2 py_0 jupyterlab_server 2.8.2 pyhd3eb1b0_0 jupyterlab_widgets 1.0.0 pyhd3eb1b0_1 jxrlib 1.1 h7b6447c_2 keyring 23.4.0 py38h06a4308_0 kiwisolver 1.3.1 py38h2531618_0 krb5 1.19.2 hac12032_0 lazy-object-proxy 1.6.0 py38h27cfd23_0 lcms2 2.12 h3be6417_0 ld_impl_linux-64 2.35.1 h7274673_9 lerc 3.0 h295c915_0 libaec 1.0.4 he6710b0_1 libarchive 3.4.2 h62408e4_0 libboost 1.73.0 h3ff78a5_11 libcurl 7.78.0 h0b77cf5_0 libdeflate 1.8 h7f8727e_5 libedit 3.1.20210910 h7f8727e_0 libev 4.33 h7f8727e_1 libevent 2.1.10 hcdb4288_3 conda-forge libffi 3.3 he6710b0_2 libgcc-ng 9.3.0 h5101ec6_17 libgfortran-ng 7.5.0 ha8ba4b0_17 libgfortran4 7.5.0 ha8ba4b0_17 libgomp 9.3.0 h5101ec6_17 liblief 0.10.1 he6710b0_0 libllvm11 11.1.0 h3826bc1_0 libnghttp2 1.46.0 hce63b2e_0 libpng 1.6.37 hbc83047_0 libprotobuf 3.17.2 
h4ff587b_1 libsodium 1.0.18 h7b6447c_0 libspatialindex 1.9.3 h2531618_0 libssh2 1.9.0 h1ba5d50_1 libstdcxx-ng 9.3.0 hd4cf53a_17 libthrift 0.14.2 hcc01f38_0 libtiff 4.2.0 h85742a9_0 libtool 2.4.6 h7b6447c_1005 libuuid 1.0.3 h7f8727e_2 libuv 1.40.0 h7b6447c_0 libwebp 1.2.0 h89dd481_0 libwebp-base 1.2.0 h27cfd23_0 libxcb 1.14 h7b6447c_0 libxml2 2.9.12 h03d6c58_0 libxslt 1.1.34 hc22bd24_0 libzopfli 1.0.3 he6710b0_0 llvmlite 0.37.0 py38h295c915_1 locket 0.2.1 py38h06a4308_1 lxml 4.6.3 py38h9120a33_0 lz4-c 1.9.3 h295c915_1 lzo 2.10 h7b6447c_2 markupsafe 1.1.1 py38h7b6447c_0 matplotlib 3.5.0 py38h06a4308_0 matplotlib-base 3.5.0 py38h3ed280b_0 matplotlib-inline 0.1.2 pyhd3eb1b0_2 mccabe 0.6.1 py38_1 mistune 0.8.4 py38h7b6447c_1000 mkl 2021.4.0 h06a4308_640 mkl-service 2.4.0 py38h7f8727e_0 mkl_fft 1.3.1 py38hd3c417c_0 mkl_random 1.2.2 py38h51133e4_0 mock 4.0.3 pyhd3eb1b0_0 more-itertools 8.12.0 pyhd3eb1b0_0 mpc 1.1.0 h10f8cd9_1 mpfr 4.0.2 hb69a4c5_1 mpi 1.0 mpich mpich 3.3.2 hc856adb_0 mpmath 1.2.1 py38h06a4308_0 msgpack-python 1.0.2 py38hff7bd54_1 multipledispatch 0.6.0 py38_0 munkres 1.1.4 py_0 mypy_extensions 0.4.3 py38_0 navigator-updater 0.2.1 py38_0 nbclassic 0.2.6 pyhd3eb1b0_0 nbclient 0.5.3 pyhd3eb1b0_0 nbconvert 6.1.0 py38h06a4308_0 nbformat 5.1.3 pyhd3eb1b0_0 ncurses 6.3 h7f8727e_2 nest-asyncio 1.5.1 pyhd3eb1b0_0 networkx 2.6.3 pyhd3eb1b0_0 nltk 3.6.5 pyhd3eb1b0_0 nose 1.3.7 pyhd3eb1b0_1006 notebook 6.4.6 py38h06a4308_0 numba 0.54.1 py38h51133e4_0 numexpr 2.7.3 py38h22e1b3c_1 numpy 1.20.3 py38hf144106_0 numpy-base 1.20.3 py38h74d4b33_0 numpydoc 1.1.0 pyhd3eb1b0_1 olefile 0.46 pyhd3eb1b0_0 openjpeg 2.4.0 h3ad879b_0 openpyxl 3.0.9 pyhd3eb1b0_0 openssl 1.1.1l h7f8727e_0 orc 1.6.9 ha97a36c_3 packaging 21.3 pyhd3eb1b0_0 pandas 1.3.4 py38h8c16a72_0 pandocfilters 1.4.3 py38h06a4308_1 pango 1.42.4 h049681c_0 anaconda parso 0.8.2 pyhd3eb1b0_0 partd 1.2.0 pyhd3eb1b0_0 patchelf 0.13 h295c915_0 path 16.0.0 py38h06a4308_0 path.py 12.5.0 hd3eb1b0_0 pathlib2 2.3.6 py38h06a4308_2 
pathspec 0.7.0 py_0 patsy 0.5.2 py38h06a4308_0 pcre 8.45 h295c915_0 pep8 1.7.1 py38_0 pexpect 4.8.0 pyhd3eb1b0_3 pg8000 1.21.3 pyhd8ed1ab_0 conda-forge pickleshare 0.7.5 pyhd3eb1b0_1003 pillow 8.4.0 py38h5aabda8_0 pip 21.2.4 py38h06a4308_0 pixman 0.40.0 h7f8727e_1 pkginfo 1.7.1 py38h06a4308_0 plotly 5.1.0 pyhd3eb1b0_0 pluggy 1.0.0 py38h06a4308_0 ply 3.11 py38_0 poyo 0.5.0 pyhd3eb1b0_0 prometheus_client 0.12.0 pyhd3eb1b0_0 prompt-toolkit 3.0.20 pyhd3eb1b0_0 prompt_toolkit 3.0.20 hd3eb1b0_0 psutil 5.8.0 py38h27cfd23_1 ptyprocess 0.7.0 pyhd3eb1b0_2 py 1.10.0 pyhd3eb1b0_0 py-lief 0.10.1 py38h403a769_0 py-radix 0.10.0 pypi_0 pypi pyarrow 3.0.0 py38he0739d4_3 pycodestyle 2.7.0 pyhd3eb1b0_0 pycosat 0.6.3 py38h7b6447c_1 pycparser 2.21 pyhd3eb1b0_0 pycurl 7.44.1 py38h8f2d780_1 pydocstyle 6.1.1 pyhd3eb1b0_0 pyerfa 2.0.0 py38h27cfd23_0 pyflakes 2.3.1 pyhd3eb1b0_0 pygments 2.10.0 pyhd3eb1b0_0 pyjwt 2.1.0 py38h06a4308_0 pylint 2.9.6 py38h06a4308_1 pyls-spyder 0.4.0 pyhd3eb1b0_0 pymysql 1.0.2 pyhd8ed1ab_0 conda-forge pyodbc 4.0.31 py38h295c915_0 pyopenssl 21.0.0 pyhd3eb1b0_1 pyparsing 3.0.4 pyhd3eb1b0_0 pyqt 5.9.2 py38h05f1152_4 pyrsistent 0.18.0 py38heee7806_0 pysocks 1.7.1 py38h06a4308_0 pytables 3.6.1 py38h9fd0a39_0 pytest 6.2.5 py38h06a4308_2 python 3.8.12 h12debd9_0 python-dateutil 2.8.2 pyhd3eb1b0_0 python-graphviz 0.16 pyhd3eb1b0_1 python-libarchive-c 2.9 pyhd3eb1b0_1 python-lsp-black 1.0.0 pyhd3eb1b0_0 python-lsp-jsonrpc 1.0.0 pyhd3eb1b0_0 python-lsp-server 1.2.4 pyhd3eb1b0_0 python-slugify 5.0.2 pyhd3eb1b0_0 python_abi 3.8 2_cp38 conda-forge pytz 2021.3 pyhd3eb1b0_0 pywavelets 1.1.1 py38h7b6447c_2 pyxdg 0.27 pyhd3eb1b0_0 pyyaml 6.0 py38h7f8727e_1 pyzmq 22.3.0 py38h295c915_2 qdarkstyle 3.0.2 pyhd3eb1b0_0 qstylizer 0.1.10 pyhd3eb1b0_0 qt 5.9.7 h5867ecd_1 qtawesome 1.0.3 pyhd3eb1b0_0 qtconsole 5.1.1 pyhd3eb1b0_0 qtpy 1.10.0 pyhd3eb1b0_0 re2 2021.08.01 h9c3ff4c_0 conda-forge readline 8.1 h27cfd23_0 redshift_connector 2.0.901 pyhd8ed1ab_0 conda-forge regex 2021.8.3 
py38h7f8727e_0 requests 2.26.0 pyhd3eb1b0_0 requests-file 1.5.1 pyhd3eb1b0_0 retrying 1.3.3 pyhd3eb1b0_2 ripgrep 12.1.1 0 rope 0.21.1 pyhd3eb1b0_0 rtree 0.9.7 py38h06a4308_1 ruamel_yaml 0.15.100 py38h27cfd23_0 s3transfer 0.5.0 pyhd8ed1ab_0 conda-forge scikit-image 0.18.3 py38h51133e4_0 scikit-learn 1.0.1 py38h51133e4_0 scikit-learn-intelex 2021.4.0 py38h06a4308_0 scipy 1.7.1 py38h292c36d_2 scramp 1.4.1 pyhd8ed1ab_0 conda-forge seaborn 0.11.2 pyhd3eb1b0_0 secretstorage 3.3.1 py38h06a4308_0 send2trash 1.8.0 pyhd3eb1b0_1 setuptools 58.0.4 py38h06a4308_0 simplegeneric 0.8.1 py38_2 singledispatch 3.7.0 pyhd3eb1b0_1001 sip 4.19.13 py38he6710b0_0 six 1.15.0 py38h06a4308_0 snappy 1.1.8 he6710b0_0 sniffio 1.2.0 py38h06a4308_1 snowballstemmer 2.2.0 pyhd3eb1b0_0 sortedcollections 2.1.0 pyhd3eb1b0_0 sortedcontainers 2.4.0 pyhd3eb1b0_0 soupsieve 2.3.1 pyhd3eb1b0_0 sphinx 4.2.0 pyhd3eb1b0_1 sphinxcontrib 1.0 py38_1 sphinxcontrib-applehelp 1.0.2 pyhd3eb1b0_0 sphinxcontrib-devhelp 1.0.2 pyhd3eb1b0_0 sphinxcontrib-htmlhelp 2.0.0 pyhd3eb1b0_0 sphinxcontrib-jsmath 1.0.1 pyhd3eb1b0_0 sphinxcontrib-qthelp 1.0.3 pyhd3eb1b0_0 sphinxcontrib-serializinghtml 1.1.5 pyhd3eb1b0_0 sphinxcontrib-websupport 1.2.4 py_0 spyder 5.1.5 py38h06a4308_1 spyder-kernels 2.1.3 py38h06a4308_0 sqlalchemy 1.4.27 py38h7f8727e_0 sqlite 3.36.0 hc218d9a_0 statsmodels 0.12.2 py38h27cfd23_0 sympy 1.9 py38h06a4308_0 tbb 2021.4.0 hd09550d_0 tbb4py 2021.4.0 py38hd09550d_0 tblib 1.7.0 pyhd3eb1b0_0 tenacity 8.0.1 py38h06a4308_0 terminado 0.9.4 py38h06a4308_0 testpath 0.5.0 pyhd3eb1b0_0 text-unidecode 1.3 pyhd3eb1b0_0 textdistance 4.2.1 pyhd3eb1b0_0 threadpoolctl 2.2.0 pyh0d69192_0 three-merge 0.1.1 pyhd3eb1b0_0 tifffile 2021.7.2 pyhd3eb1b0_2 tinycss 0.4 pyhd3eb1b0_1002 tk 8.6.11 h1ccaba5_0 tldextract 3.1.2 pyhd8ed1ab_0 conda-forge toml 0.10.2 pyhd3eb1b0_0 toolz 0.11.2 pyhd3eb1b0_0 tornado 6.1 py38h27cfd23_0 tqdm 4.62.3 pyhd3eb1b0_1 traitlets 5.1.1 pyhd3eb1b0_0 typed-ast 1.4.3 py38h7f8727e_1 typing_extensions 3.10.0.2 
pyh06a4308_0 ujson 4.0.2 py38h2531618_0 unicodecsv 0.14.1 py38_0 unidecode 1.2.0 pyhd3eb1b0_0 unixodbc 2.3.9 h7b6447c_0 uriparser 0.9.3 he6710b0_1 urllib3 1.26.7 pyhd3eb1b0_0 utf8proc 2.6.1 h27cfd23_0 watchdog 2.1.6 py38h06a4308_0 wcwidth 0.2.5 pyhd3eb1b0_0 webencodings 0.5.1 py38_1 werkzeug 2.0.2 pyhd3eb1b0_0 wheel 0.37.0 pyhd3eb1b0_1 whichcraft 0.6.1 pyhd3eb1b0_0 widgetsnbextension 3.5.1 py38_0 wrapt 1.12.1 py38h7b6447c_1 wurlitzer 3.0.2 py38h06a4308_0 xlrd 2.0.1 pyhd3eb1b0_0 xlsxwriter 3.0.2 pyhd3eb1b0_0 xlwt 1.3.0 py38_0 xmltodict 0.12.0 pyhd3eb1b0_0 xz 5.2.5 h7b6447c_0 yaml 0.2.5 h7b6447c_0 yapf 0.31.0 pyhd3eb1b0_0 zeromq 4.3.4 h2531618_0 zfp 0.5.5 h2531618_6 zict 2.0.0 pyhd3eb1b0_0 zipp 3.6.0 pyhd3eb1b0_0 zlib 1.2.11 h7b6447c_3 zope 1.0 py38_1 zope.event 4.5.0 py38_0 zope.interface 5.4.0 py38h7f8727e_0 zstd 1.4.9 haebb681_0

To Reproduce

This query reproduces the error when run through aws wrangler, but it does not error in the Athena console:

import awswrangler as wr

# Note: the query is a regular (non-raw) triple-quoted string, as in the original report.
sql = """
SELECT
    regexp_extract(the_column, '((?:[a-zA-Z0-9\-_\\]+\.)+[a-zA-Z0-9\-_\\]+)', 1) as stuff
FROM my_table
"""
df = wr.athena.read_sql_query(
    sql,
    database="the_database",
    workgroup="the_group",
    ctas_approach=False
)
df

kukushking commented 2 years ago

Looks like query is escaped twice. Thanks, we'll look into this.

github-actions[bot] commented 2 years ago

This issue requires triage and should be assigned.

github-actions[bot] commented 2 years ago

Marking this issue as stale due to inactivity. This helps our maintainers find and focus on the active issues. If this issue receives no comments in the next 7 days it will automatically be closed.

jconwell commented 2 years ago

It's not really stale. The bug is still a problem for us.