apache / superset

Apache Superset is a Data Visualization and Data Exploration Platform
https://superset.apache.org/
Apache License 2.0

sqlalchemy-trino ERROR: Unexpected database format catalog/hive/ #13864

Closed b1600 closed 1 year ago

b1600 commented 3 years ago

Hello, I'm trying to create a database connection from Superset to Trino.

What I've done:

  1. Ran the following from my Superset root folder to add the Trino database driver to Superset: echo "sqlalchemy-trino" >> ./docker/requirements-local.txt
  2. Modified "docker-compose-non-dev.yml" to load docker/docker-bootstrap.sh, as mentioned by @dungdm93 in https://github.com/apache/superset/issues/13640
  3. Ran "docker-compose -f docker-compose-non-dev.yml up" to start the Superset Docker containers.
  4. Logged in to Superset.
  5. Tried to add a database connection to Trino using the following connection string: "trino://my_username:my_password@trino_coordinator_ip:8090/catalog/hive/"
  6. Clicked the "Test Connection" button.
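
One way to confirm that steps 1 to 3 actually installed the driver is to ask Python, from a shell inside the Superset container, which version of each package it can see. This is only a minimal sketch using the standard library; the helper name `installed_version` is made up for illustration, and the package names are the ones mentioned in this thread.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Package names taken from this thread; adjust to whatever is in
    # docker/requirements-local.txt.
    for name in ("sqlalchemy-trino", "trino", "PyHive"):
        version = installed_version(name)
        print(f"{name}: {version if version else 'NOT INSTALLED'}")
```

If a package shows up as missing, the requirements file was probably not picked up when the container started.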

By the way, I had previously created external tables in Hive that reference a Delta table manifest, and Trino is used to access those Hive tables.

Expected results

I should be able to add a database connection to Trino and query Trino from Superset.

Actual results

I got the following error message: "ERROR: Unexpected database format catalog/hive/"
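
The message is consistent with a dialect that treats everything after the port as "catalog" or "catalog/schema" and rejects anything with more segments. The sketch below is not the sqlalchemy-trino source; it is a standard-library illustration of that assumed rule, with a hypothetical function name `split_catalog_schema`, and the schema name "default" is only an example. Under that assumption, "catalog/hive/" splits into three segments (the trailing slash yields an empty third one), whereas "hive" or "hive/default" would be accepted.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit


def split_catalog_schema(url: str) -> Tuple[str, Optional[str]]:
    """Split the path of a trino:// URL into (catalog, schema).

    Assumed rule (illustrative only): one segment is a catalog, two
    segments are catalog/schema, anything else is rejected.
    """
    database = urlsplit(url).path.lstrip("/")
    parts = database.split("/") if database else []
    if len(parts) == 1:
        return parts[0], None
    if len(parts) == 2:
        return parts[0], parts[1]
    raise ValueError(f"Unexpected database format {database}")


if __name__ == "__main__":
    base = "trino://my_username:my_password@trino_coordinator_ip:8090"
    print(split_catalog_schema(f"{base}/hive"))
    print(split_catalog_schema(f"{base}/hive/default"))
    try:
        split_catalog_schema(f"{base}/catalog/hive/")
    except ValueError as exc:
        print(exc)
```

If that reading is right, the word "catalog" in the documented URL pattern is a placeholder for the catalog name, so the string would end in "/hive" rather than "/catalog/hive/".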

How to reproduce the bug

  1. Log in to the Superset UI.
  2. Click the Databases tab, and then click the "+ Database" button.
  3. Fill in the Trino connection string.
  4. Click the "Test Connection" button.
  5. See the error pop-up at the bottom right of the page.

Additional context

pip freeze output of worker:

aiohttp==3.7.2
alembic==1.4.3
amqp==2.6.1
# Editable install with no version control (apache-superset==0.999.0.dev0)
-e /app
apispec==3.3.2
async-timeout==3.0.1
attrs==20.2.0
Babel==2.8.0
backoff==1.10.0
billiard==3.6.3.0
bleach==3.2.1
boto3==1.16.10
botocore==1.19.10
Brotli==1.0.9
cached-property==1.5.2
cachelib==0.1.1
celery==4.4.7
certifi==2020.6.20
cffi==1.14.3
chardet==3.0.4
click==7.1.2
colorama==0.4.4
contextlib2==0.6.0.post1
convertdate==2.3.0
cron-descriptor==1.2.24
croniter==0.3.36
cryptography==3.2.1
decorator==4.4.2
defusedxml==0.6.0
Deprecated==1.2.11
dnspython==2.0.0
email-validator==1.1.1
et-xmlfile==1.0.1
Flask==1.1.2
Flask-AppBuilder==3.2.1
Flask-Babel==1.0.0
Flask-Caching==1.9.0
Flask-Compress==1.8.0
Flask-Cors==3.0.9
Flask-JWT-Extended==3.24.1
Flask-Login==0.4.1
Flask-Migrate==2.5.3
Flask-OpenID==1.2.5
Flask-SQLAlchemy==2.4.4
flask-talisman==0.7.0
Flask-WTF==0.14.3
future==0.18.2
geographiclib==1.50
geopy==2.0.0
gunicorn==20.0.4
holidays==0.10.3
humanize==3.1.0
idna==2.10
ijson==3.1.2.post0
importlib-metadata==2.1.1
isodate==0.6.0
itsdangerous==1.1.0
jdcal==1.4.1
Jinja2==2.11.3
jmespath==0.10.0
jsonlines==1.2.0
jsonschema==3.2.0
kombu==4.6.11
korean-lunar-calendar==0.2.1
linear-tsv==1.1.0
Mako==1.1.3
Markdown==3.3.3
MarkupSafe==1.1.1
marshmallow==3.9.0
marshmallow-enum==1.5.1
marshmallow-sqlalchemy==0.23.1
msgpack==1.0.0
multidict==5.0.0
mysqlclient==1.4.2.post1
natsort==7.0.1
numpy==1.19.4
openpyxl==3.0.5
packaging==20.4
pandas==1.2.2
parsedatetime==2.6
pathlib2==2.3.5
pgsanity==0.2.9
Pillow==7.2.0
polyline==1.4.0
prison==0.1.3
psycopg2-binary==2.8.5
py==1.9.0
pyarrow==3.0.0
pycparser==2.20
pydruid==0.6.1
PyGithub==1.54.1
PyHive==0.6.3
PyJWT==1.7.1
PyMeeus==0.3.7
pymssql==2.1.5
pyparsing==2.4.7
pyrsistent==0.16.1
python-dateutil==2.8.1
python-dotenv==0.15.0
python-editor==1.0.4
python-geohash==0.8.5
python3-openid==3.2.0
pytz==2020.4
PyYAML==5.4.1
redis==3.5.3
requests==2.24.0
retry==0.9.2
rfc3986==1.4.0
s3transfer==0.3.3
sasl==0.2.1
selenium==3.141.0
simplejson==3.17.2
six==1.15.0
slackclient==2.5.0
SQLAlchemy==1.3.20
sqlalchemy-trino==0.2.0
SQLAlchemy-Utils==0.36.8
sqlparse==0.3.0
tableschema==1.20.0
tabulator==1.52.5
thrift==0.13.0
thrift-sasl==0.4.2
trino==0.305.0
typing-extensions==3.7.4.3
unicodecsv==0.14.1
urllib3==1.25.11
vine==1.3.0
webencodings==0.5.1
Werkzeug==1.0.1
wrapt==1.12.1
WTForms==2.3.3
WTForms-JSON==0.3.3
xlrd==1.2.0
yarl==1.6.2
zipp==3.4.0

b1600 commented 3 years ago

I was just able to connect to Trino.

I read https://medium.com/airbnb-engineering/supercharging-apache-superset-b1a2393278bd and tried the following connection string: "presto://{trino_coordinator_ip}:{trino_coordinator_port}/hive".

By the way, I've also added "pyhive" and "sqlalchemy-trino" to my "requirements-local.txt". Could someone please tell me which database driver my connection string refers to? Is it pyhive or sqlalchemy-trino?

Sorry if my question sounds silly, but I couldn't find any sample connection strings that start with "presto://" in https://superset.apache.org/docs/databases/trino/ or https://superset.apache.org/docs/databases/presto
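
On the driver question: SQLAlchemy chooses the dialect from the URL scheme (the part before "://", minus any "+driver" suffix), and third-party packages register their scheme names under the "sqlalchemy.dialects" entry-point group. The sketch below lists which installed package claims a given scheme, using only the standard library. The helper names `backend_name` and `dialect_providers` are made up for illustration. To the best of my knowledge PyHive registers "presto" and sqlalchemy-trino registers "trino", which would mean a "presto://" string goes through PyHive, but treat that as an assumption and check the output in your own environment.

```python
from importlib import metadata
from typing import List
from urllib.parse import urlsplit


def backend_name(url: str) -> str:
    """Return the dialect name encoded in a SQLAlchemy-style URL scheme."""
    return urlsplit(url).scheme.split("+", 1)[0]


def dialect_providers(name: str) -> List[str]:
    """List entry-point targets that register `name` as a SQLAlchemy dialect."""
    eps = metadata.entry_points()
    if hasattr(eps, "select"):
        # Newer importlib.metadata API
        group = eps.select(group="sqlalchemy.dialects")
    else:
        # Older API returns a plain dict keyed by group name
        group = eps.get("sqlalchemy.dialects", [])
    return [ep.value for ep in group if ep.name == name]


if __name__ == "__main__":
    # Host and port are placeholders, as in the thread.
    for url in ("presto://coordinator:8080/hive",
                "trino://user@coordinator:8080/hive"):
        name = backend_name(url)
        print(name, "->", dialect_providers(name) or "no third-party provider found")
```

An entry-point target whose module path starts with "pyhive" would indicate that PyHive is handling the connection; one starting with "sqlalchemy_trino" would indicate sqlalchemy-trino.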

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. For admin, please label this issue .pinned to prevent stale bot from closing the issue.

rusackas commented 1 year ago

Sounds like this is solved, but let us know if this needs revisiting!