xo / usql

Universal command-line interface for SQL databases
MIT License
9.08k stars 352 forks source link

USQL Databricks driver fails when connection with Azure Databricks #466

Closed calebeaires closed 5 months ago

calebeaires commented 5 months ago

I am facing an issue when trying to connect to Databricks using usql. Even when using the correct credentials, typing the command:

usql databricks://token:dapi*****@adb-*************.azuredatabricks.net:443/sql/protocolv1/o/*********/*******

The credentials and URL details have been omitted for security, but I assure you that the data is correct.

Error Logs:

CleanShot 2024-06-05 at 10 28 57@2x

What You Have Tried

Where it works

Using DBeaver and the same credentials, the connection works like a charm

Any help to resolve this issue would be highly appreciated.

kenshaw commented 5 months ago

@calebeaires this is likely due to how the URL is generated for the driver. I had based this on the reference code for AWS. Specifically, the URL is bad. Could you try opening an interactive interpreter session, and then doing this:

$ usql
(not connected)=> \c databricks token:dapi*****@adb-*************.azuredatabricks.net:443/sql/protocolv1/o/*********/*******

The issue is that the path generated from dburl is incorrect for Azure.

I will think about the best way to handle this in the future, but you should be able to connect using the direct connection syntax (see above's \c).

There's some difficultly here since Databricks' URLs do not conform to how other databases work.

Sadly, I don't have a Azure Databricks instance that I can test against, so will need your help with this. Thanks.

kenshaw commented 5 months ago

@calebeaires I have fixed defining connections in config.yaml to properly recognize YAML array syntax. You should be able to define a full connection string now in your config.yaml:

$ cat contrib/config.yaml 
---
# named connections
connections:
  zzz: ["databricks", "token:dapi*****@adb-*************.azuredatabricks.net:443/sql/protocolv1/o/*********/*******"]

You can then use the named connection zzz:

$ usql
(not connected)=> \c zzz
^C6:00AM ERR error="Post \"https://adb-*************.azuredatabricks.net:443/sql/protocolv1/o/*********/*******\": context canceled" connId= corrId= queryId=
error: databricks: request error: error connecting: host=adb-*************.azuredatabricks.net port=443, httpPath=/sql/protocolv1/o/*********/*******: databricks: request error: open session request error: Post "https://adb-*************.azuredatabricks.net:443/sql/protocolv1/o/*********/*******": context canceled

Note in the above that the URL is now azuredatabricks.net.

I've tagged a new release and will publish it as soon as the build has finished.

calebeaires commented 5 months ago

@kenshaw

I use usql into linux ubuntu, and use just the cli with the parameters. Run through a terminal is our primary goal. I can offer you and Azure Databricks connection so you can test. How can I send it to you.

kenshaw commented 5 months ago

@calebeaires Sure, I can test it directly. My email is kenshaw gmail. The latest release was published yesterday, it should work just fine though. Please check it out when you can.

kenshaw commented 5 months ago

@calebeaires Thank you for sending the credentials. I was able to connect and verify that usql will work with Databricks. You need to use this kind of syntax:

(not connected)=> \c databricks token:****@****.azuredatabricks.net:443/sql/protocolv1/o/****

Where the first **** is your personal access token, the second is the assigned domain host, and the third is the remote resource/database name.

abhishekgahlot2 commented 2 months ago

So to understand more, it doesn't parse correctly if we use usql "databricks://dsn" but works with /c dsn

I tried running via /c it works but not via usql "dsn"