databricks / databricks-sql-go

Golang database/sql driver for Databricks SQL.
Apache License 2.0

PUT Staging files fails: RESOURCE_DOES_NOT_EXIST Command does not exist #176

Closed mdibaiee closed 8 months ago

mdibaiee commented 1 year ago

Hello, we are using this driver to write a connector for Estuary Flow. The method we want to use for landing data in Databricks is staging files on a volume in a catalog through SQL commands. In this case, we are trying to upload a local file written by the connector using `PUT '/path/to/local/file.csv' INTO '/Volumes/catalog/schema/flow_staging/file.csv'`.

However, this exec command fails with the following two errors:

```
databricks: execution error: failed to execute query: unexpected operation state ERROR_STATE: PERMISSION_DENIED: Presigned URLs API is not enabled
```

My understanding is that this error stems from the PUT command initially being sent as-is to Databricks. Since the local file is not known to Databricks, I do expect some kind of error here, but I'm not sure this particular message is fully explained by that.

We also have this other error:

```
databricks: driver error: error performing staging operation: databricks: request error: get result set metadata request error: Post "https://REDACTED.cloud.databricks.com:443/sql/1.0/warehouses/REDACTED": RESOURCE_DOES_NOT_EXIST: Command 01ee6e90-ef33-1f1f-8f75-0c6192d6953c does not exist.: request error after 1 attempt(s): unexpected HTTP status 404 Not Found
```

This is the error that stems from the driver attempting to stage the file, and I believe it is the actual culprit. What puzzles me is why the driver is unable to fetch metadata about the first request it sent in order to verify whether that request was indeed a file staging request. The relevant code for this error is here: https://github.com/databricks/databricks-sql-go/blob/714e2643455127e45df6e93ec8c8df903e40794f/connection.go#L555-L562

I'm starting to think this might be a bug in either the driver or the interaction between the driver and the server: the driver wants to fetch metadata to see if the previous request was for staging a file, but the server does not recognise the queryId as valid and claims it does not exist.

Note that I have tested this directly with examples/staging.go and get the same error, so that file can serve as a minimal reproducible example.

Any help to understand the issue and resolve it is appreciated!

yunbodeng-db commented 1 year ago

@nithinkdb, I think the server version needs to be 14.2, or 14.1.x (the next maintenance build). Can you clarify?

mdibaiee commented 1 year ago

@yunbodeng-db how can I check the server version, and how can I change it?

mdibaiee commented 1 year ago

For the time being, we have switched to using the Databricks Go SDK and its FilesAPI interface to upload our files: https://pkg.go.dev/github.com/databricks/databricks-sdk-go

https://pkg.go.dev/github.com/databricks/databricks-sdk-go@v0.24.0/service/files#FilesAPI.Upload

yunbodeng-db commented 1 year ago

Is your workspace in Azure or AWS? I think it should work on AWS. I am checking on the status for Azure workspaces.


mdibaiee commented 1 year ago

@yunbodeng-db ours is on AWS

yunbodeng-db commented 1 year ago

13.2+ should support it. You can check out our website.


yunbodeng-db commented 1 year ago

I think you need to request to have the workspace added to a whitelist: https://docs.databricks.com/en/_extras/documents/best-practices-ingestion-partner-volumes.pdf

yunbodeng-db commented 1 year ago

You can run `select current_version()` in a notebook or via the driver. I asked the team; it looks like you need to talk to your rep to get your account whitelisted, even if you are on AWS.

mdibaiee commented 1 year ago

@yunbodeng-db since we want to run these queries on our customers' workspaces, ideally we don't want to rely on features that sit behind whitelists or need to be manually enabled. So far our usage of the Files API for uploading files is working fine, so we are not going to use the PUT interface.

zuckerberg-db commented 9 months ago

@mdibaiee this issue should be resolved with today's GA release of UC Volumes. Can you verify?

mdibaiee commented 8 months ago

@zuckerberg-db Hi, we tried switching to PUT today and have found that PUT queries intermittently just hang forever, never resolving.

We initially were able to upload some files using PUT successfully, as seen here:

[screenshot: successful PUT uploads]

But on subsequent runs of the same code, we now see:

[screenshot: PUT query still running]

Note that the query started at 14:20 and the time of the screenshot is 14:28, so the query had been running for 7-8 minutes with no result.

yunbodeng-db commented 8 months ago

Can you work with your Databricks rep to file a support report? We would ask for workspace info and the complete statement IDs. It's unlikely this is a driver issue, since the queries were stuck on the server, but we can help find the root cause.

mdibaiee commented 8 months ago

Update on this: the root cause of the hangs was found to be a bug in Databricks file uploads, and a fix is planned for the next release.

Until then, the workaround we use is to provide the OVERWRITE=TRUE option in the PUT statement, which bypasses the buggy code path.
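A sketch of that workaround (paths are placeholders; the `OVERWRITE=TRUE` token is taken from this thread, and depending on driver version a bare `OVERWRITE` may be what the statement parser expects):

```go
package main

import "fmt"

// putOverwriteStatement builds the PUT with the OVERWRITE=TRUE option this
// thread reports as the workaround for the hanging-upload bug.
func putOverwriteStatement(localPath, volumePath string) string {
	return fmt.Sprintf("PUT '%s' INTO '%s' OVERWRITE=TRUE", localPath, volumePath)
}

func main() {
	stmt := putOverwriteStatement("/path/to/local/file.csv",
		"/Volumes/catalog/schema/flow_staging/file.csv")
	fmt.Println(stmt)
	// Execute via database/sql's ExecContext against the databricks driver,
	// with the allowed local staging paths supplied through driverctx.
}
```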