databricks / databricks-sdk-py

Databricks SDK for Python (Beta)
https://databricks-sdk-py.readthedocs.io/
Apache License 2.0

[ISSUE] w.dbutils.fs.ls fails when using databricks SDK inside another workspace #664

Closed mfilan closed 1 month ago

mfilan commented 1 month ago

Description

After authenticating a WorkspaceClient with a token for another workspace, w.clusters.list() correctly lists the clusters of the connected workspace, whereas w.dbutils.fs.ls() lists files from the current workspace (the one the code runs in).

Reproduction

from databricks.sdk import WorkspaceClient

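# HOST and TOKEN are the URL and access token of the connected workspace
w = WorkspaceClient(
    host=HOST,
    token=TOKEN,
)
for c in w.clusters.list():
    print(c.cluster_name)  # works correctly: clusters of the connected workspace
w.dbutils.fs.ls('FileStore/')  # lists files from the current workspace instead of the connected one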

Expected behavior

w.dbutils.fs.ls() lists the files present in the connected workspace.


mfilan commented 1 month ago

Never mind, I solved it by monkey-patching the client's dbutils:

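from databricks.sdk import WorkspaceClient, dbutils as sdk_dbutils
import databricks.sdk.core as client

# HOST and TOKEN are the URL and access token of the connected workspace
w = WorkspaceClient(
    host=HOST,
    token=TOKEN,
)

def _make_dbutils(config: client.Config):
    # Always build the remote implementation from the given config
    return sdk_dbutils.RemoteDbUtils(config)

# Overwrite the client's private _dbutils attribute (not a public API)
w._dbutils = _make_dbutils(w._config)
for c in w.clusters.list():
    print(c.cluster_name)
w.dbutils.fs.ls('/FileStore/')  # now lists files from the connected workspace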
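
For anyone reusing this, the patch above could be wrapped in a small helper. This is only a sketch: `force_remote_dbutils` is a hypothetical name, not part of the SDK, and it relies on the same private attributes (`_config`, `_dbutils`) used in the workaround, which may change between SDK versions. The stand-in client below exists only so the snippet runs without the SDK or a workspace.

```python
from types import SimpleNamespace
from typing import Any, Callable


def force_remote_dbutils(workspace_client: Any, remote_factory: Callable[[Any], Any]) -> Any:
    """Rebuild the client's dbutils from the client's own config.

    Assumes the client exposes the private attributes `_config` and
    `_dbutils`, as in the workaround above; these are not a public API.
    """
    workspace_client._dbutils = remote_factory(workspace_client._config)
    return workspace_client


# Stand-in objects (hypothetical) so the sketch runs without the SDK.
fake_client = SimpleNamespace(
    _config={"host": "https://example.invalid"},
    _dbutils="runtime dbutils",
)
force_remote_dbutils(fake_client, lambda cfg: ("remote dbutils", cfg["host"]))
print(fake_client._dbutils)
```

With the real SDK, the call would presumably be `force_remote_dbutils(w, sdk_dbutils.RemoteDbUtils)`, using the same imports as in the workaround.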