databricks / databricks-sdk-py

Databricks SDK for Python (Beta)
https://databricks-sdk-py.readthedocs.io/
Apache License 2.0

[ISSUE] InvalidParameterValue: Path must be absolute: \mnt #660

Open ridvansg opened 4 months ago

ridvansg commented 4 months ago

Description

If I check whether a DBFS folder exists on a Databricks instance with code like client.dbfs.exists('/mnt'), I get the error "InvalidParameterValue: Path must be absolute: \mnt". Note that the forward slash I passed in has become a backslash in the error message.

The issue happens on Windows only.
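One plausible explanation, which this report does not confirm, is that the path is passed through a platform-dependent `pathlib` class before it is sent to the API. The sketch below uses only the standard library to show how the string `/mnt` would come out as `\mnt` under Windows path rules; the variable names are illustrative and nothing here is taken from the SDK source.

```python
from pathlib import PurePosixPath, PureWindowsPath

remote = "/mnt"

# The Pure* classes apply one platform's rules regardless of the host OS,
# so this snippet behaves the same on Windows, Linux, and macOS.
as_windows = str(PureWindowsPath(remote))  # Windows rules: separators become backslashes
as_posix = str(PurePosixPath(remote))      # POSIX rules: the string is kept as-is

print(as_windows)  # \mnt
print(as_posix)    # /mnt

# as_posix() converts a Windows-style path back to forward slashes.
restored = PureWindowsPath(remote).as_posix()
print(restored)    # /mnt
```

If this is what happens, treating remote DBFS paths with POSIX rules on every platform would avoid the conversion.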

Reproduction

  1. On Windows, start IPython.
  2. Run the following code:

         import logging
         import argparse
         from databricks.sdk import WorkspaceClient
         from databricks.sdk.service.compute import DataSecurityMode, RuntimeEngine, Library
         from datetime import timedelta
         import os

         client = WorkspaceClient()
         client.dbfs.exists('/mnt')

  3. Error message: InvalidParameterValue: Path must be absolute: \mnt

Expected behavior

"true" or "false", depending on whether the path exists on Databricks.

Is it a regression?

I have not tested earlier versions.

Debug Logs

Cell In[11], line 1
----> 1 client.dbfs.exists('/mnt')

File C:\ProgramData\anaconda3\envs\databricks2\lib\site-packages\databricks\sdk\mixins\files.py:572, in DbfsExt.exists(self, path)
    570 """If file exists on DBFS"""
    571 p = self._path(path)
--> 572 return p.exists()

File C:\ProgramData\anaconda3\envs\databricks2\lib\site-packages\databricks\sdk\mixins\files.py:490, in _DbfsPath.exists(self)
    488 def exists(self) -> bool:
    489     try:
--> 490         self._api.get_status(self.as_string)
    491         return True
    492     except NotFound:

File C:\ProgramData\anaconda3\envs\databricks2\lib\site-packages\databricks\sdk\service\files.py:624, in DbfsAPI.get_status(self, path)
    621 if path is not None: query['path'] = path
    622 headers = {'Accept': 'application/json', }
--> 624 res = self._api.do('GET', '/api/2.0/dbfs/get-status', query=query, headers=headers)
    625 return FileInfo.from_dict(res)

File C:\ProgramData\anaconda3\envs\databricks2\lib\site-packages\databricks\sdk\core.py:132, in ApiClient.do(self, method, path, query, headers, body, raw, files, data, response_headers)
    128 headers['User-Agent'] = self._user_agent_base
    129 retryable = retried(timeout=timedelta(seconds=self._retry_timeout_seconds),
    130                     is_retryable=self._is_retryable,
    131                     clock=self._cfg.clock)
--> 132 response = retryable(self._perform)(method,
    133                                     path,
    134                                     query=query,
    135                                     headers=headers,
    136                                     body=body,
    137                                     raw=raw,
    138                                     files=files,
    139                                     data=data)
    141 resp = dict()
    142 for header in response_headers if response_headers else []:

File C:\ProgramData\anaconda3\envs\databricks2\lib\site-packages\databricks\sdk\retries.py:54, in retried.<locals>.decorator.<locals>.wrapper(*args, **kwargs)
     50 retry_reason = f'{type(err).__name__} is allowed to retry'
     52 if retry_reason is None:
     53     # raise if exception is not retryable
---> 54     raise err
     56 logger.debug(f'Retrying: {retry_reason} (sleeping ~{sleep}s)')
     57 clock.sleep(sleep + random())

File C:\ProgramData\anaconda3\envs\databricks2\lib\site-packages\databricks\sdk\retries.py:33, in retried.<locals>.decorator.<locals>.wrapper(*args, **kwargs)
     31 while clock.time() < deadline:
     32     try:
---> 33         return func(*args, **kwargs)
     34     except Exception as err:
     35         last_err = err

File C:\ProgramData\anaconda3\envs\databricks2\lib\site-packages\databricks\sdk\core.py:243, in ApiClient._perform(self, method, path, query, headers, body, raw, files, data)
    239 if not response.ok: # internally calls response.raise_for_status()
    240     # TODO: experiment with traceback pruning for better readability
    241     # See https://stackoverflow.com/a/58821552/277035
    242     payload = response.json()
--> 243     raise self._make_nicer_error(response=response, **payload) from None
    244 # Private link failures happen via a redirect to the login page. From a requests-perspective, the request
    245 # is successful, but the response is not what we expect. We need to handle this case separately.
    246 if _is_private_link_redirect(response):

InvalidParameterValue: Path must be absolute: \mnt

Other Information

Additional context

If I manually add the following lines at the top of do() in Lib\site-packages\databricks\sdk\core.py, the call returns the correct 'true' value. The logger.warning call is only there for debugging; the fix is the replacement of backslashes in query['path'].

    def do(self,
           method: str,
           path: str,
           query: dict = None,
           headers: dict = None,
           body: dict = None,
           raw: bool = False,
           files=None,
           data=None,
           response_headers: List[str] = None) -> Union[dict, BinaryIO]:
        logger.warning(f"AAA path:{path}; method:{method}; query:{query}; headers:{headers}; body:{body}; raw:{raw}; files:{files}; data:{data}; response_headers:{response_headers}")
        if 'path' in query:
            query['path'] = query['path'].replace('\\', '/')
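The same normalization can be sketched as a standalone function, which makes it easier to test outside the SDK. `normalize_query_path` is a hypothetical helper written for this illustration and is not part of databricks-sdk-py. It also handles `query` being `None` (the default in the `do()` signature), a case in which `'path' in query` would raise a `TypeError`.

```python
from typing import Optional


def normalize_query_path(query: Optional[dict]) -> Optional[dict]:
    """Return a copy of `query` whose 'path' value uses forward slashes.

    Hypothetical helper mirroring the workaround above; not an SDK function.
    """
    # Nothing to do for a missing or empty query, or one without a 'path' key.
    if not query or "path" not in query:
        return query
    fixed = dict(query)  # copy so the caller's dict is not mutated
    fixed["path"] = str(fixed["path"]).replace("\\", "/")
    return fixed


print(normalize_query_path({"path": "\\mnt"}))  # {'path': '/mnt'}
print(normalize_query_path(None))               # None
```

This only masks the symptom at the request layer. If the backslash is introduced earlier, where the path string is built, fixing it there would be the cleaner change.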