databrickslabs / cicd-templates

Manage your Databricks deployments and CI with code.

com.databricks.backend.daemon.data.common.InvalidMountException #23

Closed brett-matthews closed 4 years ago

brett-matthews commented 4 years ago

When Databricks runs the Job on the cluster, we get the following error:

Message: Library installation failed for library due to infra fault for whl: "dbfs:/databricks/mlflow-tracking/x/x/artifacts/dist/pipeline-0.1.0-py3-none-any.whl" . Error messages: java.lang.RuntimeException: ManagedLibraryInstallFailed: com.google.common.util.concurrent.UncheckedExecutionException: com.databricks.backend.daemon.data.common.InvalidMountException: Error while using path /databricks/mlflow-tracking/x/x/artifacts/dist/pipeline-0.1.0-py3-none-any.whl for resolving path '/x/x/artifacts/dist/pipeline-0.1.0-py3-none-any.whl' within mount at '/databricks/mlflow-tracking'. for library:PythonWhlId(dbfs:/databricks/mlflow-tracking/x/x/artifacts/dist/pipeline-0.1.0-py3-none-any.whl,,NONE),isSharedLibrary=false

Does this have anything to do with the following?

https://kb.databricks.com/machine-learning/mlflow-artifacts-no-client-error.html#invalid-mount-exception

Is this commit related to the issue? We pulled it, but it was still failing:

https://github.com/databrickslabs/cicd-templates/commit/4536390dabe8549f5ccaf3df73636e09d57ee89a

brett-matthews commented 4 years ago

Changed deployment.py to copy the project .whl file out of the MLflow artifact store and put it on an accessible DBFS path. Hacked this together (in def log_artifacts()):

```python
if libraries:
    import base64

    # Read the built wheel from dist/ and base64-encode it for the DBFS
    # put API ('file', 'ext' and 'libraries' come from the surrounding
    # log_artifacts() code in deployment.py).
    with open('dist/{}'.format(file), 'rb') as dist_file_p:
        encoded_string = base64.b64encode(dist_file_p.read())

    payload = {
        'path': '/tmp/{}'.format(file),
        'contents': encoded_string.decode('utf-8')
    }

    # Upload the wheel to dbfs:/tmp via the DBFS REST API.
    apiClient = getDatabricksAPIClient()
    apiClient.perform_query(
        method='POST', path='/dbfs/put', data=payload)

    # Register the library from the copied DBFS path instead of the
    # mlflow-tracking artifact path.
    libraries.append({ext[1:]: 'dbfs:/tmp/{}'.format(file)})
    # libraries.append({ext[1:]: dist_file})
```
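Note that the inline `contents` form of `/dbfs/put` is limited to 1 MB, so a wheel that bundles dependencies can exceed it. A rough sketch of the same upload through the streaming DBFS endpoints (create / add-block / close) instead; the helper name and chunk size are illustrative, and `api_client` is the same client as above:

```python
import base64

CHUNK = 1024 * 1024  # the DBFS API caps each block at 1 MB

def put_file_streaming(api_client, local_path, dbfs_path):
    # Open a write handle on DBFS, overwriting any previous file.
    handle = api_client.perform_query(
        method='POST', path='/dbfs/create',
        data={'path': dbfs_path, 'overwrite': True})['handle']
    with open(local_path, 'rb') as f:
        while True:
            block = f.read(CHUNK)
            if not block:
                break
            # Blocks are sent base64-encoded, like /dbfs/put's contents.
            api_client.perform_query(
                method='POST', path='/dbfs/add-block',
                data={'handle': handle,
                      'data': base64.b64encode(block).decode('utf-8')})
    # Close the handle to flush the file to DBFS.
    api_client.perform_query(
        method='POST', path='/dbfs/close', data={'handle': handle})
```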

This gets further, but then the pipeline_runner files are also fetched from the "mlflow-tracking" path, e.g.:

Message: Cannot read the python file dbfs:/databricks/mlflow-tracking/x/x/artifacts/job/pipelines/pipeline2/pipeline_runner.py. Please check driver logs for more details.
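A hypothetical patch in the same spirit would copy the runner script to an accessible DBFS path too and point the job's `spark_python_task` at the copy; `job_spec` and the paths below are illustrative, and `apiClient` is the same client as above:

```python
import base64

# Hypothetical: upload pipeline_runner.py the same way as the wheel.
runner_local = 'pipelines/pipeline2/pipeline_runner.py'
with open(runner_local, 'rb') as runner_p:
    payload = {
        'path': '/tmp/pipeline_runner.py',
        'contents': base64.b64encode(runner_p.read()).decode('utf-8'),
        'overwrite': True
    }
apiClient.perform_query(method='POST', path='/dbfs/put', data=payload)

# Hypothetical: reference the copy in the job spec rather than the
# dbfs:/databricks/mlflow-tracking/... artifact path.
job_spec = {
    'spark_python_task': {'python_file': 'dbfs:/tmp/pipeline_runner.py'}
}
```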

brett-matthews commented 4 years ago

Had to delete the existing experiment in Databricks after commit https://github.com/databrickslabs/cicd-templates/commit/4536390dabe8549f5ccaf3df73636e09d57ee89a, and it seemed to work from there.
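For reference, a minimal sketch (using the mlflow Python client; the experiment name and path are illustrative) of creating the experiment with an explicit artifact location on plain DBFS, so artifacts don't land under the `dbfs:/databricks/mlflow-tracking` mount that the cluster's library installer couldn't resolve:

```python
import mlflow

experiment_name = '/Shared/cicd-pipeline'
artifact_location = 'dbfs:/Shared/cicd-pipeline-artifacts'

# Create the experiment once with a pinned artifact location; subsequent
# runs just attach to it by name.
if mlflow.get_experiment_by_name(experiment_name) is None:
    mlflow.create_experiment(experiment_name, artifact_location=artifact_location)
mlflow.set_experiment(experiment_name)
```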