databricks / cli


Databricks Bundle Deploy fails from within Databricks for databricks notebooks. #1263

Open · vitoravancini opened 3 months ago

vitoravancini commented 3 months ago

Describe the issue

I'm trying to deploy an asset bundle from a notebook in a Databricks workspace. The company I'm working at has a policy that all Databricks development has to be done from within Databricks, so using local IDEs or terminals is not an option for me.

Resources pointing to Python files and notebooks created outside Databricks (i.e. .ipynb files) seem to work. When a resource points to a Databricks notebook, the command

`databricks bundle deploy`

fails with the following error:

[screenshot of the error]

Configuration

Using the standard Python template config and importing the project into Databricks reproduces the error.

Steps to reproduce the behavior


  1. Run `databricks bundle init default-python`
  2. Push it to a repo and add it to Databricks Repos
  3. Create a notebook 'runner' at the root of the repo with the following content:

```
!curl -fsSL https://raw.githubusercontent.com/databricks/setup-cli/main/install.sh | sh

import os
myToken = dbutils.notebook.entry_point.getDbutils().notebook().getContext().apiToken().getOrElse(None)
os.environ['DATABRICKS_HOST'] = ""
os.environ['DATABRICKS_TOKEN'] = myToken

!/root/bin/databricks bundle deploy
```

  4. Run the 'runner' notebook and see the error:

[screenshot of the error]

Expected Behavior

The deploy should have succeeded.

Actual Behavior

[screenshot of the error output]

OS and CLI version

Databricks CLI v0.215.0

Databricks Runtime Version: 14.3 LTS ML (includes Apache Spark 3.5.0, Scala 2.12)

Debug Logs

[screenshot of the debug logs]

Some extra info: Databricks notebooks seem to always have 0 bytes:

[screenshot showing the 0-byte notebooks]
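
For reference, here is a minimal sketch of how that can be checked from a notebook cell; the repo path is only an example and has to be adjusted to your workspace:

```python
import os

# Example path only -- adjust to wherever the repo is checked out in your workspace.
repo_root = "/Workspace/Repos/<user>/dab_issue_example"

# A regular file such as the bundle config reports its real size through the FUSE mount...
print(os.path.getsize(f"{repo_root}/databricks.yml"))

# ...while the 'runner' notebook at the repo root shows up as 0 bytes, which is
# presumably why the CLI cannot upload it during `bundle deploy`.
print(os.path.getsize(f"{repo_root}/runner"))
```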

I've also committed the repo here: https://github.com/vitoravancini/dab_issue_example

To use it, one would have to change the Databricks host and user in the YAML bundle files.

pedroportelas commented 3 months ago

> The company I'm working at has a policy that all Databricks development has to be done from within Databricks, so using local IDEs or terminals is not an option for me.

Same here, and I also couldn't get past this error. We can't really use bundles until this is addressed.

pietern commented 3 months ago

Thanks for reporting. We're aware this is not possible today.

The underlying issue is that notebooks specifically cannot be read through the FUSE mount of the workspace filesystem, and we don't have a workaround for this yet. If you avoid notebooks and only use Python files, this should work. This will be fixed, but we cannot commit to a timeline yet.

theresaham-db commented 3 months ago

Hi @vitoravancini and @pedroportelas,

It would be great to hear in more detail about the problem and your use cases, since we are working on improving the developer experience. If you are happy to set up a call to share more feedback with us, please reach out at theresa . hammer_at_databricks . com (please remove spaces). Thanks!

vitoravancini commented 3 months ago

Will do!

Next week I'll try to expand on my PoC using bundles, and I figure I will have more opinions and questions by the end of the week.

Some time later this month would be a good moment to set up a call, thank you!

theresaham-db commented 3 months ago

Perfect, thank you!

hsechier commented 1 month ago

Hello, with the new DBR versions (> 15.0), I can deploy a bundle from within Databricks by opening the terminal:

[screenshot of the deploy output]

The only error I have is when I add a notebook (a Databricks notebook with a .py extension): I get an "operation not supported" error. You can see it in the first deploy I did. When I remove the notebook with the .py extension, it works (second deploy).

Strange, because my plain Python scripts also have the .py extension and they work...
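
The difference is most likely not the .py extension itself but whether the workspace treats the file as a notebook: a Databricks notebook stored in source format starts with a `# Databricks notebook source` marker on its first line, while a plain script does not, and only notebook files seem to hit the limitation described above. A hypothetical sketch of that check (the path is an example only):

```python
# Hypothetical helper: decide whether a .py file is a Databricks source-format notebook
# by looking at its first line.
def is_databricks_notebook(path: str) -> bool:
    with open(path, encoding="utf-8") as f:
        return f.readline().strip() == "# Databricks notebook source"

# Note: through the /Workspace FUSE mount a notebook may read back empty (see the
# 0-byte behaviour above), so this check is most reliable on a Git checkout of the repo.
print(is_databricks_notebook("/path/to/my_project/src/some_file.py"))
```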

pietern commented 1 month ago

Thanks for the ping on this thread!

We're currently working on fixing this. You can find relevant work (in case you're interested) in https://github.com/databricks/cli/pull/1452 and https://github.com/databricks/cli/pull/1457. In those PRs we're preparing to go through the Workspace APIs for all filesystem interactions instead of the local filesystem mount (under /Workspace). That lets us virtually include extensions for notebooks, as they would appear in a Git repository. It's not entirely done yet; I'll try to remember to ping this thread when everything works and is released.
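
For context, a rough sketch of what reading a notebook through the Workspace API (rather than the FUSE mount) looks like from a notebook cell; the databricks-sdk client, the path, and the environment variables are assumptions based on the 'runner' notebook above, not part of the linked CLI changes:

```python
import base64

from databricks.sdk import WorkspaceClient
from databricks.sdk.service.workspace import ExportFormat

# Assumes DATABRICKS_HOST / DATABRICKS_TOKEN are set as in the 'runner' notebook above.
w = WorkspaceClient()

# Workspace API paths omit the /Workspace mount prefix; this path is an example only.
resp = w.workspace.export("/Repos/<user>/dab_issue_example/runner", format=ExportFormat.SOURCE)

# The exported content comes back base64-encoded and includes the notebook source,
# even though the same notebook shows up as 0 bytes through the FUSE mount.
print(base64.b64decode(resp.content).decode("utf-8"))
```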