Open michaelschuett-tomtom opened 3 weeks ago
This is resolved with https://github.com/databricks/cli/pull/1708
Would be awesome to get this upstreamed. I have started to build some bazel rules around this so you can start doing stuff like this.
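databricks(
    name = "deploy_dev",
    outs = [],
    args = [
        "bundle",
        "deploy",
        "--target", "dev",
    ],
    # Variables that must be set before the command runs.
    required_vars = [
        "docker_client_id",
        "docker_client_secret",
    ],
    # Files and rule outputs that Bazel symlinks into the sandbox.
    srcs = [
        ":databricks_files",
        ":some_notebook",
        ":internal_wheel",
    ],
)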
However it feels a little strange to open source this and have it pointing to my internal builds of the databricks CLI.
@michaelschuett-tomtom what is the reason you use symlinks in your bundle? I assume it is to link content outside of the bundle root? If so, the latest CLI (0.227.0) has new functionality, sync.paths, which allows you to sync files outside of the bundle root.
The goal here is mainly to make databricks play nicely with the bazel build system. The current problem is that when bazel builds its sandbox for commands to run in, it creates an ugly path under /private/var/tmp/_bazel_username/${commit hash}/... and symlinks in the dependent files, which may be outputs from other bazel rules or just static files in your repo.
Here is an example of what the directory might look like.
ls -lah
total 0
drwxr-xr-x@ 10 schuettm wheel 320B Aug 27 13:25 .
drwxr-xr-x@ 4 schuettm wheel 128B Aug 27 13:25 ..
lrwxr-xr-x@ 1 schuettm wheel 168B Aug 22 11:38 create_tasks -> /private/var/tmp/_bazel_schuettm/7f7b24c2ab40dffefa912dc1d5931ddc/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/workflows/databricks/somepath/create_tasks
lrwxr-xr-x@ 1 schuettm wheel 87B Aug 22 11:38 databricks.yml -> /Users/schuettm/Code/repo/workflows/databricks/somepath/databricks.yml
lrwxr-xr-x@ 1 schuettm wheel 169B Aug 22 11:38 deploy_dev.sh -> /private/var/tmp/_bazel_schuettm/7f7b24c2ab40dffefa912dc1d5931ddc/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/workflows/databricks/somepath/deploy_dev.sh
drwxr-xr-x@ 3 schuettm wheel 96B Aug 22 11:38 fixtures
drwxr-xr-x@ 3 schuettm wheel 96B Aug 27 09:14 resources
lrwxr-xr-x@ 1 schuettm wheel 167B Aug 22 11:38 select -> /private/var/tmp/_bazel_schuettm/7f7b24c2ab40dffefa912dc1d5931ddc/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/workflows/databricks/somepath/select
drwxr-xr-x@ 5 schuettm wheel 160B Aug 22 21:51 src
drwxr-xr-x@ 3 schuettm wheel 96B Aug 27 10:41 whls
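To see which entries in a tree like the one above are symlinks (and so would be skipped on upload), a small check can be run before deploying. This is only a sketch: `find_symlinks` is a hypothetical helper name, not part of the databricks CLI or of the linked PR.

```python
import os
from pathlib import Path


def find_symlinks(root):
    """Return (relative path, resolved target) for every symlink under root."""
    root = Path(root)
    found = []
    # os.walk does not descend into symlinked directories by default,
    # so each link is reported once and never traversed.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = Path(dirpath) / name
            if path.is_symlink():
                target = Path(os.path.realpath(path))
                found.append((path.relative_to(root), target))
    return sorted(found)
```

Printing the result for the sandbox directory would list entries such as create_tasks, databricks.yml, deploy_dev.sh and select together with the paths they point to.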
The title of this could likely be "I want to make databricks asset bundles work with bazel". The sync.paths you mentioned does sound like it has some, but not complete, overlap with what I am trying to achieve inside bazel. The initial reason for porting the databricks command to a bazel rule was so that wheel files the bundle depends on could be built and inserted into the asset bundle with one command, greatly improving our current workflow of building and publishing a new package and then updating the notebook to the newly created version.
Just a bump on this to try to keep it from becoming stale, since I currently have the time to work on or modify the linked PR, provided upstream is willing to accept it.
@michaelschuett-tomtom Thanks for posting the issue and including the rationale.
It's great to hear you're looking to make the CLI work well with Bazel.
There are a couple of reasons why we ignore symlinks:
This doesn't help in building working Bazel rules, of course. But if you really only care about locally unrolling the symlink tree that Bazel builds, an alternative could be to run rsync with -L (or --copy-links) to create a symlink-free copy of the tree in a temporary directory, and then run the CLI. Would that work?
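For anyone who wants to script that workaround, here is a minimal sketch of the same idea in Python rather than rsync. The helper names `materialize_bundle` and `deploy` are made up for illustration, and the sketch assumes the `databricks` binary is on PATH and that the bundle can be deployed from a copied directory.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def materialize_bundle(bundle_root):
    """Copy bundle_root to a temp dir, replacing symlinks with their contents."""
    staging = Path(tempfile.mkdtemp(prefix="bundle_")) / "bundle"
    # symlinks=False makes copytree copy what each link points to,
    # which has the same effect as rsync -L / --copy-links.
    shutil.copytree(bundle_root, staging, symlinks=False)
    return staging


def deploy(bundle_root, target="dev"):
    """Run the deploy command from this issue against the symlink-free copy."""
    staging = materialize_bundle(bundle_root)
    try:
        subprocess.run(
            ["databricks", "bundle", "deploy", "--target", target],
            cwd=staging,
            check=True,
        )
    finally:
        # Remove the temporary copy whether or not the deploy succeeded.
        shutil.rmtree(staging.parent, ignore_errors=True)
```

Calling `deploy("path/to/sandbox/dir")` would copy the tree and deploy from the copy. One trade-off is that anything the CLI writes into the bundle directory during the run is discarded along with the temporary copy.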
Describe the issue
I have a symlink in my directory and it is silently ignored when uploading to the files folder. The databricks.yml file and the resources that it loads are symlinks as well, however it is able to read them. I can't find any docs about this, but I likely missed something. Is there any reason that symlinks are not supported?
Configuration
Create a symlink, run databricks bundle deploy, and see that it is missing.
Steps to reproduce the behavior
Please list the steps required to reproduce the issue, for example:
databricks bundle deploy ...
databricks bundle run ...
Expected Behavior
A warning is output or better yet it just uploads the file.
Actual Behavior
It is silently ignored.
OS and CLI version
macOS, all versions
Is this a regression?
no