Closed: andrewnester closed this issue 2 months ago
@GeroSalas could you please describe what the failure is and what error you get?
@andrewnester sure!
Basically, I defined an env var ${var.signol_lib_package_version} with the value signol_lib-0.4.4-20240822+prod-py3-none-any.whl, and I reference it dynamically in the YAML of the job tasks where I require it, like below:
libraries:
- whl: ${workspace.root_path}/files/dist/${var.signol_lib_package_version}
pyproject.toml
signol-lib = {path = "dist/signol_lib-0.4.4-20240822+prod-py3-none-any.whl"}
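(For context, a minimal sketch of how such a bundle variable could be declared in databricks.yml; the description text is an assumption, while the variable name and default value come from the message above.)
variables:
  signol_lib_package_version:
    description: File name of the signol_lib wheel to install   # assumed description
    default: signol_lib-0.4.4-20240822+prod-py3-none-any.whl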
The whole setup was working fine: job tasks were executing and the compute was installing the right libs. But it suddenly stopped working, and when I went into the CI/CD logs I saw these two new lines in my AWS CodeBuild logs:
Building signol_lib...
Uploading signol_lib-0.4.4-py3-none-any.whl...
And definitely yes, the signol_lib package was overwritten again: the recently introduced change picked up the wheel but referenced the wrong full wheel name and re-uploaded it incorrectly, so my tasks can no longer find the correct one.
@GeroSalas Just to confirm a few things: do you have an artifacts section in your bundle configuration defined for this library? And which CLI version are you using?
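(For reference, an artifacts section for a wheel built from the bundle root typically looks something like the sketch below; the artifact name, build command, and path here are assumptions, not taken from this thread.)
artifacts:
  signol_lib:                          # assumed artifact name
    type: whl
    build: python -m build --wheel     # assumed build command
    path: .                            # assumed path to the project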
@andrewnester v0.226.0
@GeroSalas can you please share the full bundle YAML configuration? I can't seem to reproduce the error so far, so something else might be missing.
Also, do you have a setup.py file in your bundle root directory?
@andrewnester
I'm experiencing a similar issue to @GeroSalas: none of our integration tests can be validated, deployed, or run. Everything was working fine in v0.226.0, but since updating to v0.227.0 it stopped working and gives me Error: Python wheel tasks require compute with DBR 13.3+ to include local libraries. Please change your cluster configuration or use the experimental 'python_wheel_wrapper' setting. See https://docs.databricks.com/dev-tools/bundles/python-wheel.html for more information.
Rolling back to v0.226.0 for now, as that still works. (Thus, the error handling also seems incomplete/incorrect, as DBR has nothing to do with this.)
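(The setting named in that error message lives under the experimental block of databricks.yml; a minimal sketch, assuming one actually wanted to opt into the wheel wrapper rather than roll back:)
experimental:
  python_wheel_wrapper: true   # wraps Python wheel tasks for compatibility with older DBR versions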
We do have a setup.py in the bundle's root directory for another purpose, but we run the DAB from a direct Git link. The branch the DAB is based on is passed in via a variable.
Relevant code snippets:
databricks.yml
bundle:
  name: wise

include:
  - resources/*.yml

variables:
  integration_branch:
    description: The source branch of the PR for the integration test
    default: development

targets:
  integration:
    mode: production
    git:
      branch: ${var.integration_branch}
    workspace:
      host: <host>
      root_path: /Shared/${bundle.name}/${bundle.target}
    run_as:
      user_name: <user>
azure-pipelines.yml
resources:
  containers:
    - container: pycontainer
      image: databricksruntime/standard:10.4-LTS

steps:
  - script: |
      cd ./wise_bundle/wise
      export DATABRICKS_HOST=$(URL)
      export DATABRICKS_TOKEN=$(PAT)
      echo $(System.PullRequest.TargetBranch)
      if [ "$(System.PullRequest.TargetBranch)" == "refs/heads/development" ]
      then
        echo "Deploying bundles to the integration environment"
        # Get the source branch
        branch=$(System.PullRequest.SourceBranch)
        echo $branch
        branch_name=${branch#refs/heads/}
        echo $branch_name
        # Deploy integration workflow using the source branch
        databricks bundle validate -t integration
        databricks bundle deploy -t integration --force-lock --var="integration_branch=$branch_name"
        databricks bundle run -t integration <flow_name> --no-wait
      fi
    workingDirectory: $(Build.SourcesDirectory)
    target: pycontainer
    displayName: "Run integration test"
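(A note on the script above: branch_name=${branch#refs/heads/} is standard shell prefix removal, turning refs/heads/my-feature into my-feature; that short name is then passed to the bundle with --var="integration_branch=...", overriding the development default declared in databricks.yml.)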
@mfleuren do you have a libraries or environments section in your DABs config files where you reference any libraries? Could you share this section?
@andrewnester we do have libraries defined for individual tasks, for instance:
- task_key: deploy_model_on_dsia
  depends_on:
    - task_key: daily_inference
  notebook_task:
    notebook_path: wise_bundle/wise/src/deployment/deploy_api
    source: GIT
  job_cluster_key: wise_etl_cluster
  max_retries: 1
  min_retry_interval_millis: 60000
  libraries:
    - pypi:
        package: ${var.msal-package}
    - pypi:
        package: ${var.requests-package}
The packages themselves are defined centrally, in this case:
requests-package:
  description: PyPi package
  default: requests==2.25.1
msal-package:
  description: PyPi package
  default: msal==1.28.0
All defined packages are publicly available on PyPI.
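(Presumably these entries sit under the same top-level variables block as integration_branch in databricks.yml; a sketch of that assumed layout:)
variables:
  integration_branch:
    description: The source branch of the PR for the integration test
    default: development
  requests-package:
    description: PyPi package
    default: requests==2.25.1
  msal-package:
    description: PyPi package
    default: msal==1.28.0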
Thanks for the details! Indeed, it's a bug on our side; it is fixed in https://github.com/databricks/cli/pull/1717 and will be included in the next CLI release.
@andrewnester awesome, thanks!
Changes
Improves detection of PyPi package names in environment dependencies
Tests
Added unit tests
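(For context on the fix: "environment dependencies" refers to the environments section of a job definition, where PyPI requirement specifiers and local wheel paths are listed directly rather than under task-level libraries; a minimal sketch with assumed keys and example values, not taken from this thread:)
environments:
  - environment_key: default
    spec:
      client: "1"                                    # assumed environment client version
      dependencies:
        - requests==2.25.1                           # PyPI requirement specifier
        - ./dist/my_package-0.1.0-py3-none-any.whl   # local wheel path (hypothetical)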