Open moltar opened 10 months ago
Thanks for raising this and providing a link back to the suspected code @moltar
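Three questions:
1. How is the symlink being created?
2. Are you using the packages-install-path within dbt_project.yml?
3. Does the "Possible workaround" below work for you?

I got the same error as you when I did the following in a zsh shell in macOS: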
mkdir -p .cache/dbt_packages
ln -s .cache/dbt_packages dbt_packages
Use dbt_packages as the packages-install-path within dbt_project.yml:
packages-install-path: dbt_packages
Install dependencies
dbt deps
File "/Users/dbeatty/projects/environments/postgres_1.7/lib/python3.10/site-packages/dbt/task/deps.py", line 228, in run
system.rmtree(self.project.packages_install_path)
File "/Users/dbeatty/projects/environments/postgres_1.7/lib/python3.10/site-packages/dbt/clients/system.py", line 570, in rmtree
return shutil.rmtree(path, onerror=chmod_and_retry)
File "/Users/dbeatty/.pyenv/versions/3.10.10/lib/python3.10/shutil.py", line 737, in rmtree
onerror(os.path.islink, path, sys.exc_info())
File "/Users/dbeatty/.pyenv/versions/3.10.10/lib/python3.10/shutil.py", line 735, in rmtree
raise OSError("Cannot call rmtree on a symbolic link")
OSError: Cannot call rmtree on a symbolic link
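The failure at the bottom of this traceback can be reproduced without dbt at all. The following is a minimal sketch using only the Python standard library; the function name and the temporary directory layout are made up for illustration and simply mirror the mkdir/ln commands above.

```python
import os
import shutil
import tempfile


def rmtree_on_symlink_fails() -> bool:
    """Return True if shutil.rmtree raises OSError for a symlinked directory."""
    with tempfile.TemporaryDirectory() as tmp:
        # Mirrors: mkdir -p .cache/dbt_packages && ln -s .cache/dbt_packages dbt_packages
        target = os.path.join(tmp, ".cache", "dbt_packages")
        link = os.path.join(tmp, "dbt_packages")
        os.makedirs(target)
        os.symlink(target, link)
        try:
            # rmtree refuses to operate when the path itself is a symlink.
            shutil.rmtree(link)
        except OSError:
            return True
        return False


if __name__ == "__main__":
    print(rmtree_on_symlink_fails())
```

This only shows the standard-library behavior that dbt deps runs into; it says nothing about how the symlink gets created in a given environment.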
But it worked for me when I did the following instead:
Use .cache/dbt_packages as the packages-install-path within dbt_project.yml:
packages-install-path: .cache/dbt_packages
How is the symlink being created?
I am not sure how, or whether it is even a symlink. It's a directory provided by CodeBuild that is cached across runs. How they provision this directory is unknown (to me).
As part of the build spec, I specify the cache dir:
cache: {
paths: [
// cache dbt packages
"/root/.cache/dbt_packages",
],
},
Are you using the packages-install-path within dbt_project.yml?
Yes, set via env:
packages-install-path: "{{ env_var('DBT_PACKAGES_INSTALL_PATH', 'dbt_packages') }}"
Does the "Possible workaround" below work for you?
So you are suggesting, essentially, adding another directory layer, so that when cleaning it out, we do not try to remove the top-level link and only remove whatever is inside the dir?
I think it would work; I don't see a reason why it wouldn't. For now, my workaround was just to disable caching, but I will try yours.
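To illustrate why an extra directory layer should help, here is a small sketch, assuming the symlink sits on a parent directory and the packages directory below it is an ordinary directory. The names "local-cache" and "cache" are hypothetical stand-ins, not CodeBuild's real paths.

```python
import os
import shutil
import tempfile


def rmtree_below_symlinked_parent() -> tuple[bool, bool]:
    """Remove a real directory that lives under a symlinked parent.

    Returns (parent_is_still_a_symlink, packages_dir_still_exists).
    """
    with tempfile.TemporaryDirectory() as tmp:
        real_cache = os.path.join(tmp, "local-cache")  # hypothetical cache location
        link = os.path.join(tmp, "cache")              # hypothetical symlinked parent
        os.makedirs(real_cache)
        os.symlink(real_cache, link)

        # The packages directory is a regular directory one level below the link.
        packages = os.path.join(link, "dbt_packages")
        os.makedirs(packages)

        # Only the final path component matters to rmtree, so this succeeds.
        shutil.rmtree(packages)
        return os.path.islink(link), os.path.exists(packages)


if __name__ == "__main__":
    print(rmtree_below_symlinked_parent())
```

The parent link survives and only the nested directory is removed, which is the behavior the extra layer relies on.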
To answer the linking question (partially), this is from the CodeBuild log:
[Container] 2023/12/19 15:21:43.569893 Moving to directory /codebuild/output/src2019796648/src
[Container] 2023/12/19 15:21:43.571322 Expanded cache path /root/.cache
[Container] 2023/12/19 15:21:50.655824 MkdirAll: /codebuild/local-cache/custom/ebc372a1c9f0ee32803d1ef5dc06a690f02a9133f92ecab5a21fa9c4bf851f2b/root/.cache
[Container] 2023/12/19 15:21:50.656107 Symlinking: /root/.cache => /codebuild/local-cache/custom/ebc372a1c9f0ee32803d1ef5dc06a690f02a9133f92ecab5a21fa9c4bf851f2b/root/.cache
That still does not give us the full answer as to how it does the symlinking, but at least the directory structure is clear.
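To check from inside a build which part of a path is actually a symlink, something like the following could be run. It is only a diagnostic sketch; symlinked_components is a hypothetical helper name, not part of dbt or CodeBuild.

```python
import os
import sys
from pathlib import Path


def symlinked_components(path: str) -> list[str]:
    """Return every prefix of `path` (outermost first) that is itself a symlink."""
    p = Path(os.path.abspath(path))
    # Path.parents lists the nearest parent first, so reverse it to walk from the root down.
    prefixes = list(p.parents)[::-1] + [p]
    return [str(c) for c in prefixes if c.is_symlink()]


if __name__ == "__main__":
    # Example: python check_links.py /root/.cache/dbt_packages
    for component in symlinked_components(sys.argv[1]):
        print(f"{component} -> {os.readlink(component)}")
```

If the last entry printed is the packages path itself, shutil.rmtree will refuse it; if only a parent shows up, rmtree on the packages path should work.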
And that workaround does work, btw. Thanks! 🎉
Did the way that you implemented the workaround successfully use the installs that are cached across runs? Or did it just skip the cached portion in favor of creating a new local directory named .cache/dbt_packages?
Because CodeBuild does not guarantee caching when using the local cache, I think it's hard to tell. It's "best effort": you only hit the cache if you get lucky and get placed on the same machine ;)
One way to solve the case when the packages-install-path is a symlink is to re-use the approach from here, i.e., replace the code here with this instead:
dest_path = self.project.packages_install_path
if system.path_exists(dest_path):
    if system.path_is_symlink(dest_path):
        system.remove_file(dest_path)
    else:
        system.rmdir(dest_path)
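For readers without the dbt source at hand, the same branching can be sketched with the standard library alone. This is an illustration of the idea, not dbt's implementation; remove_install_path is a hypothetical name.

```python
import os
import shutil


def remove_install_path(dest_path: str) -> None:
    """Remove dest_path whether it is a symlink or a real directory."""
    if os.path.islink(dest_path):
        # Unlink only the link itself; the directory it points to is left untouched.
        os.unlink(dest_path)
    elif os.path.isdir(dest_path):
        # A real directory can be removed recursively as before.
        shutil.rmtree(dest_path)
```

Checking islink first matters because os.path.exists returns False for a dangling symlink, which would otherwise be left behind. Note that unlinking the link leaves the cached contents in place but detaches them from the install path.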
Is this a new bug in dbt-core?
Current Behavior
Running dbt deps in AWS CodeBuild results in this error:
On this line: https://github.com/dbt-labs/dbt-core/blame/2401600e57048dd56818f7293abed96ffd510ac9/core/dbt/task/deps.py#L231
A notable observation is that we use CodeBuild caching for dbt packages.
Expected Behavior
dbt deps should not fail.
Steps To Reproduce
1. Use a cached dbt packages directory in CodeBuild (/root/.cache/dbt_packages)
2. Run dbt deps
Relevant log output
Environment
Which database adapter are you using with dbt?
No response
Additional Context
Using cached location for dbt packages.
I've tried narrowing it down, and the issue starts happening with the 1.7.0 release.