Open leo-schick opened 11 months ago
Hey @leo-schick! We're currently working on an effort to improve the performance of dbt deps, namely providing a way to only install the changed/new packages on a subsequent dbt deps. Relevant issue here.
What you're proposing, however, would improve the performance of the initial dbt deps. I don't think this piece will be a high priority for us currently, but definitely an enhancement we could tackle in the future!
Hey @graciegoheen, this is great to hear! I think #6643 will be of great help on local installations. However, in environments where instances are rebuilt on every job run (e.g. in Databricks), I think this ticket is a possible way to increase the speed.
I do not have deep knowledge of the dbt code base. Maybe it is possible to run dbt deps inside the same class that runs the models in parallel. This would help to reduce duplicated code.
Is this your first time submitting a feature request?
Describe the feature
Running dbt deps is quite slow in my huge project. It runs for 1.73 minutes. In total I import 18 packages: 2 packages from the dbt hub and 16 from git repositories on GitHub using the git notation git: "git@github.com:user/repo.git". I think the speed of this process should be improved, for example by retrieving the repositories in parallel instead of in a single thread.
Describe alternatives you've considered
No response
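To illustrate the parallel retrieval suggested under "Describe the feature", here is a minimal Python sketch that clones several git repositories concurrently with a thread pool. It is not how dbt deps is implemented. The function names clone_repo and fetch_all, the dbt_packages target directory, and the shallow clone are all assumptions made for illustration.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Dict, Iterable


def clone_repo(url: str, target_root: Path = Path("dbt_packages")) -> Path:
    """Shallow-clone one repository (hypothetical helper, not dbt code)."""
    # Derive a folder name from the URL, e.g. ".../repo.git" -> "repo".
    name = url.rstrip("/").rsplit("/", 1)[-1]
    if name.endswith(".git"):
        name = name[:-4]
    dest = target_root / name
    subprocess.run(["git", "clone", "--depth", "1", url, str(dest)], check=True)
    return dest


def fetch_all(
    urls: Iterable[str],
    fetch: Callable[[str], object] = clone_repo,
    max_workers: int = 8,
) -> Dict[str, object]:
    """Run `fetch` for every URL on a thread pool and map URL -> result."""
    url_list = list(urls)
    if not url_list:
        return {}
    # Cloning is dominated by network and subprocess waits, so threads
    # overlap well even with the GIL.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order and re-raises the first failure.
        results = list(pool.map(fetch, url_list))
    return dict(zip(url_list, results))


# Example call (performs real network access, so it is left commented out):
# fetch_all(["git@github.com:user/repo.git"])
```

With 16 git packages, a pool like this would let the clones overlap instead of running one after another. The actual speed-up would depend on network and host limits.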
Who will this benefit?
Everybody who imports more than one package.
Are you interested in contributing this feature?
No response
Anything else?
No response