johnf opened this issue 3 months ago
Hi @johnf thanks for reaching out to the team! Could you confirm the following to help me get a sense of what might be going on:
Hi @fivetran-catfritz,
I narrowed it down to apple_store because the query I pasted above looks like the problematic one. Right now it's been running for 10 hours. If I kill the query, the load drops back down to what I saw pre-upgrade.
I'm not sure if you have any internal stats on the transformations, but if you do I suspect you'll see the apple ones are running all the time on my account (johnf@gladlyapp.com).
I'll pause it now so we can confirm 100%
Hi @johnf I reviewed your run logs and noticed the issue you mentioned. Could you try running this package on its own to see if the problem persists?
We've seen similar hangups with Postgres in dbt projects involving multiple packages, often alongside minor updates, like adding a single text field in this case. More puzzling, I've also encountered this issue in another Postgres instance where it occurred in a *_tmp model that was simply a select *.
The issue, however, hasn't appeared consistently enough for us to determine a root cause. I suspect Postgres locks might be a factor, though that's just a theory for now. Running the dbt_apple_store package in isolation could help clarify if that's the case. If you're able to access your run logs, you can see the time each model takes. Just let me know if you have any questions!
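For example, to run just this package's models on their own, something like the following should work (assuming the package's dbt project name is apple_store):
dbt run --select package:apple_store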
@fivetran-catfritz I can confirm that if I disable apple_store, the load stays at 10% for hours. I'll try enabling only it to see what happens. Also worth noting: when that query is running, it is the only query running.
I've paused all transformations in the dashboard and then commented out all other packages and run them locally.
It's currently hanging on
22:53:15 39 of 44 START sql table model dbt_apple_store.apple_store__territory_report ... [RUN]
I suspect it will sit there for hours again.
FYI, I am happy to jump on a call if someone wants to guide me through running queries manually. This is very reproducible. Also, I'm happy for you to connect to my DB directly from your end.
I kicked it off at 9:30am
With no other queries running, it does finish eventually but takes 80 minutes.
23:29:28 39 of 44 START sql table model dbt_apple_store.apple_store__territory_report ... [RUN]
00:49:08 39 of 44 OK created sql table model dbt_apple_store.apple_store__territory_report [SELECT 130595 in 4780.70s]
00:49:08 40 of 44 START sql table model dbt_apple_store.apple_store__app_version_report . [RUN]
01:21:40 40 of 44 OK created sql table model dbt_apple_store.apple_store__app_version_report [SELECT 261984 in 1951.45s]
01:21:40 41 of 44 START sql table model dbt_apple_store.apple_store__device_report ...... [RUN]
01:22:41 41 of 44 OK created sql table model dbt_apple_store.apple_store__device_report . [SELECT 18174 in 61.07s]
01:22:41 42 of 44 START sql table model dbt_apple_store.apple_store__platform_version_report [RUN]
It isn't a super powerful instance (an Azure Burstable B1ms: 1 vCore, 2 GiB RAM, 32 GiB storage), but this seems like a long time for not that much data.
Hi @johnf Thank you for running that! 80 mins does seem a bit long for the size. Before we dive deeper, could you check one more thing? If you downgrade to the prior version of dbt_apple_store and run that independently, what is the run time like? I think it would help to have a couple of baselines.
@fivetran-catfritz OK good news, it's super fast on the old version, only 3 seconds.
21:50:16 39 of 44 START sql table model dbt_apple_store.apple_store__territory_report ... [RUN]
21:50:19 39 of 44 OK created sql table model dbt_apple_store.apple_store__territory_report [SELECT 133045 in 3.33s]
For reference, this is the diff between the queries:
--- before 2024-08-28 07:54:58.434667925 +1000
+++ after 2024-08-28 07:54:59.798667716 +1000
@@ -23,6 +23,7 @@
),
reporting_grain as (
select distinct
+ source_relation,
date_day,
app_id,
source_type,
@@ -31,6 +32,7 @@
),
joined as (
select
+ reporting_grain.source_relation,
reporting_grain.date_day,
reporting_grain.app_id,
app.app_name,
@@ -54,18 +56,22 @@
from reporting_grain
left join app
on reporting_grain.app_id = app.app_id
+ and reporting_grain.source_relation = app.source_relation
left join app_store_territory
on reporting_grain.date_day = app_store_territory.date_day
+ and reporting_grain.source_relation = app_store_territory.source_relation
and reporting_grain.app_id = app_store_territory.app_id
and reporting_grain.source_type = app_store_territory.source_type
and reporting_grain.territory = app_store_territory.territory
left join downloads_territory
on reporting_grain.date_day = downloads_territory.date_day
+ and reporting_grain.source_relation = downloads_territory.source_relation
and reporting_grain.app_id = downloads_territory.app_id
and reporting_grain.source_type = downloads_territory.source_type
and reporting_grain.territory = downloads_territory.territory
left join usage_territory
on reporting_grain.date_day = usage_territory.date_day
+ and reporting_grain.source_relation = usage_territory.source_relation
and reporting_grain.app_id = usage_territory.app_id
and reporting_grain.source_type = usage_territory.source_type
and reporting_grain.territory = usage_territory.territory
I also ran explain on both queries, but it's Greek to me :grin:
FAST
Hash Left Join (cost=31584.11..37422.28 rows=83150 width=1268)
Hash Cond: (app_store_territory_1.app_id = stg_apple_store__app.app_id)
CTE app_store_territory
-> Seq Scan on stg_apple_store__app_store_territory (cost=0.00..3122.45 rows=133045 width=70)
CTE country_codes
-> Seq Scan on apple_store_country_codes (cost=0.00..6.50 rows=250 width=71)
-> Merge Right Join (cost=28427.16..31303.04 rows=20788 width=2300)
Merge Cond: ((stg_apple_store__downloads_territory.territory = app_store_territory_1.territory) AND (stg_apple_store__downloads_territory.date_day = app_store_territory_1.date_day) AND (stg_apple_store__downloads_territory.app_id = app_store_territory_1.app_id) AND (stg_apple_store__downloads_territory.source_type = app_store_territory_1.source_type))
-> Sort (cost=17939.56..18273.07 rows=133406 width=61)
Sort Key: stg_apple_store__downloads_territory.territory, stg_apple_store__downloads_territory.date_day, stg_apple_store__downloads_territory.app_id, stg_apple_store__downloads_territory.source_type
-> Seq Scan on stg_apple_store__downloads_territory (cost=0.00..2998.06 rows=133406 width=61)
-> Materialize (cost=10487.61..11488.02 rows=20788 width=2276)
-> Merge Left Join (cost=10487.61..11436.05 rows=20788 width=2276)
Merge Cond: (app_store_territory_1.territory = (alternative_country_codes.alternative_country_name)::text)
-> Merge Left Join (cost=10472.65..11067.70 rows=16630 width=1212)
Merge Cond: (app_store_territory_1.territory = (official_country_codes.country_name)::text)
-> Merge Left Join (cost=10457.69..10770.03 rows=13304 width=148)
Merge Cond: ((app_store_territory_1.territory = stg_apple_store__usage_territory.territory) AND (app_store_territory_1.date_day = stg_apple_store__usage_territory.date_day) AND (app_store_territory_1.app_id = stg_apple_store__usage_territory.app_id) AND (app_store_territory_1.source_type = stg_apple_store__usage_territory.source_type))
-> Sort (cost=9359.64..9392.90 rows=13304 width=108)
Sort Key: app_store_territory_1.territory, app_store_territory_1.date_day, app_store_territory_1.app_id, app_store_territory_1.source_type
-> Hash Right Join (cost=4390.47..8448.34 rows=13304 width=108)
Hash Cond: ((app_store_territory.date_day = app_store_territory_1.date_day) AND (app_store_territory.app_id = app_store_territory_1.app_id) AND (app_store_territory.source_type = app_store_territory_1.source_type) AND (app_store_territory.territory = app_store_territory_1.territory))
-> CTE Scan on app_store_territory (cost=0.00..2660.90 rows=133045 width=108)
-> Hash (cost=4124.39..4124.39 rows=13304 width=76)
-> HashAggregate (cost=3991.35..4124.39 rows=13304 width=76)
Group Key: app_store_territory_1.date_day, app_store_territory_1.app_id, app_store_territory_1.source_type, app_store_territory_1.territory
-> CTE Scan on app_store_territory app_store_territory_1 (cost=0.00..2660.90 rows=133045 width=76)
-> Sort (cost=1098.06..1127.26 rows=11682 width=77)
Sort Key: stg_apple_store__usage_territory.territory, stg_apple_store__usage_territory.date_day, stg_apple_store__usage_territory.app_id, stg_apple_store__usage_territory.source_type
-> Seq Scan on stg_apple_store__usage_territory (cost=0.00..308.82 rows=11682 width=77)
-> Sort (cost=14.96..15.58 rows=250 width=1580)
Sort Key: official_country_codes.country_name
-> CTE Scan on country_codes official_country_codes (cost=0.00..5.00 rows=250 width=1580)
-> Sort (cost=14.96..15.58 rows=250 width=1580)
Sort Key: alternative_country_codes.alternative_country_name
-> CTE Scan on country_codes alternative_country_codes (cost=0.00..5.00 rows=250 width=1580)
-> Hash (cost=18.00..18.00 rows=800 width=40)
-> Seq Scan on stg_apple_store__app (cost=0.00..18.00 rows=800 width=40)
(38 rows)
SLOW
Hash Left Join (cost=32314.14..36206.23 rows=20788 width=1300)
Hash Cond: (app_store_territory_1.territory = (alternative_country_codes.alternative_country_name)::text)
CTE app_store_territory
-> Seq Scan on stg_apple_store__app_store_territory (cost=0.00..3122.45 rows=133045 width=70)
CTE country_codes
-> Seq Scan on apple_store_country_codes (cost=0.00..6.50 rows=250 width=71)
-> Hash Left Join (cost=29177.07..32300.01 rows=16630 width=1300)
Hash Cond: (app_store_territory_1.territory = (official_country_codes.country_name)::text)
-> Merge Left Join (cost=29168.94..31676.58 rows=13304 width=236)
Merge Cond: ((app_store_territory_1.app_id = stg_apple_store__usage_territory.app_id) AND (app_store_territory_1.source_relation = stg_apple_store__usage_territory.source_relation))
Join Filter: ((app_store_territory_1.date_day = stg_apple_store__usage_territory.date_day) AND (app_store_territory_1.source_type = stg_apple_store__usage_territory.source_type) AND (app_store_territory_1.territory = stg_apple_store__usage_territory.territory))
-> Merge Left Join (cost=28070.88..30356.40 rows=13304 width=196)
Merge Cond: ((app_store_territory_1.app_id = stg_apple_store__downloads_territory.app_id) AND (app_store_territory_1.source_relation = stg_apple_store__downloads_territory.source_relation))
Join Filter: ((app_store_territory_1.date_day = stg_apple_store__downloads_territory.date_day) AND (app_store_territory_1.source_type = stg_apple_store__downloads_territory.source_type) AND (app_store_territory_1.territory = stg_apple_store__downloads_territory.territory))
-> Merge Left Join (cost=10131.33..10239.77 rows=13304 width=172)
Merge Cond: ((app_store_territory_1.app_id = stg_apple_store__app.app_id) AND (app_store_territory_1.source_relation = stg_apple_store__app.source_relation))
-> Sort (cost=10074.75..10108.01 rows=13304 width=140)
Sort Key: app_store_territory_1.app_id, app_store_territory_1.source_relation
-> Hash Right Join (cost=4756.34..9163.46 rows=13304 width=140)
Hash Cond: ((app_store_territory.date_day = app_store_territory_1.date_day) AND (app_store_territory.source_relation = app_store_territory_1.source_relation) AND (app_store_territory.app_id = app_store_territory_1.app_id) AND (app_store_territory.source_type = app_store_territory_1.source_type) AND (app_store_territory.territory = app_store_territory_1.territory))
-> CTE Scan on app_store_territory (cost=0.00..2660.90 rows=133045 width=140)
-> Hash (cost=4457.00..4457.00 rows=13304 width=108)
-> HashAggregate (cost=4323.96..4457.00 rows=13304 width=108)
Group Key: app_store_territory_1.source_relation, app_store_territory_1.date_day, app_store_territory_1.app_id, app_store_territory_1.source_type, app_store_territory_1.territory
-> CTE Scan on app_store_territory app_store_territory_1 (cost=0.00..2660.90 rows=133045 width=108)
-> Sort (cost=56.58..58.58 rows=800 width=72)
Sort Key: stg_apple_store__app.app_id, stg_apple_store__app.source_relation
-> Seq Scan on stg_apple_store__app (cost=0.00..18.00 rows=800 width=72)
-> Materialize (cost=17939.56..18606.59 rows=133406 width=62)
-> Sort (cost=17939.56..18273.07 rows=133406 width=62)
Sort Key: stg_apple_store__downloads_territory.app_id, stg_apple_store__downloads_territory.source_relation
-> Seq Scan on stg_apple_store__downloads_territory (cost=0.00..2998.06 rows=133406 width=62)
-> Sort (cost=1098.06..1127.26 rows=11682 width=78)
Sort Key: stg_apple_store__usage_territory.app_id, stg_apple_store__usage_territory.source_relation
-> Seq Scan on stg_apple_store__usage_territory (cost=0.00..308.82 rows=11682 width=78)
-> Hash (cost=5.00..5.00 rows=250 width=1580)
-> CTE Scan on country_codes official_country_codes (cost=0.00..5.00 rows=250 width=1580)
-> Hash (cost=5.00..5.00 rows=250 width=1580)
-> CTE Scan on country_codes alternative_country_codes (cost=0.00..5.00 rows=250 width=1580)
(39 rows)
@johnf Oh wow, OK, something is going on there. Thank you for confirming that. The only change from v0.3.0 to v0.4.0 is to allow users to union schemas created by multiple connectors together, which is what the source_relation field is used for. Since you don't need it, I would downgrade in the meantime. For now you can install the below app_reporting version range and remove the references to the app_store and google_play packages, and it will automatically install the correct downgraded versions.
- package: fivetran/app_reporting
  version: [">=0.3.0", "<0.4.0"]
As for the proper fix, let me talk to my team and get some suggestions on what to explore next. Thanks again for your help with this!
Hi @johnf The next thing we'd like to do is try to narrow down what aspect of the updates is causing the hangup. I have created a test branch that removes the source_relation join conditions added in v0.4.0 while keeping the other updates the same. Could you run this test branch on its own and see if you are still getting the same hangups? You can install this branch by using the below code in place of the apple_store code in your packages.yml.
- git: https://github.com/fivetran/dbt_apple_store.git
  revision: test/remove-source-relation-joins
  warn-unpinned: false
Let me know if you have any questions!
Hi @fivetran-catfritz, the branch works fine. It took about 4 seconds for the bad query.
Very interesting. Thank you @johnf for running the test! It confirms there is definitely something going on with those joins. We'll look into this some more and keep you posted.
Hey @johnf 👋
We've done some investigating here and believe that, for Postgres, the join conditions which result in joining on empty strings ('') could have some adverse effects, which is what you're seeing.
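To illustrate what I mean (not the exact package code): when the union feature isn't used, source_relation is populated with the same constant empty string in every table, so the added join condition effectively compares '' to '' on every candidate row, for example:
left join app
    on reporting_grain.app_id = app.app_id
    and reporting_grain.source_relation = app.source_relation -- both sides are the constant '' when the union feature is off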
We have a potential solution which we've applied in the following package version:
packages:
  - git: https://github.com/fivetran/dbt_apple_store.git
    revision: feature/union-data-performance-enhancement
    warn-unpinned: false
You can see the changes coming from this branch here from the source package and here from the transform.
Please note that if you end up testing this out, you'll probably want to turn off your app reporting dependency since this will cause a dependency issue. Let me know if you have any problems.
If you want, you can install the above package version and run the following command, since I only made the relevant changes to the apple_store__territory_report upstream models.
dbt run -s +apple_store__territory_report
Let me know if this helps the performance!
@fivetran-joemarkiewicz I just tested this and I think it has the same problem.
The query that ran is below; it's still running after 10 minutes.
FYI, if it's possible on your end, I'm happy for you to connect straight to the DB to test. I can give you SSH access to the proxy server where I'm testing if that helps.
/* {"app": "dbt", "dbt_version": "1.7.17", "profile_name": "gladly", "target_name": "prod", "node_id": "model.apple_store.apple_store__territory_report"} */

create table "dwh_prod"."dbt_apple_store"."apple_store__territory_report__dbt_tmp"
as
(
with app as (

    select *
    from "dwh_prod"."dbt_apple_store_stg"."stg_apple_store__app"
),

app_store_territory as (

    select *
    from "dwh_prod"."dbt_apple_store_stg"."stg_apple_store__app_store_territory"
),

country_codes as (

    select *
    from "dwh_prod"."dbt_apple_store_source"."apple_store_country_codes"
),

downloads_territory as (

    select *
    from "dwh_prod"."dbt_apple_store_stg"."stg_apple_store__downloads_territory"
),

usage_territory as (

    select *
    from "dwh_prod"."dbt_apple_store_stg"."stg_apple_store__usage_territory"
),

reporting_grain as (

    select distinct
        source_relation,
        date_day,
        app_id,
        source_type,
        territory
    from app_store_territory
),

joined as (

    select
        reporting_grain.source_relation,
        reporting_grain.date_day,
        reporting_grain.app_id,
        app.app_name,
        reporting_grain.source_type,
        reporting_grain.territory as territory_long,
        coalesce(official_country_codes.country_code_alpha_2, alternative_country_codes.country_code_alpha_2) as territory_short,
        coalesce(official_country_codes.region, alternative_country_codes.region) as region,
        coalesce(official_country_codes.sub_region, alternative_country_codes.sub_region) as sub_region,
        coalesce(app_store_territory.impressions, 0) as impressions,
        coalesce(app_store_territory.impressions_unique_device, 0) as impressions_unique_device,
        coalesce(app_store_territory.page_views, 0) as page_views,
        coalesce(app_store_territory.page_views_unique_device, 0) as page_views_unique_device,
        coalesce(downloads_territory.first_time_downloads, 0) as first_time_downloads,
        coalesce(downloads_territory.redownloads, 0) as redownloads,
        coalesce(downloads_territory.total_downloads, 0) as total_downloads,
        coalesce(usage_territory.active_devices, 0) as active_devices,
        coalesce(usage_territory.active_devices_last_30_days, 0) as active_devices_last_30_days,
        coalesce(usage_territory.deletions, 0) as deletions,
        coalesce(usage_territory.installations, 0) as installations,
        coalesce(usage_territory.sessions, 0) as sessions
    from reporting_grain
    left join app
        on reporting_grain.app_id = app.app_id
        and reporting_grain.source_relation = app.source_relation
    left join app_store_territory
        on reporting_grain.date_day = app_store_territory.date_day
        and reporting_grain.source_relation = app_store_territory.source_relation
        and reporting_grain.app_id = app_store_territory.app_id
        and reporting_grain.source_type = app_store_territory.source_type
        and reporting_grain.territory = app_store_territory.territory
    left join downloads_territory
        on reporting_grain.date_day = downloads_territory.date_day
        and reporting_grain.source_relation = downloads_territory.source_relation
        and reporting_grain.app_id = downloads_territory.app_id
        and reporting_grain.source_type = downloads_territory.source_type
        and reporting_grain.territory = downloads_territory.territory
    left join usage_territory
        on reporting_grain.date_day = usage_territory.date_day
        and reporting_grain.source_relation = usage_territory.source_relation
        and reporting_grain.app_id = usage_territory.app_id
        and reporting_grain.source_type = usage_territory.source_type
        and reporting_grain.territory = usage_territory.territory
    left join country_codes as official_country_codes
        on reporting_grain.territory = official_country_codes.country_name
    left join country_codes as alternative_country_codes
        on reporting_grain.territory = alternative_country_codes.alternative_country_name
)

select *
from joined
);
Hey @johnf thanks for sharing and sorry that didn't seem to do the trick.
The above was actually my last-ditch effort to see if we could avoid the issues we are seeing with constant expressions causing the Postgres query planner to explode and take significantly longer than necessary. Since that didn't work, we are going to move forward with the approach where we remove the join condition only if the union schema feature is not being used. As @fivetran-catfritz shared in the branch above, we can confirm that removing the join condition when it is not necessary addresses the query performance issue.
We are going to move forward with making this update to the Apple Store and Google Play models. What we envision this update to be is something along the lines of the following whenever we join on source_relation:
from reporting_grain
left join app
    on reporting_grain.app_id = app.app_id
    {{ "and reporting_grain.source_relation = app.source_relation" if var('apple_store_union_schemas',[]) or var('apple_store_union_databases',[]) }}
This way we exclude the join condition if the defined variables are empty (i.e. the union data feature is not being used). Ideally, we can consolidate this into a single macro that can be referenced in each model so the code is DRYer, but this is what we will likely apply to these models and others going forward.
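For illustration, a rough sketch of what such a macro could look like (the macro name and exact variable handling here are hypothetical, not the final implementation):
{% macro apple_store_source_relation_join(left_alias, right_alias) %}
    {# Only emit the extra join condition when the union data feature is enabled #}
    {% if var('apple_store_union_schemas', []) or var('apple_store_union_databases', []) %}
        and {{ left_alias }}.source_relation = {{ right_alias }}.source_relation
    {% endif %}
{% endmacro %}
Each model would then call it inside its joins, e.g. {{ apple_store_source_relation_join('reporting_grain', 'app') }}.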
I see you originally mentioned you were open to contributing a PR to address this issue. Let me know if that is still of interest to you and I'd be happy to help facilitate the updates. Otherwise, we will add this to our upcoming sprint to address. Thanks!
@fivetran-joemarkiewicz sounds great. Let me know if you need any more testing
Hi @johnf, I wanted to post back because we have been able to spend some time thinking through this issue/solution and we don't feel the solution initially proposed (removing the join condition for non union_data environments) is scalable. This feature is applied to a large number of our existing dbt packages, and making this config change to every join is not something which will scale appropriately.
We have a few other ideas:
- Updating the source_relation field to be a composite key to ensure it isn't a constant expression and therefore causing performance issues within the query planner.
- … the source_relation field as a constant expression.
Unfortunately, our test Postgres instance doesn't represent a live environment, and it's difficult to validate whether the changes we are testing will properly address the issue. I recall you were open to sharing your Postgres instance to help with this bug fix. Before then, would you by chance be available for a 30-minute video call to explore this further and allow us to test our assumptions with you? Let me know if you would be interested. Thanks!
@fivetran-joemarkiewicz Sure happy to help! I just connected with you on LinkedIn so we can organise a time
Thanks so much for meeting with me yesterday @johnf to troubleshoot this issue!
Following our meeting I believe we came to an understanding of where the issue was truly originating. Additionally, I feel I found a viable and appropriate solution. If you would like, you can attempt to run the troublesome model using the below package version.
- git: https://github.com/fivetran/dbt_apple_store.git
  revision: bugfix/postgres-performance-updates
  warn-unpinned: false
A short summary of what the issue was and why I believe the above changes will solve the issue:
- The app source table only had one record (and therefore only 1 app_id). This resulted in this join being essentially unnecessary, since both source_relation and app_id in your case are constant expressions (the same value for all records). This was not an issue before (pre source_relation) because Postgres is able to handle one constant expression in a join. However, once we add a second one, it seems to become overloaded.
- Ideally, we would just not include the joins at all in your scenario. However, for other customers the app_id and source_relation fields are not constant expressions and therefore do need to be included. As a result, I believe the above approach is a viable, appropriate, and scalable solution which will be relevant to all users of this package. Additionally, this is a great learning for how we should be structuring our intermediate models when dealing with possible constant expression joins in other packages.
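As a rough illustration of the kind of restructuring described (a sketch only; the actual models in the branch may differ), the reporting grain is materialized as its own intermediate table, so the end model joins against a real table with statistics rather than a select distinct CTE of constant expressions:
-- e.g. models/intermediate/int_apple_store__territory_report_grain.sql (illustrative)
{{ config(materialized='table') }}

select distinct
    source_relation,
    date_day,
    app_id,
    source_type,
    territory
from {{ ref('stg_apple_store__app_store_territory') }}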
I was able to reproduce the error you are seeing and also see this branch resolve it. On my end, the territory report went from never finishing to completing in just 10 seconds (which seems similar to what you were experiencing in the past). I made similar updates to all the other end models, so you should see this issue resolved across the package. Please let me know if this does the trick! If not, we can explore other options to address this performance issue.
Thanks again!
@fivetran-joemarkiewicz Unfortunately this didn't help.
But I discovered something interesting. If I run the dbt command it hangs. If I grab the exact same query from --debug and run it manually from psql it finishes in < 30 seconds.
Does dbt do anything special when it's talking to Postgres?
I'm sorry to hear @johnf this didn't resolve your issue 😞
Can you confirm that, when running the new package version, you see new tables materialized in your destination? The new version should have created these new tables, which are intended to help with the query performance. If those tables weren't created, then there may be something in the config that needs to be adjusted so they can materialize and you can benefit from the performance improvement.
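For example, one illustrative way to check from psql (adjust the schema name to match your target schema):
select table_name
from information_schema.tables
where table_schema = 'dbt_apple_store'
  and table_name like 'int_%';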
I don't believe dbt does anything on the backend other than what you see in the target/run folder 🤔. I did encounter an issue when testing where, when I ran the original broken version right before running the new version, it slowed down my dbt job execution time. I essentially needed to let the CPU usage level out before attempting the newer version, and then it would work without issue on each subsequent run of the new version. I'm not sure if this is the same sequence you took, but I wanted to share in case it was.
@johnf just wanted to bump my previous message.
@fivetran-joemarkiewicz Thanks for the reminder. I just cleared out dbt_packages and retried, and it works!!
Maybe it was a caching issue in the packages or psql last time?
Thanks for making it happen.
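For reference, clearing and reinstalling the packages was roughly:
rm -rf dbt_packages   # clear out the installed packages
dbt deps              # reinstall packages fresh from packages.yml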
21:49:47 Found 60 models, 1 seed, 20 tests, 22 sources, 0 exposures, 0 metrics, 650 macros, 0 groups, 0 semantic models
21:49:47
21:49:48 Concurrency: 1 threads (target='prod')
21:49:48
21:49:48 1 of 10 START sql view model dbt_apple_store_stg.stg_apple_store__app_store_territory_tmp [RUN]
21:49:48 1 of 10 OK created sql view model dbt_apple_store_stg.stg_apple_store__app_store_territory_tmp [CREATE VIEW in 0.20s]
21:49:48 2 of 10 START sql view model dbt_apple_store_stg.stg_apple_store__app_tmp ...... [RUN]
21:49:48 2 of 10 OK created sql view model dbt_apple_store_stg.stg_apple_store__app_tmp . [CREATE VIEW in 0.07s]
21:49:48 3 of 10 START sql view model dbt_apple_store_stg.stg_apple_store__downloads_territory_tmp [RUN]
21:49:48 3 of 10 OK created sql view model dbt_apple_store_stg.stg_apple_store__downloads_territory_tmp [CREATE VIEW in 0.07s]
21:49:48 4 of 10 START sql view model dbt_apple_store_stg.stg_apple_store__usage_territory_tmp [RUN]
21:49:48 4 of 10 OK created sql view model dbt_apple_store_stg.stg_apple_store__usage_territory_tmp [CREATE VIEW in 0.07s]
21:49:48 5 of 10 START sql table model dbt_apple_store_stg.stg_apple_store__app_store_territory [RUN]
21:49:50 5 of 10 OK created sql table model dbt_apple_store_stg.stg_apple_store__app_store_territory [SELECT 196745 in 1.41s]
21:49:50 6 of 10 START sql table model dbt_apple_store_stg.stg_apple_store__app ......... [RUN]
21:49:50 6 of 10 OK created sql table model dbt_apple_store_stg.stg_apple_store__app .... [SELECT 1 in 0.10s]
21:49:50 7 of 10 START sql table model dbt_apple_store_stg.stg_apple_store__downloads_territory [RUN]
21:49:52 7 of 10 OK created sql table model dbt_apple_store_stg.stg_apple_store__downloads_territory [SELECT 197106 in 1.84s]
21:49:52 8 of 10 START sql table model dbt_apple_store_stg.stg_apple_store__usage_territory [RUN]
21:49:52 8 of 10 OK created sql table model dbt_apple_store_stg.stg_apple_store__usage_territory [SELECT 12731 in 0.14s]
21:49:52 9 of 10 START sql table model dbt_apple_store.int_apple_store__territory_report_grain [RUN]
21:49:53 9 of 10 OK created sql table model dbt_apple_store.int_apple_store__territory_report_grain [SELECT 196745 in 1.02s]
21:49:53 10 of 10 START sql table model dbt_apple_store.apple_store__territory_report ... [RUN]
21:50:02 10 of 10 OK created sql table model dbt_apple_store.apple_store__territory_report [SELECT 196745 in 9.52s]
21:50:02
21:50:02 Finished running 4 view models, 6 table models in 0 hours 0 minutes and 14.95 seconds (14.95s).
21:50:02
21:50:02 Completed successfully
21:50:02
21:50:02 Done. PASS=10 WARN=0 ERROR=0 SKIP=0 TOTAL=10
That's great news and thank you for confirming! We will be moving forward with this change in the current sprint.
Hi @johnf thanks again for all your help with this issue. However, last week there was a change to the Apple Store Fivetran connector which is resulting in the majority of the reports this dbt package references no longer syncing up-to-date data.
The connector now has a new version which you can switch to in order to obtain the same information. However, this new connector uses a different schema from the one this package is built off, so we will need to modify this package to support the new schema. You can see more details within Issue #28, but you should also have received an email last Friday detailing the changes.
That being said, we will prioritize updating the dbt package to support the new schema. Fortunately, we have great learnings from this thread which we will apply to the updates in the coming weeks. I'll be sure to keep you in the loop once the updates are applied so you can confirm the performance gains we achieved in the previously shared branch are retained in the coming update.
Thank you and we really appreciate your patience.
Is there an existing issue for this?
Describe the issue
Around the 2nd of August I upgraded everything (all packages and dbt), but in particular this package from 0.3.0 to 0.4.0.
Since then the load on my small PostgreSQL instance has gone from averaging 11% to 90%.
What I see is the query below running for hours.
The only tables in this query with significant rows are the territory tables, but they are only circa 124,000 rows.
I'm running PostgreSQL in Azure. I'm no DB expert, so I haven't really been able to dig into this.
Happy to debug and help as needed or be told I'm doing it wrong :)
Relevant error log or model output
Expected behavior
Query completes within a reasonable time
dbt Project configurations
Package versions
What database are you using dbt with?
postgres
dbt Version
1.7.17 (Note: I'm also using dbt transformations; same issue in both places)
Additional Context
No response
Are you willing to open a PR to help address this issue?