Closed rick1290 closed 5 months ago
Hi @rick1290 thank you for opening this issue. We will review this in the next day or so and let you know if we have any questions.
Thanks! @fivetran-catfritz Will stay posted :)
Hi @rick1290 I am reviewing this and have a couple questions. Could you confirm if you have the source table USER_UNSUBSCRIBED_CHANNEL
with the column _fivetran_id
? There was a schema change on August 10, 2023, so it would be helpful to know if you're on the old schema or the new schema.
Also I see that you are on v0.10.0 of the package. I don't think upgrading to the current version v0.11.0 will solve this particular issue, but I'm also curious if you have tried it out/had issues with it?
@fivetran-catfritz I'm on the old schema
Agreed - don't think upgrading as well will help. I've been upgrading over time and havn't seen any changes. I can try when I get some time next week.
Hi there @rick1290 -- this latest_batch_index
window function is likely the culprit, as it will return the latest unsubscribed timestamp for each user across all channels/message types, and we use it to filter out records here.
I believe adding channel_id
or message_type_id
to the partition clause of the window functions would resolve the issue, mostly. This would then return the latest unsubscription timestamp per user per channel/message type. I am wondering if it makes sense to return the latest timestamp (in case a user has unsubscribed, re-subscribed, and unsubscribed again from the same channel) or instead return the first unsubscription timestamp?
As for your later point
Its also mixing subscriptions to lists vs. channels/messagetypes when it comes to dates. Makes this table difficult to work with
Would you mind clarifying a bit here?
@fivetran-jamie Great!
latest unsubscription timestamp per user per channel/message type
The above is the logic we want for this.
In regards to clarifying - we just want to make sure that the timestamps are for channel/message types - and not when a user drops from a list.
Really glad you are working on a fix for this!
Gotcha cool! Would you be willing to test out a working branch when we have a fix drafted?
Yep :)
@fivetran-jamie Any update or progress on this issue :)
Thanks1
Hi @rick1290 ! I've currently taken on the work on this bug from @fivetran-jamie and am hoping to have a fix ready by the end of this sprint.
I've deployed the proposed changes in the dbt_iterable_source
package that should bring in the latest unsubscription behavior for an email based on channel_id
and message_type_id
, and should bring in the latest unsubscribes on both the message type and channel level.
You can test the changes before we go ahead and deploy them alive. Would you be able to update the packages.yml
file in your dbt_iterable
package to:
packages:
- git: https://github.com/fivetran/dbt_iterable_source.git
revision: bugfix/catch-all-recent-unsubscribes
warn-unpinned: false
Then run your package and see if everything looks as proposed? This will provide the final level of validation needed for us to merge this into a new release!
Thanks for your patience.
Hi @rick1290 ! This new version of dbt_iterable_source
is now live in our latest v0.8.1 release with the changes that should bring in all the latest unsubscribes in the iterable__user_unsubscriptions
model. You can run dbt deps
to pick up the latest version of the package.
Let us know if you experience any issues here!
@fivetran-avinash Thanks! Do I need to run a full refresh or anything?
@fivetran-avinash
I get the following errors when I run a ful-refresh. Mind you - our data is from 2021 - so we are on legacy schema. Have the new updates been tested on the original schema? I read the breaking changes - but not sure it should be causing errors.
Thanks!!
02:45:43 Database Error in model int_iterable__latest_user (models/intermediate/int_iterable__latest_user.sql)
Unrecognized name: unique_user_key at [13:38]
compiled Code at target/run/iterable/models/intermediate/int_iterable__latest_user.sql
02:45:43
02:45:43
02:45:43 Database Error in model int_iterable__list_user_history (models/intermediate/int_iterable__list_user_history.sql)
Unrecognized name: unique_user_key at [23:47]
compiled Code at target/run/iterable/models/intermediate/int_iterable__list_user_history.sql
02:45:43
02:45:43
02:45:43 Runtime Error in model iterable__user_unsubscriptions (models/iterable__user_unsubscriptions.sql)
404 Not found: Table tsn-marketing-operations:dbt_models_stg_iterable.stg_iterable__user_unsubscribed_channel was not found in location US; reason: notFound, message: Not found: Table tsn-marketing-operations:dbt_models_stg_iterable.stg_iterable__user_unsubscribed_channel was not found in location US
Just reverted back to v10 and its working. Would love to see how I can get v11 working with these awesome new changes :)
Morning @rick1290 ! Are you able to see what version of fivetran/dbt_iterable_source
the v0.11.1
version of dbt_iterable
is picking up when you run dbt deps
? I'm wondering if it's looking at an older version of the package without unique_user_key
.
Morning @fivetran-avinash Please see the below:
15:52:29 Running with dbt=1.6.14
15:52:30 Installing fivetran/iterable
15:52:31 Installed from version 0.11.0
15:52:31 Up to date!
Hey @rick1290, @fivetran-avinash is OOO for the rest of the week, so I'll jump in to help for the time being.
It looks like the only contents we see there are for the fivetran/iterable
dbt package, but not the fivetran/iterable_source
package (where Avinash's changes were made). 🤔
Would you be able to share the contents of your packages.yml
?
@fivetran-joemarkiewicz See below.
Hey @rick1290 it looks like you shared the contents of the dbt_project.yml
. Would you be able to share the packages.yml
contents? Additionally, do you by chance have a package-lock.yml
in your project? You are correct that the source package should be installed automatically, but it doesn't seem to be the case as we don't see the source installation in your above response.
@fivetran-joemarkiewicz Oops.
`packages:
package: fivetran/iterable version: 0.11.0
package: fivetran/ad_reporting version: [">=1.6.0", "<1.7.0"] # we recommend using ranges to capture non-breaking changes automatically
package: fivetran/social_media_reporting version: 0.2.0
package: fivetran/youtube_analytics version: 0.3.0
package: narratorai/activity_schema version: 0.1.0
package: dbt-labs/audit_helper version: 0.9.0
package: mjirv/dbt_product_analytics version: [">=0.1.0"]
package: fivetran/amplitude version: [">=0.4.0", "<0.5.0"] # we recommend using ranges to capture non-breaking changes automatically`
Thanks @rick1290, and do you by chance see a package-lock.yml
file in your project?
@fivetran-joemarkiewicz No lock.yml in the project
Any other ideas?
Thanks @rick1290, and just to confirm you do not see fivetran/iterable_source
anywhere in your dbt deps log?
Also, when you full refreshed, did you run dbt run -s +iterable --full-refresh
? It doesn't seem like the upstream changes were captured in the run where you see the error. I can mainly see this due to the error message that stg_iterable__user_unsubscribed_channel
does not exist. This is a renamed model from the v0.8.0 dbt_iterable_source release. On a full refresh (while also running the upstream models) this error shouldn't occur.
One last attempt here would be to try the following steps:
dbt clean
dbt deps
(confirm iterable_source v0.8.1 is installed)dbt run -s +iterable --full-refresh
Ideally after these steps you should see success. Let me know if this works!
@fivetran-avinash @fivetran-joemarkiewicz
The above worked - thanks!
That said - take a look at the following screenshot. I'm still seeing the same original bug on the new tables. The updated_at date is being attached to ALL channels and messages. The updated_at date should be the date the user unsubscribed from the channel or message_type. Should be a unique date_time for each channel if they were unsubscribed individually.
Below is an example of a user actually 'subscribing' to the channel/type and not unsubscribing on 5-10-24 but the channel is included in the unsubscribe table along with the date subscribed.
I'm wondering if we need another column in here - that is boolean - that lets us know if they are current unsubscribed from the message_type_name? Any other ideas?
An example of a user who unsubscribed 2-27-24 then resubscribed on 4-10-24. The 6-05-24 was attached to no subscription action.
I'm thinking subscribe may be getting mixed in here somehow - and something else I can't figure out for the 6-05 date stamp.
Hi @rick1290 , I'm wondering if your run is still utilizing the old version of the iterable_source
package. If this data you provided is all reflective of one user, the new partitions should be creating unique indices based off of the email/channel_id
and email/message_type_id
mode and taking those unique updated_at
times.
Are you able to view the compiled code on the new tables? It should be in your target/compiled folder. Can you check and see if the new compiled code for stg_iterable__user_unsub_message_type
and stg_iterable__user_unsubscribed_channel
matches the new partition logic on the v0.8.1 version?
Below is an example of a user actually 'subscribing' to the channel/type and not unsubscribing on 5-10-24 but the channel is included in the unsubscribe table along with the date subscribed.
I'm wondering if we need another column in here - that is boolean - that lets us know if they are current unsubscribed from the message_type_name? Any other ideas?
Interesting findings @rick1290 ! These last two messages feels like a separate issue though--why is any subscription data appearing in the unsubscribe tables? That feels like a connector issue. This is worth filing a support ticket with our Fivetran team to see why subscription data is appearing at all.
@fivetran-avinash
Hi @rick1290 , I'm wondering if your run is still utilizing the old version of the
iterable_source
package. If this data you provided is all reflective of one user, the new partitions should be creating unique indices based off of theemail/channel_id
andemail/message_type_id
mode and taking those uniqueupdated_at
times.Are you able to view the compiled code on the new tables? It should be in your target/compiled folder. Can you check and see if the new compiled code for
stg_iterable__user_unsub_message_type
andstg_iterable__user_unsubscribed_channel
matches the new partition logic on the v0.8.1 version?
It matches - confirmed.
Below is an example of a user actually 'subscribing' to the channel/type and not unsubscribing on 5-10-24 but the channel is included in the unsubscribe table along with the date subscribed.
I'm wondering if we need another column in here - that is boolean - that lets us know if they are current unsubscribed from the message_type_name? Any other ideas?
Interesting findings @rick1290 ! These last two messages feels like a separate issue though--why is any subscription data appearing in the unsubscribe tables? That feels like a connector issue. This is worth filing a support ticket with our Fivetran team to see why subscription data is appearing at all.
I don't believe this is a sync issue as I initially filed a ticket which led me to filing the issue here as I believe this is tied to the underlying data models being run in this package. When I look at custom events in the iterable_event table - all those unsubscribe and subscribe events are correct. For some reason though - the subscribe event date/time is being added to the updated_at field in this table.
Happy to hop on a quick call to discuss this more :) I'm going to be thrilled once this use-case can be solved by the package.
I'm now spot checking the user_unsubscribed_channel_history table... could be a sync issue....
These dates don't make any sense :( :(
Really unsure how this data is gathered - i'm thinking it might make more sense to use the iterable_event table --- and utilize those unsubscribe and subscribe events to identify the dates - and if they are activiely unsubscribed.
I was thinking maybe the new connector could solve this - but i'm not seeing any date stamps on the unsubscribe tables. Filed a ticket in the meantime - thanks for the help on this!
Hi @rick1290, thanks for the thorough investigation! I believe this is still a Fivetran connector issue. The updated_at
value is not properly processing the unsubscribe time events in the way we would expect. That is not something our data models can control for, as we expect the updated_at
value to be consistent for the time of an unsubscribe event.
So to summarize,
event
table is properly bringing in subscribe and unsubscribe events at the time they occur.user_unsubscribed_message_type_history
and user_unsubscribed_channel_history
tables are not getting the proper times from the updated_at
value. Some examples of incorrect updated_at
behavior in these tables in the examples you've shown above:
updated_at
value for an email regardless of channel or message type (Maybe it's taking the most recent event? unsure of the behavior)I'm seeing you created an additional ticket. I would relay all this information in the last few comments you provided in this thread to Fivetran support, particularly the screenshots regarding the edge cases you see. That way they can investigate what's happening in each case thoroughly. Hopefully, it's a bug, and this can be resolved internally.
If the behavior of the connector has changed for the updated_at
values in these history models, then we can schedule a call, reopen this ticket, and take next steps.
Let me know if this all makes sense!
UPDATE: Just saw your latest attempt to try the new connector. Unfortunately the new connector tables are no longer history tables, and will just have the most recent unsubscribe value. I don't think a date value comes with that.
I see - thanks everything.. .very much appreciated!!!
@fivetran-avinash
Still no progress with support on this request unfortunately.
I thought I'd note one thing. I think it would be quite awesome if we can take event_name = 'emailUnSubscribe' and use that event to build an unsubscribe table and then compares it to the user table. Somehow I think this is the only way to get accurate unsubscribe data as the current methods are not accurate.
Hi @rick1290 , good investigation on your side! It'd be interesting to see if we could incorporate this idea to create a more sustainable unsubscription model.
Would you be able to share your data of how using the event_name= emailUnSubscribe
would be able to fill the same role for rebuilding the iterable__user_unsubscriptions
model for your purpose, particularly with the additional grains on channel and message type? Also, is the proper timestamp coming through in all cases in your data?
(You can do it in the support ticket too if it's PII, but just let me know here so I can take a look!)
@fivetran-avinash Yep - this would be awesome to see.
event_name= emailUnSubscribe and emailSubscribe match our data perfectly - so it makes sense to use.
Essentially, you'd use the iterable__event table - to get emailUnSubscribe and emailSubscribe data/events.
UNEST channel_id's and message_types in additional_properties ---> which would create a new row for each email, id + unsubscribe event.
Write a rule to check if emailUnSubscribe date > emailSubscribe date for each email, id combo. This will provide accurate data on IF the user is current unsubscribed or subscribes to a channel or message type.
You can then join this back on the channel group - and find out if they are unsubscribed channel wide.
Happy to do a workshop and discuss with you more. I've gone down a rabbit hole on this and build something for this today --- pretty raw tho - but it works.
@fivetran-avinash Were you able to see the query that was created by the support team in my ticket? If you have time - would love to get your thoughts and see how it can be reflected in the dbt package to make things accurate.
The query they provides works - but i'm still not sure its the best approach - but want to see what you think. Thanks!!
Is there an existing issue for this?
Describe the issue
Note on iterable dbt package.
iterable__user_unsubscriptions:
updated_at Timestamp of when the user unsubscribed from the channel or message type
The above definition for that column is wrong in that table - it seems that the update_at - is the last date of any unsubscription - not the actual date of the unsubscription. So if I unsubsubcribed from X 2 years ago - and the I unsubscribe from Y yesterday. All those rows will be updated to the yesterday date. Its also mixing subscriptions to lists vs. channels/messagetypes when it comes to dates. Makes this table difficult to work with.
https://fivetran.github.io/dbt_iterable/#!/model/model.iterable.iterable__user_unsubscriptions
Relevant error log or model output
Expected behavior
The
updated_date
should be the date of the actual unsubscription datedbt Project configurations
Default package
Package versions
What database are you using dbt with?
bigquery
dbt Version
Most recent
Additional Context
I love these iterable packages - keep them coming :)
These queries are difficult to write so these come in handy. I'd love to see this table fixed tho - this one is quite important to us:)
Are you willing to open a PR to help address this issue?