fivetran / dbt_stripe_source

Fivetran's Stripe source dbt package
https://fivetran.github.io/dbt_stripe_source/
Apache License 2.0
8 stars 29 forks source link

Feedback needed #15

Closed dimoschi closed 3 years ago

dimoschi commented 3 years ago

https://github.com/fivetran/dbt_stripe_source/blob/ab460557d6d1288f66d5fbe7159fddebb9db0b28/models/stg_stripe__invoice_line_item.sql#L48

Is is possible to provide some feedback on this staging model and the exclusion of invoice line items starting with sub_? Trying to find out in Stripe's documentation, I wasn't able to find something relevant. Also according to Stripe:

There are actually 3 types of ID that can appear in the invoice_line_item. What type of ID appears depends on where the item has been added.

sub - The ID associated with a subscription on a customer object. si - The ID associated with a subscription item. Typically used for multiplan subscriptions, where the plan becomes the subscription item. sli_ - The ID associated with a subscription line item. It essentially defines a subscription item on an invoice.

So, it seems that invoice line items starting with sub_ are related to subscriptions, so in stripe__subscription_line_items model of dbt_stripe those are excluded even thought it seems like they shouldn't.

fivetran-joemarkiewicz commented 3 years ago

Hi @dimoschi this is a great question! Would you be able to provide the link to the quoted Stripe documentation around the invoice_line_item id types? I seem to be unable to find this.

Our original understanding of the invoice line items with sub_ ids are temporary and are then replaced with ids starting with sli. Does this seem to not be the case with your data?

dimoschi commented 3 years ago

@fivetran-joemarkiewicz the quote above is from an email exchange with Stripe asking for more information on invoice_line_items and I'm currently waiting for their response with more information and possibly for a documentation link, as I wasn't able to find anything too.

From my experience with our data, it seems that sub_ is not always replaced with sli_. Until we can find more rigid information, I have forked your repo and removed the where statement, as it leads to incorrect conclusions.

Gonna keep you updated!

fivetran-joemarkiewicz commented 3 years ago

Thanks for this additional context @dimoschi! I am glad I am not alone in struggling to find this documentation. I am interested to hear what they provide to you, and then we can better document/update this package once obtaining more details from Stripe.

Look forward to hearing more from Stripe 👍

fivetran-joemarkiewicz commented 3 years ago

Hi @dimoschi, were you able to get a response from Stripe?

dimoschi commented 3 years ago

Hey @fivetran-joemarkiewicz unfortunately for us, I don't have any solid answer yet. Will keep you posted, but please can you contact Stripe, maybe they can provide you with a better answer.

fivetran-joemarkiewicz commented 3 years ago

Thanks for touching base on this @dimoschi. Sorry to hear you haven't received any answer yet 😞

I will reach out on my end and see if we can find any understanding. I will keep you posted here.

fivetran-joemarkiewicz commented 3 years ago

@dimoschi I was able to loop up with our product team and came to an understanding of this.

Essentially, the reason for this addition was due to a legacy issue with Stripe. Stripe duplicates some records with both sub_ and sli_ id prefixes, which used to be our primary key. We ignored everything starting with sub_, until we introduced a new primary key called unique_id , at which point we started syncing sub_ records again. so if unique_id is part of the dbt model, we can probably remove the sub_ filter. With this I will add a task for our team to work through removing the filter and adding the unique_id to our Stripe package.

I will post back here once we start working on this so you can give it a try before release. Thanks!

dimoschi commented 3 years ago

@fivetran-joemarkiewicz thank you for the explanation provided and hopefully this can be resolved soon, as I am required to fork and change that piece of code to make it work for us.

However, shouldn't this package work regardless the integration between Fivetran and Stripe?

fivetran-joemarkiewicz commented 3 years ago

Hi @dimoschi this package is designed to work with Stripe data that is synced using Fivetran. It could be used outside of the realm of Fivetran, but would require some rework if the schema is not the same.

I apologize as I have not been able to look into this issue as much as I would like and we are limited as we do not have access to live Stripe data which serves as a development hurdle. As to be safe to not break existing users reports and so you may be able to not fork this repo I will add a variable to this package which will allow you to turn these filters off and still include the sub records. I should have a working branch ready in a bit which you may try out. If that solution works for you then we can move forward with cutting a new release.

Thanks!

fivetran-joemarkiewicz commented 3 years ago

@dimoschi I have just created a branch feature/sub-filter-removed which will allow you to disable this filter. Would you be able to use the below config in your packages.yml:

packages:
    - git: https://github.com/fivetran/dbt_stripe_source.git
      revision: feature/sub-filter-removed
      warn-unpinned: false

as well as add the below variable config to your dbt_project.yml:

vars:
  using_invoice_line_sub_filter: False

Please let me know if this works for your use case. If so, we can move forward with adding this to the next release of this package.

Thank you!

dimoschi commented 3 years ago

@fivetran-joemarkiewicz thank you for the feature branch. I'll try it tomorrow morning and will provide my feedback.

No need to apologise, I understand that you have a lot of things going on and I appreciate that you took the time to resolve the issue.

fivetran-joemarkiewicz commented 3 years ago

@dimoschi thanks again for trying this out! This variable has been included in the v0.4.2 release of this package which should be live on the dbt hub at the top of the hour.

Please feel free to open a new issue if you have any other questions 😄