fivetran / dbt_shopify

Fivetran's Shopify dbt package
https://fivetran.github.io/dbt_shopify/
Apache License 2.0
51 stars 39 forks source link

Error in models/shopify__transactions.sql Related to Calculated Exchange Rates for Refunds #69

Open jmussitsch opened 8 months ago

jmussitsch commented 8 months ago

Is there an existing issue for this?

Describe the issue

The exchange rate related columns are calculated via this code:

    select
        *,
        coalesce(cast(nullif({{ fivetran_utils.json_parse("receipt",["charges","data",0,"balance_transaction","exchange_rate"]) }}, '') as {{ dbt.type_numeric() }} ),1) as exchange_rate,
        coalesce(cast(nullif({{ fivetran_utils.json_parse("receipt",["charges","data",0,"balance_transaction","exchange_rate"]) }}, '') as {{ dbt.type_numeric() }} ),1) * amount as currency_exchange_calculated_amount
    from joined

However, the structure of the JSON object in the receipt column is different for refunds. This leads to the exchange_rate field always falling back to the coalesced value of 1 for refunds transactions. This is subtle for conversions close to 1.0 but become very problematic for conversions rates that are not close to 1.0. For example, CRC which has a rate of ~ 0.0019 USD to 1 CRC.

Relevant error log or model output

Here are examples of the JSON for captures versus refunds (I removed all non-relevant fields, etc.):

capture

{
  "charges": {
    "data": [
      {
        "balance_transaction": {
          "exchange_rate": 0.00184531
        }
      }
    ]
  }
}

refund

{
  "balance_transaction": {
    "exchange_rate": 0.00187629,
    "id": "txn_12345",
    "object": "balance_transaction"
  },
  "charge": {
    "balance_transaction": "txn_12345"
  }
}

### Expected behavior

The expected behavior is to calculate the correct exchange rates, etc. for refund transactions.

### dbt Project configurations

name: chabi version: 1.0.0 config-version: 2 profile: '{{ env_var(''DBT_PROFILE'', '''') }}' model-paths:

Package versions

packages:
- package: dbt-labs/codegen
  version: 0.9.0
- package: dbt-labs/dbt_utils
  version:
  - '>=1.0.0'
  - <2.0.0
- package: fivetran/ad_reporting
  version: 1.5.0
- package: fivetran/klaviyo
  version: 0.5.0
- package: fivetran/shopify
  version: 0.8.1
- package: fivetran/shopify_holistic_reporting
  version: 0.4.0

What database are you using dbt with?

snowflake

dbt Version

Core:
  - installed: 1.5.0
  - latest:    1.7.2 - Update available!

  Your version of dbt-core is out of date!
  You can find instructions for upgrading here:
  https://docs.getdbt.com/docs/installation

Plugins:
  - snowflake: 1.5.0 - Update available!

  At least one plugin is out of date or incompatible with dbt-core.
  You can find instructions for upgrading here:
  https://docs.getdbt.com/docs/installation

Additional Context

No response

Are you willing to open a PR to help address this issue?

fivetran-joemarkiewicz commented 8 months ago

Hi @jmussitsch thank you so much for opening this issue and raising the differences with the refunds JSON object to our team.

This is something we will want to address in our package. Is it consistent with your data that all refunds have a corresponding refund_id in this table? If so, we may be able to address this issue by expanding the logic to be a case when statement to leverage a new JSON parse for the refund object if the refund_id is not null, otherwise we do the original JSON parse for non refunds. If that is not consistent we may also attempt to just insert another coalesce argument which parses the refund object. However, I would like to avoid too much nested coalesce statements if possible. What are your thoughts?

fivetran-reneeli commented 8 months ago

Hi @jmussitsch! We will be folding this into an upcoming sprint and appreciate you flagging this. Just wanted to see if you had any reservations about what Joe suggested above as potential solutions?

jmussitsch commented 8 months ago

Hi @fivetran-joemarkiewicz @fivetran-reneeli ,

I can confirm that running the following SQL yields no rows:

select *
from SHOPIFY__TRANSACTIONS
where KIND = 'refund'
  and REFUND_ID is null;

So seems like there is always a refund_id set for refunds. However, I just realized another problem. That logic re JSON structure only works for the stripe payment gateway. Here is a JSON object for the receipt column for paypal:

{
  "Ack": "Success",
  "Build": "***",
  "CorrelationID": "***",
  "FeeRefundAmount": "0.00",
  "GrossRefundAmount": "11776.28",
  "NetRefundAmount": "11776.28",
  "RefundInfo": {
    "PendingReason": "none",
    "RefundStatus": "Instant"
  },
  "RefundTransactionID": "***",
  "Timestamp": "2023-06-14T21:39:28Z",
  "TotalRefundedAmount": "11776.28",
  "Version": "124",
  "ack": "Success",
  "build": "***",
  "correlation_id": "***",
  "fee_refund_amount": "0.00",
  "fee_refund_amount_currency_id": "MXN",
  "gross_refund_amount": "11776.28",
  "gross_refund_amount_currency_id": "MXN",
  "net_refund_amount": "11776.28",
  "net_refund_amount_currency_id": "MXN",
  "pending_reason": "none",
  "refund_status": "Instant",
  "refund_transaction_id": "***",
  "timestamp": "2023-06-14T21:39:28Z",
  "total_refunded_amount": "11776.28",
  "total_refunded_amount_currency_id": "MXN",
  "version": "124"
}

I would think this would be an issue as well....

fivetran-reneeli commented 7 months ago

Hi @jmussitsch thanks for checking.

Hmm, so looks like json structure for receipt varies by payment gateway. And exchange_rate is necessary to bring into the model because it helps standardize & convert amounts from customer currency into shop currency, so it's definitely a field we need to retain and make sure is persisting correctly.

Ideally we can provide full coverage of exchange rates across all different payment gateways. This looks to be more nuanced than originally expected. My main question is regarding gateways, such as the Paypal one you've shown, that don't provide an exchange rate (based on the json you've shared I'm not seeing it), how do you calculate your shop currency's amount then?

Secondly, other than using exchange_rate to convert transaction amounts to your local currency, are there other use cases? It seems exchange_rate is an important field that we should keep in the models, but being that the json object varies a lot across gateways-- we're contemplating using a regex method to capture it instead, but curious to know your thoughts.

fivetran-reneeli commented 6 months ago

Hi @jmussitsch, we are picking this up this sprint! Wanted to ask-- if you divided currency_exchange_final_amount by amount, technically wouldn't that result in the exchange rate? Since currency_exchange_final_amount is in shop's currency and amount is in local (presentment) currency I believe, I'm thinking dividing these two would result in the exchange rate used at the time of the transaction.

So for CRC, if done this way, would you get a rate close to ~ 0.0019 USD to 1 CRC?

fivetran-reneeli commented 6 months ago

Hi @jmussitsch , just checking up on the above. I'd be curious to know if the currency_exchange_* fields in your raw transactions table populate (if not, do they show up in Shopify UX? That would be then an issue related to our connector.)

And secondly, if you do have those fields, I want to see if the above method would work in calculating exchange rate.