fivetran / dbt_hubspot_source

Data models for Hubspot built using dbt.
https://fivetran.github.io/dbt_hubspot_source/
Apache License 2.0

[Feature] Support env_var usage to declare hubspot_database #130

Open erenmirza opened 2 weeks ago

erenmirza commented 2 weeks ago

Is there an existing feature request for this?

Describe the Feature

Currently, the source hubspot database can be specified for the dbt hubspot packages as follows:

vars:
    hubspot_database: your_destination_name
    hubspot_schema: your_schema_name 

We have a non-production instance and a production instance of HubSpot, each synced by Fivetran into its own database.

Currently Jinja is not supported within the dbt vars config.

As a result, dynamically selecting the correct HubSpot database based on environment is not easily achievable. Users would have to override the dbt variables with environment variable values on the command line (for example, via --vars).

Feature: Support env_var usage to declare hubspot_database

Scenario: dbt users have nonprod and production instances of hubspot

How would you implement this feature?

In src_hubspot.yml, replace

sources:
  - name: hubspot
    schema: "{{ var('hubspot_schema', 'hubspot') }}"
    database: "{% if target.type != 'spark'%}{{ var('hubspot_database', target.database) }}{% endif %}"

with something that makes use of env_var. Perhaps:

sources:
  - name: hubspot
    schema: "{{ var('hubspot_schema', 'hubspot') }}"
    database: "{% if target.type != 'spark'%}{{ var('hubspot_database', env_var('HUBSPOT_DATABASE', target.database)) }}{% endif %}"
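With the nesting above, the database would resolve in this order: an explicit hubspot_database var, then the HUBSPOT_DATABASE environment variable, then target.database. A minimal sketch of how an existing project config might interact with that chain, assuming the proposed change were adopted (the database names are hypothetical):

```yaml
# dbt_project.yml in the root project -- illustrative values only
vars:
  # If this var is set, it wins over the HUBSPOT_DATABASE environment variable,
  # so existing projects that already declare it would behave as before.
  # Leaving it unset would let HUBSPOT_DATABASE (e.g. hubspot_nonprod or
  # hubspot_prod, both hypothetical names) pick the database per environment.
  hubspot_database: hubspot_prod
```

Keeping var() as the outermost call is what would make the change backwards compatible: nothing changes for users who never set the environment variable.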

Describe alternatives you've considered

Alternative solutions include

Are you interested in contributing this feature?

Anything else?

No response

fivetran-joemarkiewicz commented 2 weeks ago

Hi @erenmirza thanks for opening this issue!

We have seen similar requests to support env vars in other Fivetran dbt packages. From a technical standpoint I don't see any issue with this approach, and I can see how it would offer a new level of flexibility to users of the dbt package. I do have a few questions for you before proceeding:

In addition to the above, I will need to confirm that this config will not cause any unforeseen issues and works as expected with the various supported orchestration methods (e.g. dbt Core, dbt Cloud, Fivetran Transformations, Fivetran Quickstart). I'll explore this, but I welcome you to open the PR to add this config in the HubSpot source package as well as an example README section detailing the reason for this feature. If there are no issues with this config in our orchestration methods then I imagine we can consider this for an upcoming release!

Thanks!

erenmirza commented 2 weeks ago

Hey @fivetran-joemarkiewicz

Thanks for your response, nice to hear we aren't the only ones who have asked for this!

In our case, the schema names are the same. However, I agree that there is definitely a case for making the schema configurable through env_var as well. The use cases that come to mind:

More than happy to contribute a PR when I have some free time to include:

Keep me posted if you find any downstream issues.
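For reference, a minimal sketch of what extending the same pattern to the schema might look like in src_hubspot.yml. The HUBSPOT_SCHEMA name is an assumption for illustration, not something the package defines today:

```yaml
sources:
  - name: hubspot
    # Resolution order: hubspot_schema var, then the HUBSPOT_SCHEMA
    # environment variable (hypothetical name), then the literal 'hubspot'.
    schema: "{{ var('hubspot_schema', env_var('HUBSPOT_SCHEMA', 'hubspot')) }}"
    # Same fallback chain for the database, ending at target.database.
    database: "{% if target.type != 'spark'%}{{ var('hubspot_database', env_var('HUBSPOT_DATABASE', target.database)) }}{% endif %}"
```

Passing a default as the second argument to env_var keeps the project compiling when the variable is unset. Some orchestrators constrain environment variable names (dbt Cloud, for instance, expects a DBT_ prefix), so the final names may need to differ from the ones sketched here.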

fivetran-joemarkiewicz commented 2 weeks ago

@erenmirza I just checked with our team and it doesn't look like there are any downstream concerns with adding this feature into a future release!

At the moment we will not be prioritizing this enhancement. However, if you are willing to open a PR, we would be happy to review it and consider the updates for an upcoming release. If you do decide to open one, let me know if you have any questions. Thanks!