@dataders, do you think I should extract database and schema information for each model, using generate_database_name and generate_schema_name, to create the schema if it does not exist and use it during table/view materialization?
I looked at the Snowflake adapter but could not find this. Is there a different way to achieve this with a pre-built dbt implementation?
Here's my understanding so far: dbt has a convention that if you run (create) a model in a schema that does not yet exist, dbt will create that schema for you.
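Under the hood, dbt-core works out which schemas are missing at the start of a run and then calls the adapter's create_schema macro; the default implementation in the dbt-adapters global project is roughly the following (paraphrased from memory, so check the repo for the exact current source):

```sql
-- dbt-adapters global project (paraphrased): adapters only override this
-- if the default SQL doesn't work on their platform
{% macro default__create_schema(relation) -%}
  {%- call statement('create_schema') -%}
    create schema if not exists {{ relation.without_identifier() }}
  {% endcall %}
{% endmacro %}
```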
@prdpsvs I think the custom generate_database_name and generate_schema_name macro overrides are red herrings. Here's a simpler reproduction:

- create two warehouses, silver and gold
- in profiles.yml, specify database: silver
- create a test model with a config that overrides database and schema:
```sql
-- my_model.sql
{{
    config(
        database = 'gold',
        schema = 'new_schema'
    )
}}
SELECT 1 as my_column
```
- run dbt run -s my_model

dbt will understand that the new_schema schema needs to be made, but will create it in the silver database.

@prdpsvs I think we're seeing something similar to #161. @Analyticminder, can you share the relevant SQL statements from your logs/dbt.log file?
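In T-SQL terms, the symptom boils down to something like this (an illustration of the behavior described above, not literal adapter output):

```sql
-- the connection follows the profile's database ("silver"), so the schema lands there:
CREATE SCHEMA [new_schema];

-- whereas the model config asked for the schema to exist in the other warehouse:
-- USE [gold];
-- CREATE SCHEMA [new_schema];
```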
However, if I manually create the schemas for WH_Conformed_Silver:

```sql
USE WH_Conformed_Silver;
CREATE SCHEMA [common];
CREATE SCHEMA [sales];
```

and then do a dbt run, everything runs successfully because the schemas now exist. Here are the dbt versions I'm using:
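Rather than running those statements by hand, one stop-gap (just a sketch, not something proposed in this thread; the macro name, the hard-coded schema list, and the assumption that the warehouse accepts dynamic SQL via EXEC are all mine) is a run-operation macro that pre-creates the schemas against whatever database the target connection points at:

```sql
-- macros/ensure_schemas.sql (hypothetical helper); invoke with: dbt run-operation ensure_schemas
{% macro ensure_schemas() %}
    {# schemas the project expects to exist; adjust to match the warehouse #}
    {% set schemas = ['common', 'sales'] %}
    {% for s in schemas %}
        {% do run_query(
            "IF NOT EXISTS (SELECT 1 FROM sys.schemas WHERE name = '" ~ s ~ "') EXEC('CREATE SCHEMA [" ~ s ~ "]')"
        ) %}
    {% endfor %}
{% endmacro %}
```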
@dataders, it looks like the first approach is not working, as described above by @Analyticminder. @Analyticminder, please confirm.
@dataders and @prdpsvs, thanks for responding! I appreciate your time!
To be clear, the functionality should work: dbt should create a schema for a model if it does not exist. However, it is not working in my example with overrides to schema and database. It appears that the use of generate_database_name/generate_schema_name may be altering the code that gets run against Fabric. See below.
Below is the full log from the original post: original_problem_dbt.log. It seems to find the schemas:
```
19:39:29.226772 [debug] [ThreadPool]: Using fabric connection "list_WH_Conformed_Silver"
19:39:29.227772 [debug] [ThreadPool]: On list_WH_Conformed_Silver: /* {"app": "dbt", "dbt_version": "1.7.14", "profile_name": "dbt_fabric_testing", "target_name": "dev", "connection_name": "list_WH_Conformed_Silver"} */
select name as [schema]
from sys.schemas
```
But later in the log it never adds the block of T-SQL code to create the schemas. For some reason it does add another schema ("test") for the same database (WH_Conformed_Silver), but not the "sales" or "common" schemas.
I was also inspired to try a much simpler setup that does not use generate_database_name.sql or generate_schema_name.sql at all, relying only on the config settings in each model to point the database and schema at databases and schemas other than the ones in profiles.yml, and it was successful. The behavior in the logs is much more consistent with what I would expect.
new_successful_run.log
However, I did notice that no matter what I did, it prefixed the deployed schema with "dbo_", so the "sales" schema turned into "dbo_sales". I'm not concerned about this problem, just bringing it up in case it is relevant. Here is some example model code:
```sql
{{
    config(
        database = 'Gold',
        schema = 'sales',
        materialized = 'table'
    )
}}
SELECT * FROM {{ ref('my_model_1') }}
```
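If I understand it correctly, the dbo_ prefix comes from dbt's default generate_schema_name behavior, which concatenates the target schema from profiles.yml (dbo) with the model's custom schema; roughly (paraphrased, the exact source is linked in the reply below):

```sql
{% macro default__generate_schema_name(custom_schema_name, node) -%}
    {%- set default_schema = target.schema -%}
    {%- if custom_schema_name is none -%}
        {{ default_schema }}
    {%- else -%}
        {# e.g. dbo + sales -> dbo_sales #}
        {{ default_schema }}_{{ custom_schema_name | trim }}
    {%- endif -%}
{%- endmacro %}
```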
Thanks, and please let me know if you need any other information.
@Analyticminder,
Note that the adapter will not change this behavior, because the interpretation of schema can differ from project to project. Another reason is that dbt-core/dbt-adapters is what creates schemas, not the individual adapters.
You need to define the schema at the model level so that it is created if it does not exist; this is the right approach. As you alluded to above, please see the default behavior: https://github.com/dbt-labs/dbt-adapters/blob/fd33aafe8276051b313a3c89557b50d224bc50ed/dbt/include/global_project/macros/get_custom_name/get_custom_schema.sql#L21 (see the override sketch after the example below).
```sql
{{
    config(
        database = 'Gold',
        schema = 'sales',
        materialized = 'table'
    )
}}
SELECT * FROM {{ ref('my_model_1') }}
```
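If the dbo_ prefix is not wanted, the usual pattern is a project-level override that uses the custom schema verbatim; a minimal sketch (assuming a macros/generate_schema_name.sql file in your project, nothing specific to this adapter):

```sql
{% macro generate_schema_name(custom_schema_name, node) -%}
    {%- if custom_schema_name is none -%}
        {{ target.schema }}
    {%- else -%}
        {{ custom_schema_name | trim }}
    {%- endif -%}
{%- endmacro %}
```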
I am closing this issue because there is not much we can do at the adapter level to change this behavior at this time. I am following up with the dbt team to figure out alternatives. If anything can be done, I will re-open this issue.
Hello, it appears that schemas are not being created when dbt deploys to different warehouses. For example, my profiles.yml is set to schema dbo and database WH_Stage_Bronze. I have an override for generate_database_name.sql that successfully deploys objects to other warehouses in the same workspace (provided the schemas are set up beforehand, hence the issue I'm having). I also have an override for the schema, which is as follows:
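(The macro body isn't reproduced here; below is a rough sketch of the kind of double-underscore parsing described next — the exact split logic and fallbacks are guesses, not the actual override.)

```sql
-- macros/generate_schema_name.sql (illustrative sketch only)
{% macro generate_schema_name(custom_schema_name, node) -%}
    {#- e.g. CFM__sales__orders -> ['CFM', 'sales', 'orders']; the middle part becomes the schema -#}
    {%- set parts = node.name.split('__') -%}
    {%- if parts | length >= 2 -%}
        {{ parts[1] | trim }}
    {%- elif custom_schema_name is not none -%}
        {{ custom_schema_name | trim }}
    {%- else -%}
        {{ target.schema }}
    {%- endif -%}
{%- endmacro %}
```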
It basically parses the file or model name on double underscores, takes the middle section of the name, and uses that as the schema. When I dbt run the project, I get an error.

What should happen is that files starting with "CFM__" go to the WH_Conformed_Silver warehouse and files starting with "STG__" go to the WH_Stage_Bronze warehouse. The "CFM__common" files should go to the WH_Conformed_Silver.common warehouse/schema and the "CFM__sales" files to the WH_Conformed_Silver.sales warehouse/schema. The only schema that exists on WH_Conformed_Silver at this point is dbo. However, if I manually create the schemas for WH_Conformed_Silver and then do a dbt run, everything runs successfully because the schemas now exist. Here are the dbt versions I'm using:
I am able to successfully perform schema deployment/creation to different databases in dbt-snowflake, so I figured we should be able to do the same for Fabric, since Fabric allows cross-database querying. For a number of reasons, it would be beneficial for dbt-fabric to create these schemas for us. Could someone please help?
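For comparison, the default create_schema SQL renders a fully qualified database.schema name, which Snowflake accepts directly; T-SQL's CREATE SCHEMA has no such three-part form, so a Fabric-side equivalent would have to switch databases first. A purely illustrative sketch (the macro name follows dbt's adapter dispatch convention, but this is not the dbt-fabric implementation, and batching USE with dynamic EXEC here is an assumption):

```sql
{% macro fabric__create_schema(relation) -%}
  {%- call statement('create_schema') -%}
    USE [{{ relation.database }}];
    IF NOT EXISTS (SELECT 1 FROM sys.schemas WHERE name = '{{ relation.schema }}')
        EXEC('CREATE SCHEMA [{{ relation.schema }}]')
  {%- endcall -%}
{%- endmacro %}
```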