Closed stevenconnorg closed 2 months ago
Hi @stevenconnorg
Thanks for the work on this one!
At first, I am a bit worried of moving to varchar(50000)
and changing all the type_string()
in those models to type_large_string()
for all users of the package.
From what I read here , I understand that Redshift does some optimization/memory allocation based on the types and I wouldn't want people who don't have any issue today with Redshift to be negatively impacted by performance following this change.
In your specific case, do you know what specific column(s) was/were creating the issue? And what varchar
size this column has?
We might then be able to target the changes required to fix the reported issues.
Hey @b-per
For some reason, dbt.type_string()
is creating character varying(1)
columns, so this is causing the majority of the issues in these models (also reported in #463)
However, I'm still running into issues with redshift__type_large_string()
in the description
column, with the max string length around ~13k characters.
So maybe we bump the redshift__type_large_string()
length to 15-20 thousand, and then figure out what's going on with the dbt.type_string()
?
Yes, I think it is better to get to the bottom of it.
Do you have jq
installed? If not, could you install it and do (I am on Mac, but we could do something similar in Windows)
cat target/manifest.json | jq '.macros | .[] | select(.name | contains("type_string"))'
and copy the results here?
My results look like
{
"name": "type_string",
"resource_type": "macro",
"package_name": "dbt",
"path": "macros/utils/data_types.sql",
"original_file_path": "macros/utils/data_types.sql",
"unique_id": "macro.dbt.type_string",
"macro_sql": "\n\n{%- macro type_string() -%}\n {{ return(adapter.dispatch('type_string', 'dbt')()) }}\n{%- endmacro -%}\n\n",
"depends_on": {
"macros": [
"macro.dbt.default__type_string"
]
},
"description": "",
"meta": {},
"docs": {
"show": true,
"node_color": null
},
"patch_path": null,
"arguments": [],
"created_at": 1719397302.694219,
"supported_languages": null
}
{
"name": "default__type_string",
"resource_type": "macro",
"package_name": "dbt",
"path": "macros/utils/data_types.sql",
"original_file_path": "macros/utils/data_types.sql",
"unique_id": "macro.dbt.default__type_string",
"macro_sql": "{% macro default__type_string() %}\n {{ return(api.Column.translate_type(\"string\")) }}\n{% endmacro %}",
"depends_on": {
"macros": []
},
"description": "",
"meta": {},
"docs": {
"show": true,
"node_color": null
},
"patch_path": null,
"arguments": [],
"created_at": 1719397302.694326,
"supported_languages": null
}
{
"name": "redshift__type_string",
"resource_type": "macro",
"package_name": "dbt_project_evaluator",
"path": "macros/cross_db_shim/redshift_shims.sql",
"original_file_path": "macros/cross_db_shim/redshift_shims.sql",
"unique_id": "macro.dbt_project_evaluator.redshift__type_string",
"macro_sql": "{%- macro redshift__type_string() -%}\n {{ \"VARCHAR(600)\" }}\n{%- endmacro %}",
"depends_on": {
"macros": []
},
"description": "",
"meta": {},
"docs": {
"show": true,
"node_color": null
},
"patch_path": null,
"arguments": [],
"created_at": 1719397302.810622,
"supported_languages": null
}
Hey @b-per , here are the results from the jq
command:
{
"name": "type_string",
"resource_type": "macro",
"package_name": "dbt",
"path": "macros/utils/data_types.sql",
"original_file_path": "macros/utils/data_types.sql",
"unique_id": "macro.dbt.type_string",
"macro_sql": "\n\n{%- macro type_string() -%}\n {{ return(adapter.dispatch('type_string', 'dbt')()) }}\n{%- endmacro -%}\n\n",
"depends_on": {
"macros": [
"macro.dbt.default__type_string"
]
},
"description": "",
"meta": {},
"docs": {
"show": true,
"node_color": null
},
"patch_path": null,
"arguments": [],
"created_at": 1719348795.668191,
"supported_languages": null
}
{
"name": "default__type_string",
"resource_type": "macro",
"package_name": "dbt",
"path": "macros/utils/data_types.sql",
"original_file_path": "macros/utils/data_types.sql",
"unique_id": "macro.dbt.default__type_string",
"macro_sql": "{% macro default__type_string() %}\n {{ return(api.Column.translate_type(\"string\")) }}\n{% endmacro %}",
"depends_on": {
"macros": []
},
"description": "",
"meta": {},
"docs": {
"show": true,
"node_color": null
},
"patch_path": null,
"arguments": [],
"created_at": 1719348795.668301,
"supported_languages": null
}
{
"name": "type_string",
"resource_type": "macro",
"package_name": "dbt_profiler",
"path": "macros/cross_db_utils.sql",
"original_file_path": "macros/cross_db_utils.sql",
"unique_id": "macro.dbt_profiler.type_string",
"macro_sql": "\n\n{%- macro type_string() -%}\n {{ return(adapter.dispatch(\"type_string\", macro_namespace=\"dbt_profiler\")()) }}\n{%- endmacro -%}\n\n",
"depends_on": {
"macros": [
"macro.dbt_profiler.default__type_string"
]
},
"description": "",
"meta": {},
"docs": {
"show": true,
"node_color": null
},
"patch_path": null,
"arguments": [],
"created_at": 1719348795.795249,
"supported_languages": null
}
{
"name": "default__type_string",
"resource_type": "macro",
"package_name": "dbt_profiler",
"path": "macros/cross_db_utils.sql",
"original_file_path": "macros/cross_db_utils.sql",
"unique_id": "macro.dbt_profiler.default__type_string",
"macro_sql": "{%- macro default__type_string() -%}\n varchar\n{%- endmacro -%}\n\n",
"depends_on": {
"macros": []
},
"description": "",
"meta": {},
"docs": {
"show": true,
"node_color": null
},
"patch_path": null,
"arguments": [],
"created_at": 1719348795.7953038,
"supported_languages": null
}
{
"name": "bigquery__type_string",
"resource_type": "macro",
"package_name": "dbt_profiler",
"path": "macros/cross_db_utils.sql",
"original_file_path": "macros/cross_db_utils.sql",
"unique_id": "macro.dbt_profiler.bigquery__type_string",
"macro_sql": "{%- macro bigquery__type_string() -%}\n string\n{%- endmacro -%}\n\n",
"depends_on": {
"macros": []
},
"description": "",
"meta": {},
"docs": {
"show": true,
"node_color": null
},
"patch_path": null,
"arguments": [],
"created_at": 1719348795.7953548,
"supported_languages": null
}
{
"name": "databricks__type_string",
"resource_type": "macro",
"package_name": "dbt_profiler",
"path": "macros/cross_db_utils.sql",
"original_file_path": "macros/cross_db_utils.sql",
"unique_id": "macro.dbt_profiler.databricks__type_string",
"macro_sql": "{%- macro databricks__type_string() -%}\n string\n{%- endmacro -%}\n\n\n",
"depends_on": {
"macros": []
},
"description": "",
"meta": {},
"docs": {
"show": true,
"node_color": null
},
"patch_path": null,
"arguments": [],
"created_at": 1719348795.795406,
"supported_languages": null
}
{
"name": "redshift__type_string",
"resource_type": "macro",
"package_name": "dbt_project_evaluator",
"path": "macros/cross_db_shim/redshift_shims.sql",
"original_file_path": "macros/cross_db_shim/redshift_shims.sql",
"unique_id": "macro.dbt_project_evaluator.redshift__type_string",
"macro_sql": "{%- macro redshift__type_string() -%}\n {{ \"VARCHAR(600)\" }}\n{%- endmacro %}",
"depends_on": {
"macros": []
},
"description": "",
"meta": {},
"docs": {
"show": true,
"node_color": null
},
"patch_path": null,
"arguments": [],
"created_at": 1719348795.9382882,
"supported_languages": null
}
{
"name": "edr_type_string",
"resource_type": "macro",
"package_name": "elementary",
"path": "macros/utils/data_types/data_type.sql",
"original_file_path": "macros/utils/data_types/data_type.sql",
"unique_id": "macro.elementary.edr_type_string",
"macro_sql": "\n\n\n{%- macro edr_type_string() -%}\n {{ return(adapter.dispatch('edr_type_string', 'elementary')()) }}\n{%- endmacro -%}\n\n",
"depends_on": {
"macros": [
"macro.elementary.postgres__edr_type_string"
]
},
"description": "",
"meta": {},
"docs": {
"show": true,
"node_color": null
},
"patch_path": null,
"arguments": [],
"created_at": 1719348796.390201,
"supported_languages": null
}
{
"name": "default__edr_type_string",
"resource_type": "macro",
"package_name": "elementary",
"path": "macros/utils/data_types/data_type.sql",
"original_file_path": "macros/utils/data_types/data_type.sql",
"unique_id": "macro.elementary.default__edr_type_string",
"macro_sql": "{% macro default__edr_type_string() %}\n {# Redshift #}\n {% do return(\"varchar(4096)\") %}\n{% endmacro %}",
"depends_on": {
"macros": []
},
"description": "",
"meta": {},
"docs": {
"show": true,
"node_color": null
},
"patch_path": null,
"arguments": [],
"created_at": 1719348796.390293,
"supported_languages": null
}
{
"name": "postgres__edr_type_string",
"resource_type": "macro",
"package_name": "elementary",
"path": "macros/utils/data_types/data_type.sql",
"original_file_path": "macros/utils/data_types/data_type.sql",
"unique_id": "macro.elementary.postgres__edr_type_string",
"macro_sql": "{% macro postgres__edr_type_string() %}\n {% if var(\"sync\", false) %}\n {% do return(\"text\") %}\n {% else %}\n {% do return(\"varchar(4096)\") %}\n {% endif %}\n{% endmacro %}",
"depends_on": {
"macros": []
},
"description": "",
"meta": {},
"docs": {
"show": true,
"node_color": null
},
"patch_path": null,
"arguments": [],
"created_at": 1719348796.390466,
"supported_languages": null
}
{
"name": "snowflake__edr_type_string",
"resource_type": "macro",
"package_name": "elementary",
"path": "macros/utils/data_types/data_type.sql",
"original_file_path": "macros/utils/data_types/data_type.sql",
"unique_id": "macro.elementary.snowflake__edr_type_string",
"macro_sql": "{% macro snowflake__edr_type_string() %}\n {# Default max varchar size in Snowflake is 16MB #}\n {% do return(\"varchar\") %}\n{% endmacro %}",
"depends_on": {
"macros": []
},
"description": "",
"meta": {},
"docs": {
"show": true,
"node_color": null
},
"patch_path": null,
"arguments": [],
"created_at": 1719348796.3905568,
"supported_languages": null
}
{
"name": "bigquery__edr_type_string",
"resource_type": "macro",
"package_name": "elementary",
"path": "macros/utils/data_types/data_type.sql",
"original_file_path": "macros/utils/data_types/data_type.sql",
"unique_id": "macro.elementary.bigquery__edr_type_string",
"macro_sql": "{% macro bigquery__edr_type_string() %}\n {# Default max string size in Bigquery is 65K #}\n {% do return(\"string\") %}\n{% endmacro %}",
"depends_on": {
"macros": []
},
"description": "",
"meta": {},
"docs": {
"show": true,
"node_color": null
},
"patch_path": null,
"arguments": [],
"created_at": 1719348796.390642,
"supported_languages": null
}
{
"name": "spark__edr_type_string",
"resource_type": "macro",
"package_name": "elementary",
"path": "macros/utils/data_types/data_type.sql",
"original_file_path": "macros/utils/data_types/data_type.sql",
"unique_id": "macro.elementary.spark__edr_type_string",
"macro_sql": "{% macro spark__edr_type_string() %}\n {% do return(\"string\") %}\n{% endmacro %}",
"depends_on": {
"macros": []
},
"description": "",
"meta": {},
"docs": {
"show": true,
"node_color": null
},
"patch_path": null,
"arguments": [],
"created_at": 1719348796.390722,
"supported_languages": null
}
{
"name": "athena__edr_type_string",
"resource_type": "macro",
"package_name": "elementary",
"path": "macros/utils/data_types/data_type.sql",
"original_file_path": "macros/utils/data_types/data_type.sql",
"unique_id": "macro.elementary.athena__edr_type_string",
"macro_sql": "{% macro athena__edr_type_string() %}\n {% do return(\"varchar\") %}\n{% endmacro %}",
"depends_on": {
"macros": []
},
"description": "",
"meta": {},
"docs": {
"show": true,
"node_color": null
},
"patch_path": null,
"arguments": [],
"created_at": 1719348796.390801,
"supported_languages": null
}
{
"name": "trino__edr_type_string",
"resource_type": "macro",
"package_name": "elementary",
"path": "macros/utils/data_types/data_type.sql",
"original_file_path": "macros/utils/data_types/data_type.sql",
"unique_id": "macro.elementary.trino__edr_type_string",
"macro_sql": "{% macro trino__edr_type_string() %}\n {% do return(\"varchar\") %}\n{% endmacro %}",
"depends_on": {
"macros": []
},
"description": "",
"meta": {},
"docs": {
"show": true,
"node_color": null
},
"patch_path": null,
"arguments": [],
"created_at": 1719348796.390879,
"supported_languages": null
}
With #475 implemented and released in the latest version, I think that this is not needed anymore. Let's open a new issue if there is still an issue.
This is a:
Link to Issue
Description & motivation
type_large_string()
inbase_source_columns
andbase_node_columns
redshift__type_large_string
lengthinsert
errorIntegration Test Screenshot
Checklist