Closed Sunnyinho closed 11 months ago
Hi @Sunnyinho thank you for opening this issue.
Would you be able to share a screenshot of the results for this test failure? It looks like there is 1 result where this test is failing. I would be curious to see the record and where the duplicate is occurring, which our test is not checking.
Thanks!
Hi @fivetran-joemarkiewicz thanks for replying. By result for test failure did you mean the logs of the error?
Sure thing, happy to clarify!
By result of the failure I mean the actual results in the table that are triggering this test failure. You can find the SQL used to run this test in the target folder, but the code should look something like the following:
with validation_errors as (
    select
        connector_id, destination_id, message_data, created_at
    from fivetran_platform__schema_changelog
    group by connector_id, destination_id, message_data, created_at
    having count(*) > 1
)
select *
from validation_errors
Would you be able to run this, adjusted for your database/schema, so we can inspect the record that is triggering the failure and see what the duplicates are in this scenario? You will likely need to run an additional query that doesn't perform the aggregation to inspect the duplicate records in more detail. This second query would be the most helpful for understanding what is (or is not) different between the duplicates.
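For anyone following along, that second, non-aggregated query could look roughly like the sketch below: join the model back to the grouped keys so every column of each duplicated row is returned. This is only an illustration under stated assumptions. It runs the SQL against an in-memory SQLite table as a stand-in for the warehouse, the table is cut down to five columns, and the `message_data` values and the third row are shortened, hypothetical placeholders.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the warehouse, and the table
# is a cut-down version of the model with placeholder message_data values.
conn = sqlite3.connect(":memory:")
conn.execute("""
    create table fivetran_platform__schema_changelog (
        connector_id text, destination_id text, message_data text,
        created_at text, event_subtype text
    )
""")
conn.executemany(
    "insert into fivetran_platform__schema_changelog values (?, ?, ?, ?, ?)",
    [
        ("wildfire_antacid", "flavoring_epistle", '{"type":"DROP_COLUMN"}',
         "2022-04-08 19:00:14.756", "alter_table"),
        ("wildfire_antacid", "flavoring_epistle", '{"type":"DROP_COLUMN"}',
         "2022-04-08 19:00:14.756", "alter_table"),
        # Hypothetical non-duplicated row, included only for contrast.
        ("wildfire_antacid", "flavoring_epistle", '{"type":"ADD_COLUMN"}',
         "2022-04-09 10:00:00.000", "alter_table"),
    ],
)

# Same grouping as the test, then joined back to the model so that all
# columns of every duplicated row come back, not just the grouped keys.
# Note: an equality join will not match rows whose key columns are NULL.
duplicate_rows = conn.execute("""
    with validation_errors as (
        select connector_id, destination_id, message_data, created_at
        from fivetran_platform__schema_changelog
        group by connector_id, destination_id, message_data, created_at
        having count(*) > 1
    )
    select changelog.*
    from fivetran_platform__schema_changelog as changelog
    inner join validation_errors
        on changelog.connector_id = validation_errors.connector_id
        and changelog.destination_id = validation_errors.destination_id
        and changelog.message_data = validation_errors.message_data
        and changelog.created_at = validation_errors.created_at
""").fetchall()

for row in duplicate_rows:
    print(row)
```

Comparing the returned rows column by column shows whether any untested field differs between the copies.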
select * from analytics_dev.fivetran_platform.fivetran_platform__schema_changelog where connector_id = 'wildfire_antacid' and created_at = '2022-04-08 19:00:14.756' ;
Found the duplicate record
Great! Would you be able to share the results as a screenshot? I am curious whether there is a field (which we are not testing) that differs between the two records, or whether they are in fact true duplicates.
CONNECTOR_ID | CONNECTOR_NAME | DESTINATION_ID | DESTINATION_NAME | CREATED_AT | EVENT_SUBTYPE | MESSAGE_DATA | TABLE_NAME | SCHEMA_NAME |
---|---|---|---|---|---|---|---|---|
wildfire_antacid | citizen_sheets.performance | flavoring_epistle | snowflake_raw | 2022-04-08 19:00:14.756 | alter_table | "{\"type\":\"DROP_COLUMN\",\"table\":\"performance\",\"properties\":{\"columnName\":\"s_2_incidents-deprecated-e79ae695-63ec-490c-b993-2b7ca3129455\"}}" | "performance" | NULL |
wildfire_antacid | citizen_sheets.performance | flavoring_epistle | snowflake_raw | 2022-04-08 19:00:14.756 | alter_table | "{\"type\":\"DROP_COLUMN\",\"table\":\"performance\",\"properties\":{\"columnName\":\"s_2_incidents-deprecated-e79ae695-63ec-490c-b993-2b7ca3129455\"}}" | "performance" | NULL |
I see these are true duplicates.
Thanks for sharing @Sunnyinho! I would agree that these do in fact seem to be true duplicates. Since that is the case, I believe this may be a case of "test failed successfully" where there should not be duplicate records but the package test identified them as such.
I believe this duplicate is actually originating from the connector. Are you able to confirm that this connector_id has these duplicates in the raw log table? If so, then there is not much we will be able to fix within the package. Instead, I would recommend opening a support ticket for our connector team to investigate the origin of the duplicate log record.
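One way to check the raw table for this is sketched below: group the connector's log rows on their content columns and keep only the groups that occur more than once. Treat it as a rough illustration rather than the package's own logic. The table name `log`, the column names, and the sample rows (including the `id_*` values) are assumptions made for the example, and in-memory SQLite stands in for the warehouse.

```python
import sqlite3

# Assumption: a simplified raw log table in in-memory SQLite; the column
# names and all rows are hypothetical placeholders for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("""
    create table log (
        id text, connector_id text, message_event text,
        message_data text, time_stamp text
    )
""")
conn.executemany(
    "insert into log values (?, ?, ?, ?, ?)",
    [
        ("id_1", "wildfire_antacid", "alter_table",
         '{"type":"DROP_COLUMN"}', "2022-04-08 19:00:14.756"),
        ("id_2", "wildfire_antacid", "alter_table",
         '{"type":"DROP_COLUMN"}', "2022-04-08 19:00:14.756"),
        ("id_3", "wildfire_antacid", "sync_start",
         None, "2022-04-08 19:00:10.000"),
    ],
)

# Group on the content columns (everything except id). Any group with
# more than one row is a log event that was recorded more than once.
raw_duplicates = conn.execute("""
    select connector_id, message_event, message_data, time_stamp,
           count(*) as copies,
           count(distinct id) as distinct_ids
    from log
    where connector_id = 'wildfire_antacid'
    group by connector_id, message_event, message_data, time_stamp
    having count(*) > 1
""").fetchall()

for row in raw_duplicates:
    print(row)
```

If `copies` and `distinct_ids` are both greater than 1, the same event was written under different ids, which would point at the connector rather than the package.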
I found the duplicate record in the raw log table (RAW.FIVETRAN_LOGS.LOG). Since the package uses the data from the fivetran_logs schema, will the issue be solved if I delete one of those duplicate records from the table?
ID | CONNECTOR_ID | EVENT | MESSAGE_EVENT | MESSAGE_DATA | SYNC_ID | TIME_STAMP | _FIVETRAN_SYNCED |
---|---|---|---|---|---|---|---|
ImWd0lRHnOsujN+G5PdtQqg6adU= | wildfire_antacid | INFO | alter_table | "{\"type\":\"DROP_COLUMN\",\"table\":\"performance\",\"properties\":{\"columnName\":\"s_2_incidents-deprecated-e79ae695-63ec-490c-b993-2b7ca3129455\"}}" | 74795cbf-6de7-4cc5-986e-4186fad3fa88 | 2022-04-08 19:00:14.756 +0000 | 2022-04-12 07:28:55.376 +0000 |
gDbpKzEb7HnYVMf1gt/V5JpzHgc= | wildfire_antacid | INFO | alter_table | "{\"type\":\"DROP_COLUMN\",\"table\":\"performance\",\"properties\":{\"columnName\":\"s_2_incidents-deprecated-e79ae695-63ec-490c-b993-2b7ca3129455\"}}" | 74795cbf-6de7-4cc5-986e-4186fad3fa88 | 2022-04-08 19:00:14.756 +0000 | 2022-04-12 07:28:55.376 +0000 |
Thanks for investigating the raw data @Sunnyinho! Correct, removing the duplicate from the raw table will resolve this test failure. I would still consider opening a support ticket to understand exactly why this duplicate was introduced, so that you have a better understanding of the cause in case it occurs again.
In the meantime, I will mark this ticket as done and won't fix since this was something that was introduced via the connector and identified via the package. Thanks for raising this and glad we were able to identify the root of the test failure!
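For reference, removing the extra copy from the raw table could be sketched as below: keep one id per group of otherwise identical rows and delete the rest. This is an illustration only, run against an in-memory SQLite table standing in for the warehouse. The simplified column list, the shortened `message_data` values, and the third row are assumptions, and the delete syntax should be verified against your own warehouse (ideally on a backup copy) before touching real data.

```python
import sqlite3

# Assumption: a simplified stand-in for the raw log table. The two ids
# are the ones shared above; message_data is a shortened placeholder and
# the third row is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("""
    create table log (
        id text, connector_id text, message_event text,
        message_data text, sync_id text, time_stamp text
    )
""")
conn.executemany(
    "insert into log values (?, ?, ?, ?, ?, ?)",
    [
        ("ImWd0lRHnOsujN+G5PdtQqg6adU=", "wildfire_antacid", "alter_table",
         '{"type":"DROP_COLUMN"}', "74795cbf-6de7-4cc5-986e-4186fad3fa88",
         "2022-04-08 19:00:14.756"),
        ("gDbpKzEb7HnYVMf1gt/V5JpzHgc=", "wildfire_antacid", "alter_table",
         '{"type":"DROP_COLUMN"}', "74795cbf-6de7-4cc5-986e-4186fad3fa88",
         "2022-04-08 19:00:14.756"),
        ("hypothetical_unique_id", "wildfire_antacid", "sync_start",
         None, "74795cbf-6de7-4cc5-986e-4186fad3fa88",
         "2022-04-08 19:00:10.000"),
    ],
)

# Keep the smallest id in each group of identical content and delete
# every other copy. Rows that are already unique are left untouched.
conn.execute("""
    delete from log
    where id not in (
        select min(id)
        from log
        group by connector_id, message_event, message_data,
                 sync_id, time_stamp
    )
""")
conn.commit()

remaining_ids = [row[0] for row in conn.execute("select id from log")]
print(remaining_ids)
```

Which of the two ids survives is arbitrary here, since the rows are otherwise identical. After the raw table is cleaned up, the models need to be rebuilt before the test is run again.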
Thanks for helping @fivetran-joemarkiewicz. I have already created a support ticket.
Is there an existing issue for this?
Describe the issue
There is a test failure in one of the models. The model name is fivetran_platform__schema_changelog. The test error log is
The package version used is
With upgraded version 1.1.0 the error still persists.
Steps to reproduce error
$ dbt deps
$ dbt run -s fivetran_logs
# this runs all the models and successfully runs everything
$ dbt test -s fivetran_logs
# this runs all the tests related to the models
The failure will be reproduced at this point.
Relevant error log or model output
Expected behavior
The expectation is to successfully pass the test without any error
dbt Project configurations
Package versions
What database are you using dbt with?
snowflake
dbt Version
Additional Context
No response
Are you willing to open a PR to help address this issue?