Closed andysteinitz closed 9 months ago
Hi @andysteinitz thank you for opening this issue, and thanks for calling out the `try_parse_json` solution! I was able to reproduce the issue and found a fix that works on my end. I swapped `fivetran_utils.json_parse()` for `fivetran_utils.json_extract()`, since `fivetran_utils.json_extract` uses `try_parse_json` for Snowflake. We typically use the `json_parse` version because it lets you specify a path, but in this particular case we only need to go one level deep.
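For context, Snowflake's `try_parse_json` returns NULL for input that is not valid JSON rather than raising an error, which is what makes the swap safe here. A rough Python analogy of that semantics (illustrative only, not fivetran_utils code):

```python
import json

def try_parse_json(raw):
    """Mimic Snowflake's TRY_PARSE_JSON: return None instead of
    raising when the input is not valid JSON."""
    if raw is None:
        return None
    try:
        return json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return None

# Valid JSON parses normally; malformed legacy input yields None.
print(try_parse_json('{"actor": "user@example.com"}'))
print(try_parse_json('skipper_breech'))  # → None, not an error
```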
I have created a test branch; would you be able to test whether it works correctly for you? You can install it using the snippet below.
```yml
packages:
  - git: https://github.com/fivetran/dbt_fivetran_log.git
    revision: test/try-json
    warn-unpinned: false
```
Let me know how it works for you!
Hi @fivetran-catfritz yes, this does allow the model to succeed.
Unfortunately I still can't install the new package, because old log entries are causing the dependent test `unique_fivetran_platform__audit_user_activity_log_id` to fail.
Our raw `fivetran_platform.log` data appears to contain `connector_id` values in the `id` field prior to April 2022 (about the same time as https://github.com/fivetran/dbt_fivetran_log/issues/46 was resolved). In cases where `connector_id` was null, it looks like the log populated `id` with "system".
Let me know how best to proceed with sharing those details here or elsewhere like in a fivetran support ticket.
Hi @andysteinitz thanks for testing that out! It looks like my test branch crossed paths with the latest version release, and my branch missed a test update for that model. I updated the uniqueness test to a combination of `log_id` and `occurred_at`. If you run `dbt deps` again and rerun the tests, does it pass for you now?
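The combined test asserts that no two rows share the same (`log_id`, `occurred_at`) pair. As a minimal illustration of that check (the package actually uses the `dbt_utils.unique_combination_of_columns` SQL test, not Python):

```python
from collections import Counter

def duplicate_combinations(rows, columns):
    """Return key tuples that appear more than once across rows --
    the same condition a combined-uniqueness test flags."""
    counts = Counter(tuple(row[c] for c in columns) for row in rows)
    return [key for key, n in counts.items() if n > 1]

rows = [
    {"log_id": "abc", "occurred_at": "2022-01-01T00:00:00"},
    # Same log_id but a different timestamp no longer fails the test.
    {"log_id": "abc", "occurred_at": "2022-01-02T00:00:00"},
]
print(duplicate_combinations(rows, ["log_id", "occurred_at"]))  # → []
```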
Thank you, confirming the test `dbt_utils_unique_combination_of_columns_fivetran_platform__audit_user_activity_log_id__occurred_at` passes for me.
Side note FYI: the new `dbt deps` process with `package-lock.yml` doesn't seem fully smoothed out. I had to hard-delete my local `package-lock.yml` to get the update to reflect in my local repo; `dbt clean` / `dbt deps` alone wasn't achieving the update, nor was `dbt deps --lock`.
@andysteinitz Thanks for confirming and the additional information! We've also noticed hard deleting the package-lock seems to be required at times, so it's interesting to know you're experiencing this as well.
As for updating the package, we do plan to make a fix for the issue you're seeing; however, we realized we'll have to do a bit more work to find the fixes for all the warehouses we support. I will leave the test branch up if you wish to use it in the meantime, and I will let you know when we know more about the timing. Also, I will be taking some time off starting next week, but another member of my team should be able to help if you have any questions in the meantime!
Hi @andysteinitz I wanted to chime in, as I think I may have found a more scalable solution when accounting for the behavior across warehouses. If you have availability, would you be able to attempt running the following version of the package and let us know if it results in a successful run?
```yml
packages:
  - git: https://github.com/fivetran/dbt_fivetran_log.git
    revision: bugfix/json-actor-filter
    warn-unpinned: false
```
Essentially, this branch includes a change similar to the initial fix from the other ticket you linked: we filter out the records we do not care about. This should ensure the JSON parsing only happens on records that have the proper data to be extracted.
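The filter-before-parse idea can be sketched as follows. This is only an illustration of the approach, not the model's actual SQL, and the field names (`event`, `message_data`, `actor`) are hypothetical:

```python
import json

def extract_actors(log_rows):
    """Filter first, parse second: only the rows we need ever reach
    the JSON parser, so malformed legacy payloads never raise."""
    actors = []
    for row in log_rows:
        # Hypothetical filter: only user-triggered events carry an actor payload.
        if row.get("event") != "user_action":
            continue
        payload = json.loads(row["message_data"])  # safe: filtered rows hold valid JSON
        actors.append(payload.get("actor"))
    return actors

rows = [
    {"event": "user_action", "message_data": '{"actor": "admin@example.com"}'},
    {"event": "transformation", "message_data": "skipper_breech"},  # pre-2021 free text
]
print(extract_actors(rows))  # → ['admin@example.com']
```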
Let us know if this works. If so, we will likely move forward with this approach.
Hello again @fivetran-joemarkiewicz
Yes, your `revision: bugfix/json-actor-filter` version also completed successfully. You can proceed with either solution.
That's great to hear! Thanks for confirming. I will start the formal PR process with the change I suggested. You can likely expect an official release after the holidays.
This fix has since been included in the `main` branch and is live within the v1.4.1 release of the dbt_fivetran_log package. Thanks again @andysteinitz for bringing this to our attention and for helping validate the change before release.
As the issue has been addressed, I will close out this ticket. Please feel free to reopen it or create a new one if you are still experiencing the same or a similar issue following the upgrade.
Is there an existing issue for this?
Describe the issue
This is similar to https://github.com/fivetran/dbt_fivetran_log/issues/46. The use of `fivetran_utils.json_parse` in the release of https://github.com/fivetran/dbt_fivetran_log/pull/98 in v1.3.0 of dbt_fivetran_log does not account for malformed JSON:
https://github.com/fivetran/dbt_fivetran_log/blob/78c4398d9b8f55e79b88c151d600b7c64c56ceb3/models/fivetran_platform__audit_user_activity.sql#L3-L6
`parse_json` results in the error `Error parsing JSON: unknown keyword "skipper_breech", pos 15`. "skipper_breech" is the id of a long-obsolete Transformation in our logs, and it looks like `message_data` values from before 2021-05-12 were not structured as JSON; using `try_parse_json` allows the query to complete successfully.
Relevant error log or model output
Expected behavior
Values not stored as JSON are bypassed instead of triggering an error. In Snowflake we can use `try_parse_json`.
dbt Project configurations
```yml
models:
  fivetran_log:
    schema: fivetran_platform
    staging:
      schema: stg_fivetran_platform
```
Package versions
packages:
What database are you using dbt with?
snowflake
dbt Version
1.7
Additional Context
No response
Are you willing to open a PR to help address this issue?