Closed: saraleon1 closed this issue 1 year ago
@saraleon1 Thanks for opening!
If the only error had been defining the test as `not null` instead of `not_null`, we would have raised a more helpful error message (still not amazing, but at least better):
```
Compilation Error in test not_model_a_id (models/schema.yml)
  'test_not' is undefined. This can happen when calling a macro that does not exist. Check for typos and/or install package dependencies with "dbt deps".
```
The misspelled `desciption` and `colums` are more insidious. It's pretty surprising that this doesn't raise a validation error:
```yml
version: 2
models:
  - name: model_a
    random_key: does nothing
```
@gshank I know we do basic validation for the `models` block when reading files, to make sure we're not setting parsing up for horrible failure later on. Do you have a sense of where we could validate that the keys in a node update block match our expected/supported keys? It's not like `random_key` is doing anything above; it doesn't appear as a node property in the manifest.
I haven't looked into this code extensively, but my sense is that we could do something like the following: if there are unrecognized top-level keys left in the `_extra` dict, raise a parsing error somewhere around here:
```python
if block.target._extra:
    raise ParsingException(f"Extra keys not allowed! {block.target._extra}")
```
```
$ dbt --no-partial-parse test
16:35:29 | [ info ] | Running with dbt=1.0.0-rc1
16:35:29 | [ error ] | Encountered an error:
Parsing Error
  Extra keys not allowed! {'random_key': 'does nothing'}
```
I do think this could be a potential breaking change, in the sense of raising an error for existing projects that have cruft or misspelled properties hanging around. This feels like a great thing to sneak in for v1.0.0 if it would be simple, or a later minor version if it would be a much larger lift.
There's some minimal validation done in `get_key_dicts`, but we don't actually validate anywhere that the dictionaries contain only valid keys; we basically set the values into the attributes when constructing objects, so invalid keys are just ignored. Better validation seems like a good idea.

I'm not exactly sure of the best place and way to do it. If we validate while everything is still a dictionary, then we have to update the valid keys whenever a config or attribute is added, and the check would have to be specific to the patch type. But validating as an object isn't going to work, because we can't set attributes that don't exist. I think the best approach would be to have validation code and separate validation data for each entry type, and validate somewhere around `get_key_dicts`, with separate validation methods for each schema subparser.
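As a rough illustration of that suggestion, validating each entry while it is still a plain dict, with a separate set of allowed keys per patch type, could look like the sketch below. The key sets, function name, and exception class here are hypothetical, not actual dbt-core internals:

```python
# Hypothetical sketch of dict-level key validation. The allowed-key sets
# and names below are illustrative, not actual dbt-core internals.
VALID_KEYS_BY_TYPE = {
    "models": {"name", "description", "docs", "meta", "tests", "columns", "config"},
    "sources": {"name", "description", "database", "schema", "tables", "meta"},
}

class ParsingError(Exception):
    pass

def validate_entry_keys(entry: dict, entry_type: str) -> None:
    """Raise if a patch dict contains keys we don't recognize for its type."""
    valid = VALID_KEYS_BY_TYPE[entry_type]
    extra = set(entry) - valid
    if extra:
        raise ParsingError(
            f"Unrecognized {entry_type} properties: {sorted(extra)} "
            f"(valid keys: {sorted(valid)})"
        )

# A misspelled key like 'random_key' would now fail loudly:
try:
    validate_entry_keys({"name": "model_a", "random_key": "does nothing"}, "models")
except ParsingError as exc:
    print(exc)
```

Since the check runs per entry type while everything is still a dictionary, each schema subparser could own its key set, at the cost of keeping those sets in sync as configs and attributes are added.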
Hey team, I've got a relevant one for you... if you have a malformed `dbt_project.yml`:
```yml
name: "my_dbt_project"
version: "1.0.0"
config-version: 2

profile: "snowflake"

model-paths: ["models"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]

target-path: "target"
clean-targets:
  - "target"
  - "dbt_modules"
  - "dbt_packages"

# prettier-ignore
models:
    +materialized:table
    my_dbt_project:
```
A run in dbt CLI provides some helpful hints:
```
$ dbt run
10:00:23  Encountered an error:
Runtime Error
  Syntax error near line 20
  ------------------------------
  17 | # prettier-ignore
  18 | models:
  19 |     +materialized:table
  20 |     my_dbt_project:

  Raw Error:
  ------------------------------
  mapping values are not allowed in this context
    in "<unicode string>", line 20, column 19
```
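For anyone hitting this: the underlying YAML problem is the missing space after the colon in `+materialized:table`. YAML reads that whole line as one plain scalar rather than a key/value pair, so the `my_dbt_project:` mapping key on the next line becomes illegal. A minimal demonstration with PyYAML (assuming `pyyaml` is installed; the exact error wording varies by parser):

```python
import yaml

# Missing space after the colon: '+materialized:table' is read as a single
# plain scalar, so the mapping key on the next line is a syntax error.
broken = """\
models:
  +materialized:table
  my_dbt_project:
"""

# With the space, the config parses as a nested mapping, as intended.
fixed = """\
models:
  +materialized: table
  my_dbt_project:
"""

try:
    yaml.safe_load(broken)
except yaml.YAMLError as exc:
    print(f"broken: {exc}")

print(f"fixed: {yaml.safe_load(fixed)}")
```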
But because no dbt logs are generated from the Runtime Error, the run history in dbt Cloud shows nothing useful: you have no idea what's going on.
I suppose my questions are:
Thanks :)
@jeremyyeo I think the resolution for that discrepancy would be the proposal in #4836, i.e. that every exception `dbt-core` throws should always also be a log message.
(@nathaniel-may Does that sound right to you, or am I misunderstanding that proposal?)
@jtcohen6 Yes, I imagine we would want to log errors before they are thrown whenever possible. I'm imagining having a single exception event type that wraps the parent exception to achieve this with minimal code.
I would also like to add, on the key-validation front, that optional keys like `description` quietly fail completely if they are misspelled. No error or warning, just no results in the docs. I spent about 20 minutes before seeing that I was missing the `i` in `tion` (`descripton`), which is why it wasn't appearing in the docs even though everything was running just fine. Maybe a warning for keys that are optional but not recognized?
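A cheap way to generate that kind of warning is the stdlib `difflib` module, which can turn an unrecognized key into a "did you mean?" hint. A minimal sketch (`KNOWN_KEYS` is an illustrative subset, not dbt's real property list):

```python
import difflib
from typing import Optional

# Illustrative subset of recognized properties; not dbt's actual list.
KNOWN_KEYS = {"name", "description", "columns", "tests", "meta", "config"}

def did_you_mean(unknown_key: str) -> Optional[str]:
    """Return the closest recognized key, if any, for a warning message."""
    matches = difflib.get_close_matches(unknown_key, KNOWN_KEYS, n=1)
    return matches[0] if matches else None

print(did_you_mean("descripton"))  # -> description
print(did_you_mean("colums"))     # -> columns
```

`get_close_matches` uses a similarity cutoff (0.6 by default), so genuinely unrelated keys return no suggestion rather than a misleading one.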
This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.
I was about to remove the `stale` label and add `paper_cut`, but I see that we already have a newer "paper cut" issue that's basically the same as this one: https://github.com/dbt-labs/dbt-core/issues/5605
I'm going to close this older one as a duplicate of that newer one. This is still very much a UX improvement that's on my mind!
Is there an existing issue for this?
Current Behavior
A PR job started failing with no clear error message available to the end user. The only error surfaced was:
Upon inspection of the debug logs, the following log line was identified:
Upon inspection of the more robust logging available to dbt Cloud support, this stack trace was identified:
From there, we compared the git changes between this PR job and the previously successful PR job, and identified 2 new sources, 3 new models, and 3 new tests.
From inspection of these changes, the source of the issue was identified as misspellings in the new schema.yml files, as follows:
These are misspelled invocations of a built-in test and 2 common YAML keys. dbt Core should be able to surface misspellings like this more obviously to the end user.
Expected Behavior
Helpful error messaging is surfaced to the end user when there are misspelled keys in the YAML file.
Steps To Reproduce
1. Misspell a key in a `schema.yml` file, such as:
2. Save changes and run `dbt run`
3. Observe the opaque error messaging available
Relevant log output
No response
Environment
What database are you using dbt with?
bigquery
Additional Context
No response