dbt-labs / dbt-rpc

A server that can compile and run queries in the context of a dbt project. Additionally, it provides methods that can be used to list and terminate running processes.
https://docs.getdbt.com/reference/commands/rpc
Apache License 2.0
19 stars 7 forks source link

Trying to 'preview' results on a model that has the 'enabled=false' config results in unhelpful error #79

Closed saraleon1 closed 2 years ago

saraleon1 commented 2 years ago

Current Behaviour:

If you try to 'preview' a model that has the 'enabled=false' configuration, you get the following unhelpful error:

Server error: Runtime Error
got an update_rpc call with an unrecognized rpc: rpc.ncaa_basketball.request

Loom of behaviour Expected Behaviour:

A more obvious message results from the rpc error as to why the model can't be previewed.

Steps to Reproduce:

  1. add the following configuration to an existing model in dbt Cloud:
    {{
    config(
      enabled=false
    )
    }}
  2. Click the 'preview' button
  3. Observe the error message

System Information:

jtcohen6 commented 2 years ago

Sounds a lot like: https://github.com/dbt-labs/dbt-core/issues/4126

I think this is the right repo for it, though, since this touches code that's unique to dbt-rpc.

ChenyuLInx commented 2 years ago

@jtcohen6 the solution you mentioned in dbt-labs/dbt-core#4126 sounds perfect to me.

# In summary
Trying to preview/compile an explicitly disabled resource in dbt Cloud is going to raise a surprising error. I think it should raise an error, but it should be one that's much clearer about what's going on, either by:

- checking to see if the arbitrary SQL bundle maps to a disabled node much sooner, as soon as the Jinja is rendered (?)
- including the word "disabled" in the error message, so at least it offers a hint to the user about what's going on

Do you know what's the earliest point we can know that a node is disabled on the execution side? I guess that's when parsing is done right? Can you link the location in the codebase? Then I think we can just add the extra check on the RPC side right after that to throw a clear error message without doing any of the compilation.

jtcohen6 commented 2 years ago

@ChenyuLInx tl;dr This error is being raised in this step:

https://github.com/dbt-labs/dbt-rpc/blob/b7030cf77495c66c24eb63a85f006f07a6e3ec57/dbt_rpc/rpc/node_runners.py#L43-L45

And we could add a naive check to make sure the node is actually enabled, otherwise raise a more useful error message:

    def compile(self, manifest):
        if not self.node.config.enabled:
            raise dbt.exceptions.raise_compiler_error(
                f'Trying to compile a node that is disabled'
            )
        compiler = self.adapter.get_compiler()
        return compiler.compile_node(self.node, manifest, {}, write=False)

Reproduction case

curl -X POST http://localhost:8580/jsonrpc \
   -H 'Content-Type: application/json' \
   -d '{
    "jsonrpc": "2.0",
    "method": "run_sql",
    "id": "2db9a2fe-9a39-41ef-828c-25e04dd6b07d",
    "params": {
        "timeout": 60,
        "sql": "e3sKICBjb25maWcoCiAgICAgIGVuYWJsZWQ9ZmFsc2UKICApCn19CgpzZWxlY3QgMSBhcyBpZA==",
        "name": "my_first_query"
    }
}'

Submit + poll for result. Before:

{
  "error": {
    "code": 10001,
    "message": "Runtime error",
    "data": {
      "type": "RuntimeException",
      "message": "Runtime Error in rpc my_first_query (from remote system)\n  got an update_rpc call with an unrecognized rpc: rpc.testy.my_first_query",
      "raw_sql": "{{\n  config(\n      enabled=false\n  )\n}}\n\nselect 1 as id",
      "compiled_sql": null,
...
}

After:

{
  "error": {
    "code": 10004,
    "message": "Compilation Error",
    "data": {
      "type": "CompilationException",
      "message": "Compilation Error in rpc my_first_query (from remote system)\n  Trying to compile a node that is disabled",
      "raw_sql": "{{\n  config(\n      enabled=false\n  )\n}}\n\nselect 1 as id",
      "compiled_sql": null,
...
}

Background

The dbt-rpc server does something funky where, after a project's manifest has been fully parsed, a new arbitrary snippet of SQL can be "stuck" into the manifest, and compiled using its other context. Why?

This enables users to do things like previewing this SQL, which actually (behind the scenes) requires registering my_totally_new_macro as a new node in the already-parsed manifest at runtime:

{% macro my_totally_new_macro() %}
...
{% endmacro %}

select {{ my_totally_new_macro() }}

It also supports selecting from an ephemeral model, which actually (behind the scenes) requires interpolating that ephemeral model's SQL as a CTE into the compiled query:

select * from {{ ref('ephemeral_model') }} 

The ephemeral model use case is really what clinches this. In order to support that dynamic CTE interpolation, the RPC server hooks into dbt's compilation logic. That's actually an execution-time thing (gross); it directly manipulates the manifest created by parsing (extra gross).

The error here says got an update_rpc call with an unrecognized rpc: rpc.project_name.request. It's being raised via:

That final error is raised even though the name of this SQL-to-be-compiled is just the generic request. That's true for all the RPC server's arbitrary-bundle-of-SQL requests—it doesn't actually map to the real underlying model, which causes us other headaches (#46). That's actually not the cause of our problem here, though; the config(enabled=False) still takes effect, whatever the name of the bundle-of-SQL node.

ChenyuLInx commented 2 years ago

The background part is super helpful!