Unhelpful error messages (which contain Python tracebacks)

Hi,

I'm finally at a point with my program where I can send queries to EdgeDB, and I noticed that error messages are pretty unhelpful.

I'm using the eyre crate (similar to anyhow), and this is the current error I get:

Error: Error while storing the data in EdgeDB.

Caused by:
    EdgeQLSyntaxError: Unexpected ')'

Location:
    src/api/sync.rs:563:5

For some reason, the line and column number within the EdgeDB query source code isn't mentioned, which is pretty unhelpful (especially, because EdgeDB queries can be pretty big, because of the great compossibility of EdgeDB's query language).

If I change my main function to return Result<(), edgedb_tokio::Error>, the displayed error is as follows:

Error: Error(Inner { code: 67174656, messages: ["Unexpected ')'"], error: None, headers: {257: b"Traceback (most recent call last):\n File \"/home/user/.local/share/edgedb/portable/2.1/lib/python3.10/site-packages/edb/common/parsing.py\", line 420, in parse\n self.parser.token(token)\n File \"/home/user/.local/share/edgedb/portable/2.1/lib/python3.10/site-packages/parsing/lrparser.py\", line 57, in token\n self._act(token, tokenSpec) # type: ignore\n File \"/home/user/.local/share/edgedb/portable/2.1/lib/python3.10/site-packages/parsing/lrparser.py\", line 81, in _act\n raise UnexpectedToken(\"Unexpected token: %r\" % sym)\nparsing.errors.UnexpectedToken: Unexpected token: <Token RPAREN \")\">\n\nThe above exception was the direct cause of the following exception:\n\nTraceback (most recent call last):\n File \"/home/user/.local/share/edgedb/portable/2.1/lib/python3.10/site-packages/edb/server/compiler/compiler.py\", line 1738, in _compile\n return self._try_compile(ctx=ctx, source=source)\n File \"/home/user/.local/share/edgedb/portable/2.1/lib/python3.10/site-packages/edb/server/compiler/compiler.py\", line 1764, in _try_compile\n statements = edgeql.parse_block(source)\n File \"/home/user/.local/share/edgedb/portable/2.1/lib/python3.10/site-packages/edb/edgeql/parser/__init__.py\", line 77, in parse_block\n return parser.parse(source)\n File \"/home/user/.local/share/edgedb/portable/2.1/lib/python3.10/site-packages/edb/common/parsing.py\", line 433, in parse\n raise self.get_exception(\nedb.errors.EdgeQLSyntaxError: Unexpected ')'\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n File \"/home/user/.local/share/edgedb/portable/2.1/lib/python3.10/site-packages/edb/common/parsing.py\", line 420, in parse\n self.parser.token(token)\n File \"/home/user/.local/share/edgedb/portable/2.1/lib/python3.10/site-packages/parsing/lrparser.py\", line 57, in token\n self._act(token, tokenSpec) # type: ignore\n File \"/home/user/.local/share/edgedb/portable/2.1/lib/python3.10/site-packages/parsing/lrparser.py\", line 81, in _act\n raise UnexpectedToken(\"Unexpected token: %r\" % sym)\nparsing.errors.UnexpectedToken: Unexpected token: <Token RPAREN \")\">\n\nThe above exception was the direct cause of the following exception:\n\nTraceback (most recent call last):\n File \"/home/user/.local/share/edgedb/portable/2.1/lib/python3.10/site-packages/edb/server/compiler_pool/worker_proc.py\", line 55, in worker\n res = meth(*args)\n File \"/home/user/.local/share/edgedb/portable/2.1/lib/python3.10/site-packages/edb/server/compiler_pool/worker.py\", line 160, in compile\n units, cstate = COMPILER.compile(\n File \"/home/user/.local/share/edgedb/portable/2.1/lib/python3.10/site-packages/edb/server/compiler/compiler.py\", line 2207, in compile\n unit_group = self._compile(ctx=ctx, source=source)\n File \"/home/user/.local/share/edgedb/portable/2.1/lib/python3.10/site-packages/edb/server/compiler/compiler.py\", line 1747, in _compile\n raise denormalized_err\n File \"/home/user/.local/share/edgedb/portable/2.1/lib/python3.10/site-packages/edb/server/compiler/compiler.py\", line 1745, in _compile\n self._try_compile(ctx=ctx, source=original)\n File \"/home/user/.local/share/edgedb/portable/2.1/lib/python3.10/site-packages/edb/server/compiler/compiler.py\", line 1764, in _try_compile\n statements = edgeql.parse_block(source)\n File \"/home/user/.local/share/edgedb/portable/2.1/lib/python3.10/site-packages/edb/edgeql/parser/__init__.py\", line 77, in parse_block\n return parser.parse(source)\n File \"/home/user/.local/share/edgedb/portable/2.1/lib/python3.10/site-packages/edb/common/parsing.py\", line 433, in parse\n raise self.get_exception(\nedb.errors.EdgeQLSyntaxError: Unexpected ')'\n", 65530: b"1045", 65524: b"17", 65527: b"18", 65528: b"17", 65529: b"1044", 65525: b"16", 65522: b"1045", 65526: b"33", 65521: b"1044", 65523: b"33"} })

...which isn't great either.

If I unwrap the Result that contains the error, the displayed error is similar unhelpful.

Unfortunately, edgedb_errors::Error isn't documented (and seems more complicated than it should need to be), so I ended up adding the following code to figure out what is going on:

        .map_err(|e| {
            for (k, v) in e.headers() {
                let v = std::str::from_utf8(v.as_ref()).unwrap();
                println!("{k} -> {v}")
            }
            println!("kind_name => {}", e.kind_name());
            println!("kind_debug => {}", e.kind_debug());
            println!("initial_message => {:?}", e.initial_message());
            println!("hint => {:?}", e.hint());
            println!("details => {:?}", e.details());
            println!("position_start => {:?}", e.position_start());
            println!("position_end => {:?}", e.position_end());
            println!("line => {:?}", e.line());
            println!("column => {:?}", e.column());
            println!("code => {}", e.code());
            println!("server_traceback => {:?}", e.server_traceback());
            println!(
                "contexts => {:#?}",
                e.contexts().map(|x| format!("{x:?}")).collect::<Vec<_>>()
            );
            println!(
                "chain => {:#?}",
                e.chain().map(|x| format!("{x:?}")).collect::<Vec<_>>()
            );

            e
        })?;

Of that code, the following lines were useful:

            println!("line => {:?}", e.line());
            println!("column => {:?}", e.column());

It would be great, if the line and column number could be included with displayed errors.

In the meantime (in case someone else who needs this right now is stumbling upon this issue), I've created a small macro that adds the location of errors:

macro_rules! add_err_loc {
    () => {
        |err| {
            let line = line!();
            let column = column!();
            let msg = if let (Some(line2), Some(column2)) = (err.line(), err.column()) {
                format!("EdgeDB query error – at {line2}:{column2} of the query, and at {line}:{column} of the Rust src code.")
            } else {
                format!("EdgeDB query error – at {line}:{column} of the Rust src code.")
            };
            eyre::Report::new(err).wrap_err(msg)
        }
    };
}

...which can be used like this:

db.query::<String, _>("SELECT 'hello'", &())
    .await
    .map_err(add_err_loc!())?;

This also adds the Rust source code location, which seems to get lost when transactions are used, which is the reason why I've implemented this as a macro.

It's worrying me a bit that there is a Python traceback within the error (that it is there at all, and that the query parsing code fails with a traceback).

Are there any plans to replace that Python code with Rust?

Performance should be good enough with caching, but I'm pretty worried about correctness, which is why I think Rust would be a far better option for such code (code, which seems to handle potential user-supplied input, and therefore could lead to security vulnerabilities).

Rust also has fantastic libraries for displaying errors in source code, that would make it relatively easy to display errors similar to the Rust compiler:

Rust most likely also has better tools for type checking (for example RustTyC, which seems to make it easy to implement a Hindney-Milner-like type system)

It's worrying me a bit that there is a Python traceback within the error (that it is there at all, and that the query parsing code fails with a traceback).

I think Python tracebacks should be only available in dev mode of the server. If not, consider reporting error to the EdgeDB Server repository.

Are there any plans to replace that Python code with Rust?

Yes. Although, you should know that they come from server. No Python interpreter is running in the client.

Rust also has fantastic libraries for displaying errors in source code

Yes. We are very much aware. We use codespan-reporting currently in the CLI. So you can look at the CLI code to learn how to format your errors (although, you have figured out most of that).

I think we will probably migrate CLI to the miette, as it's a bit more convenient and powerful. But for the bindings there are a couple of issues:

Query source code's lifetime should extend formatting code, while we only borrow it for the duration of the method call, while error formatting usually comes later. We have a number of options to solve this, each with their own trade offs.
We probably have to pick a single error formatting library, and that will be unfortunate for the users of the other libraries. Or we might need to support multiple libraries via feature flags.
miette (current favorite) is a bit unstable, releasing a new major version every 3-6 months. Which is quite inconvenient if you need anything besides the default formatting (i.e. if you need to use miette's traits), and also every major update of miette needs major release of edgedb's bindings.

So we are exploring this design space, but there is no obviously great solutions. Your input might be valuable on some of the issues above.

(To make it clear: we don't thing formatting errors with code snippets is the task for the server, so rewriting (parts of) the server in Rust is of no help here)

Thanks for your response, @tailhook!

Sorry for the long message. I spend more than one hour on it, and I believe what I wrote is relevant.

Are there any plans to replace that Python code with Rust?

Yes.

Awesome!

Is there any estimate, when this probably will be ready? Like in "in 6 months / 1 year / 2 years", etc.?

(To make it clear: we don't thing formatting errors with code snippets is the task for the server, so rewriting (parts of) the server in Rust is of no help here)

You are right.

I'd implement a "base client" in Rust (which handles networking, protocol, validation, error formatting, etc.), and then implement clients for other languages on top of that. That's what (the just released) SurrealDB does (which is written in Rust).

Query source code's lifetime should extend formatting code, while we only borrow it for the duration of the method call, while error formatting usually comes later. We have a number of options to solve this, each with their own trade offs.

I'd just clone the source code on errors. Errors should not occur so often that this is likely to be a problem. And in libraries, nice error formatting should anyway be behind a feature flag. Ideally, this could be activated in Debug mode-only.

We probably have to pick a single error formatting library, and that will be unfortunate for the users of the other libraries. Or we might need to support multiple libraries via feature flags.

You are right. If the source code part of the error is formatted on the client, I'd just implement it via your preference (i.e. miette) behind a feature flag, and make it easy for users to use other options.

The most important thing is, that all important information is always visible when errors are printed to the terminal (i.e. line number, column number, etc.). Nicely formatted error messages are not really an absolute requirement (especially, in a library).

miette (current favorite) is a bit unstable, releasing a new major version every 3-6 months. Which is quite inconvenient if you need anything besides the default formatting (i.e. if you need to use miette's traits), and also every major update of miette needs major release of edgedb's bindings.

If error formatting is behind a feature flag, users who want more (or use a different version), could keep the feature flag disabled, and implement everything themselves, right?

After opening this issue, I have realized that errors returned by EdgeDB in general are often pretty bad (feels like 50% of all errors are terrible. To be fair, the diet I just started might have influenced my perception a bit... ;)).

For example, this:

edgedb query 'with foo := <uuid>$0, bar := <str>$1 select (foo, bar)'

Results in DescriptorMismatch: query arguments expected, which is easily understandable.

But if you provide $0, but forget $1, you'll get DescriptorMismatch: expected 1 fields, got 2, which is just terrible.

(Btw. a way to provide query arguments seems to be missing from edgedb query. This would be especially useful for testing and debugging queries that are stored in files).

Even with my (updated) macro, we only get the following information:

Error: Error while storing the data in EdgeDB.

Caused by:
   0: UserError
   1: EdgeDB query error – at 763:22:src/api/sync.rs
        Kind name: DescriptorMismatch
        Initial message: expected 1 fields, got 2
        Code: 4278386176
        Contexts:
        Chain: Error(
          Inner {
              code: 4278386176,
              messages: [
                  "expected 1 fields, got 2",
              ],
              error: None,
              headers: {},
          },
      )

   2: DescriptorMismatch: expected 1 fields, got 2

Location:
    src/api/sync.rs:563:5

Obviously, the error is easy enough to interpret with the above demo query. But if your query is 100 lines long – and composition is one of the major strengths of EdgeDB – then it becomes much less clear what the problem is.

A similar error, btw., is returned when there are too many query arguments: DescriptorMismatch: expected 5 fields, got 2.

My question is: Is that a known issue?

I did a quick search, and I found a few issues in regard to specific error messages, not in regard to generally improving the quality of error messages.

I'm asking this, because I'm sure that you are losing users because of these bad error messages.

For example, my experience on the first day of writing/debugging queries was so bad, that I seriously thought about switching to a different database (after investing a lot of time learning EdgeDB). I only didn't do it, because I don't know any usable alternative to EdgeDB (for what I intend to do, EdgeDB seems to be the best option).

Although, you should know that they come from server. No Python interpreter is running in the client.

Yes, that was clear. A Python interpreter embedded into the client would have been a bit too much for me. Even Python in the server made me straight out reject EdgeDB, at first... ;)

I think Python tracebacks should be only available in dev mode of the server. If not, consider reporting error to the EdgeDB Server repository.

I'm using an EdgeDB project setup. That would activate the dev mode, right?

I'd expect Python tracebacks only in a dev mode that is for developing EdgeDB itself, not in a dev mode for EdgeDB end users (or at least, they should not be printed to the terminal, by default).

I'm probably not the only one on whom this has a negative effect.

Besides that, I was astonished that there was a traceback at all (i.e. that my query failed with a traceback), instead of being handles gracefully.

However, later I realized/remembered, that I haven't used Python significantly for at least 10 years, and that error handling is pretty different in Python and Rust (which I used mostly, for the past three years).

So tracebacks (in software that is intended for production) might not be as bad as I initially feared (but I'm not 100% sure).

I also realized, that not every error contains a traceback.

Overall, thanks for the feedback! I agree on most of the sentiments here.

Couple of specific comments:

I'd implement a "base client" in Rust (which handles networking, protocol, validation, error formatting, etc.), and then implement clients for other languages on top of that. That's what (the just released) SurrealDB does (which is written in Rust).

Unfortunately, Rust is not as ubiquitous as C compiler. So this will make most bindings harder to install. Also implementing protocol in Rust doesn't save you from converting binary types to something more friendly in the language (i.e. Rust-encoded datetimes to native datetimes). So this approach doesn't save significant effort, and make it harder to install for users.

If error formatting is behind a feature flag, users who want more (or use a different version), could keep the feature flag disabled, and implement everything themselves, right?

Yes, probably. But we have to make sure it's easy enough, so that they don't have to wrap every trait with every query method (i.e. we clone query text and place it in the query).

I did a quick search, and I found a few issues in regard to specific error messages, not in regard to generally improving the quality of error messages.

Yes, I don't think we have some strategy for that. Please open issues on specific error messages in EdgeDB repo. We might be able to generalize some of them and think about strategy based on specific examples.

I'm using an EdgeDB project setup. That would activate the dev mode, right?

No it shouldn't. Dev mode usually works only when you run from git checkout of the source code.

Please open issues on specific error messages in EdgeDB repo. We might be able to generalize some of them and think about strategy based on specific examples.

Alright, I will just do that then 👍

No it shouldn't. Dev mode usually works only when you run from git checkout of the source code.

I've installed EdgeDB via the CLI – so there should never be a Python traceback anywhere, correct?

Okay, then I'll open an issue as soon as I get such an error again. Unfortunately, I don't remember what exactly triggered that particular error.

Done some minimal approach to error formatting in #174. Please take a look if you wish.

Thanks, @tailhook, looks good!

This doesn't add any information (EdgeQL source code line number, etc.) to errors if miette isn't used, right?

Fancy error messages are nice, but my main concern is the missing essential information from the default representations of errors.

I tried to test #174, but after several hours I'm giving up for now. Even after switching my application to miette, I couldn't get it to work.

Working with transactions and the required error type is really a major pain point for me. I spent already many hours trying to get stuff to work with that part of the client... ;)

After switching from eyre to miette the following:

    let transaction = |tx| {
        let data = data.clone();
        async move {
            // `store_data_transaction` returns `miette::Result<()>`
            store_data_transaction(tx, data).await?;
            Ok(())
        }
    };

...suddenly fails with the trait boundErrReport: AsRef<(dyn StdError + std::marker::Send + std::marker::Sync + 'static)>is not satisfied.

This seems to be the last issue to get fancy error messages working, but I can't spend more time on it, at the moment. For now, I'll just continue to use my small, ugly macro (which works well enough).

But it would be really great, if the 'transaction' workflow somehow could be improved.

For example, I don't get why transaction doesn't require a closure that returns an error type that has a type parameter, which represents the user error (e.g. edgedb_tokio::TransactionError<UserError> with one variant for edgedb_tokio::Error and one variant for UserError). Maybe I'm missing something, but that seems more user-friendly to me than how the error type looks now.

That the whole transaction closure might get executed repeatably (instead of just retrying the failed database actions) also looks pretty odd to me. This makes some use cases quite unergonomic to implement.

For example, I'm synchronizing data from a remote API with an (EdgeDB) database. If there is any error, I want to discard the whole update in progress. With the current implementation of transactions, I have to retrieve all data from the API before I can start the EdgeDB transaction. If the API data is too big, I have to write it to the hard drive, etc.

Thanks, @tailhook, looks good!

Fancy error messages are nice, but my main concern is the missing essential information from the default representations of errors.

Added to the PR. Although, it's not very convenient

I tried to test #174, but after several hours I'm giving up for now. Even after switching my application to miette, I couldn't get it to work.

Working with transactions and the required error type is really a major pain point for me. I spent already many hours trying to get stuff to work with that part of the client... ;)

After switching from eyre to miette the following:
    let transaction = |tx| {
        let data = data.clone();
        async move {
            // `store_data_transaction` returns `miette::Result<()>`
            store_data_transaction(tx, data).await?;
            Ok(())
        }
    };
...suddenly fails with the trait boundErrReport: AsRef<(dyn StdError + std::marker::Send + std::marker::Sync + 'static)>is not satisfied.

I think everything returning miette/anyhow/eyre error should be wrapped into .map_err(UserError::with_source_ref). But this has to be done inside the store_data_transaction function and applied to external errors only. Edgedb errors must be passed as is (not boxed inside eyre/miette/Box containers). Otherwise retries will not work (i.e. transaction code will treat those errors as opaque user errors).

But it would be really great, if the 'transaction' workflow somehow could be improved.

Yeah. I did not pay attention to the transactions in this iteration. I'll try to play with it a little.

For example, I don't get why transaction doesn't require a closure that returns an error type that has a type parameter, which represents the user error (e.g. edgedb_tokio::TransactionError<UserError> with one variant for edgedb_tokio::Error and one variant for UserError). Maybe I'm missing something, but that seems more user-friendly to me than how the error type looks now.

The problem with it is that we can't make question-mark ? operator work on both user errors and edgedb errors equally well (see this playground ). So you'll have to do .map_err(..) for one or another. And that is no different from using .map_err(UserError::with_source...).

That the whole transaction closure might get executed repeatably (instead of just retrying the failed database actions) also looks pretty odd to me. This makes some use cases quite unergonomic to implement.

For example, I'm synchronizing data from a remote API with an (EdgeDB) database. If there is any error, I want to discard the whole update in progress. With the current implementation of transactions, I have to retrieve all data from the API before I can start the EdgeDB transaction. If the API data is too big, I have to write it to the hard drive, etc.

There are a couple of design considerations:

When transaction retries there might be some different data in the database. Say you do SELECT User.money and then check if there is enough money and do the actual withdraw. If we just retry actions, the new value of money will never be checked.
Big transactions should be discouraged. If your API data is too long and doesn't fit memory, it's also probably too huge to fit into a transaction too. Transactions aren't memory-only, but they are blocking all the data they touch from concurrent access (simplified explanation). Also if we'd want to "retry actions" we would likely have to keep that data in-memory for retry anyways.
Fetching external APIs inside the transaction is discouraged for the same reason: concurrent access to the transaction data might be a problem during that API access if it's slow.
On the other hand, if you absolutely have to access external API during the transaction, it may exhibit the same behavior as in (1), so it has to be fetched again otherwise code becomes incorrect.

So generally it's advised to split transactions into several if external (potentially slow) API has to be used.

If there is any error, I want to discard the whole update in progress

That said, you can turn off retries, if this what you meant.

Thanks for the feedback! I'll try to take a look on errors in transactions and get a follow up.

I think everything returning miette/anyhow/eyre error should be wrapped into .map_err(UserError::with_source_ref). But this has to be done inside the store_data_transaction function and applied to external errors only. Edgedb errors must be passed as is (not boxed inside eyre/miette/Box containers). Otherwise retries will not work (i.e. transaction code will treat those errors as opaque user errors).

Ah, I tried .map_err(UserError::with_source_ref), but outside of store_data_transaction (wrapping miette::Report). With eyre and my macro I just returned Result<(), eyre::Report> from store_data_transaction, which worked via ? (I realize that this isn't ideal, but I don't really care about retries).

I changed store_data_transaction to returning edgedb_tokio::Error, which fixes the issue.

Added to the PR. Although, it's not very convenient

Thanks!

Why do you think, this isn't very convenient? This definitely seems more convenient than the previous state.

Btw., it seems you added line/column number and .hint() – is that the only important information? Because there are other methods, like .details(), and so on.

Yeah. I did not pay attention to the transactions in this iteration. I'll try to play with it a little.

Sounds great!

Maybe it would make sense, to add RAII guard-based transactions, in addition to the current closure-base option (similar to what sqlx does)?

For example, wouldn't something like the following work?

let tx = db.begin_transaction().await?;

loop {
    match do_stuff(&tx).await {
        Ok(r) => break r,
        Err(e) if tx.failed() => return Err(e),
        Err(_) => continue,
    }
}

If I'm not missing something, I definitely would prefer such an option. Closures often make things more difficult in Rust. This makes it also pretty clear, that the transaction code is repeated (without reading the documentation of edgedb_tokio::Client::transaction).

The problem with it is that we can't make question-mark ? operator work on both user errors and edgedb errors equally well (see this playground ). So you'll have to do .map_err(..) for one or another. And that is no different from using .map_err(UserError::with_source...).

Yes, in that regard, what I propose wouldn't be better or worse.

The advantages would be, however:

The user error can be defined via a type parameter
So no downcasting is required – or whatever has to be done to get a UserError out of a edgedb_tokio::Error – even after looking at the documentation I don't know how to do that
The documentation is easier to understand. Do you remember, that I have asked if it is even possible to return user defined errors from transactions? Please go to Client::transaction, and see what a new user has to do to figure out if that is possible. Even after clicking on Error there is no indicator that this is possible. And then, even if you figure out that the ErrorKind trait might have something to do with user defined errors, you have to scroll to the list of implementations and skip 50 implementations until you reach `UserError at the very bottom of the page :)
In general, everything seems much simple and easier to understand to me

There are a couple of design considerations: [Regarding transaction retries]

Sure, this is all correct and true, and the right way to do it in most cases. My use case is probably not as common as I somehow had thought. Basically, I'm able to "stop the world", and then update the database. So there isn't any other client that could make problems (and the transaction in question is fortunately pretty small).

When transaction retries there might be some different data in the database.

So why does sqlx, for example, not do the same thing that edgedb_tokio does?

I'm not remotely a database expert (normally, I only do extremely simple stuff with databases), so I'm not sure why there seems to be this difference between edgedb_tokio and sqlx.

If there is any error, I want to discard the whole update in progress

That said, you can turn off retries, if this what you meant.

What I meant, was, that the data synchronization should be atomic. So I don't want to end up with a dataset were, for example, only the first 80 percent was updated (which could happen, if I start several transactions).

Okay, I've experimented with error in transactions a bit.

This impl kinda works for anyhow, making it's easier to convert anyhow error into EdgeDB error:

impl<T> From<T> for Error
    where T: AsRef<dyn StdError + Send + Sync + 'static>
             + Send + Sync + 'static,

And it looks like using enum creates more issues than it solves. Feel free to experiment with the PR: https://github.com/edgedb/edgedb-rust/pull/208#issuecomment-1410405823

So nice miette printed errors are there. Feel free to open more errors on specific issues with this kind of error reporting.

edgedb / edgedb-rust

Unhelpful error messages (which contain Python tracebacks) #171