nextstrain / augur

Pipeline components for real-time phylodynamic analysis
https://docs.nextstrain.org/projects/augur/
GNU Affero General Public License v3.0
268 stars 129 forks source link

Add a debug mode to print stack trace for exceptions caught by top-level handler #1308

Closed corneliusroemer closed 4 weeks ago

corneliusroemer commented 1 year ago

I got this error from refine that I struggled to debug:

$ augur refine --tree builds/tree_raw.nwk --alignment test.fasta  --output-tree builds/tree.nwk
...
ERROR: SequenceData: loading alignment failed... builds/masked.fasta

ERROR from TreeTime: This error is most likely due to a problem with your input data.
Please check your input data and try again. If you continue to have problems, please open a new issue including
the original command and the error above:  <https://github.com/nextstrain/augur/issues/new/choose> 
I didn't know why the alignment couldn't be loaded.

Turns out, my sequences weren't aligned, they had different lengths. I'll try to improve treetime's error message, see https://github.com/neherlab/treetime/issues/258.

But while debugging and trying to find out what went wrong, I got quite frustrated by how hard it was to get a stacktrace of how the error originated. (Maybe @tsibley has some clues on how one can get a stacktrace here in this case without having to change any code, maybe some ENV variable trick?)

I think one problem is that we just blanket wrap all treetime errors and discard the stacktrace. Maybe we print the stack trace behind a verbosity flag here, i.e. add traceback.print_exc(file=sys.stderr) but only when -v is passed. https://github.com/nextstrain/augur/blob/854dc7de9f99809e6187fbe79f9cf481cc0cd607/augur/__init__.py#L76-L92

ivan-aksamentov commented 1 year ago

One nice way is to use exception chaining.

try:
  some_code()
except ConcreteException as e:
  raise FinalError(f"There was a concrete error and here is some recommendations: ...") from e

(notice raise ... from ...)

I think this is the recommended way to express causality of errors in Python nowadays. The default handler then prints all errors in the chain, with stacks, so there's no need for handcrafted solutions.

Perhaps the default handler printout not as nice, but then the custom handler can also be augmented with understanding chains.

I am not sure when it was introduced though and so what's the minimum Python version requirement.

tsibley commented 1 year ago

Yes, exceptions should be chained (except in rare cases), and the default handler should know how to emit the full chain with all the gory details when a debug or verbosity flag is set. This is what Nextstrain CLI does, for example.

In a case where you couldn't (or didn't want to) just modify the source, you could use the debugger.

victorlin commented 1 year ago

It sounds like the missing piece here is a debug/verbosity flag. Right now the stack trace is only printed on unhandled exceptions because we definitely want that in an issue report. With errors such as AugurError which are raised internally, the message is clear and doesn't need a stack trace to go with it, at least without a --debug flag.

With TreeTimeError it seems like both unhandled exceptions and expected errors can be represented. The default behavior can be no stack trace, which is printed with --debug. The "report an issue" text can be modified to something like:

please open a new issue including the original command and the full error when run with --debug.

tsibley commented 1 year ago

@victorlin unfinished comment?

I'd recommend the debug flag live in the environment rather than a command-line option, as it makes it much easier to set in a wide variety of contexts (e.g. places you can't easily control the command-line invocation) and doesn't have issues of not applying before option parsing.

victorlin commented 1 year ago

@tsibley oops, fixed!

That sounds reasonable. I'd think to basically mirror what you did in https://github.com/nextstrain/cli/pull/208.

corneliusroemer commented 1 year ago

@victorlin There are two types of tree time error: TreeTimeError and TreeTimeUnknownError - we might want to show stack trace in one by default, depends on the semantics. Exposing stack trace with env variable sounds good, would be great to report this in the error message (for more info, rerun with AUGURDEBUG=true augur refine ...) as you've suggested