RFC: Should `markdown` raise custom exceptions?

der-gabe commented 2 years ago

I recently reviewed some code like this:

try:
    markdown.markdown(…)
except:
    return 'Rendering failed!'

… and I was not happy with the overly broad exception clause.

I decided to make a concrete suggestion for improvement by at least catching the markdown exception base class when I discovered that none exists and that markdown raises very generic Python exceptions, such as TypeError, ValueError etc.

I was surprised by this, as I'm used to Python libraries using their own custom exception hierarchies and not doing so encourages users to write code like the one above.

So I wanted to suggest introducing a custom exception hierarchy for markdown.

I'd volunteer to make the changes, as they would be rather "mechanical" and do not require deep familiarity with the code, but I wanted to trigger a little discussion as to the merits of this idea, before doing all the work and it opssibly getting rejected.

waylan commented 2 years ago

First of all, Markdown actually raises very few exceptions at all, and all of the exceptions we do raise fit into the already existing basic types provided by Python.

Second, the goal is that no exceptions are ever raised during parsing of the Markdown source. As a reminder, Markdown parsing is often done by web servers and an error raised by some badly formed Markdown syntax could crash the server (depending on how Python is called from the server). Therefore, bad syntax should never result in an error.

This can best be illustrated by using the Markdown class:

md = markdown.Markdown(**options)
html = md. convert(src)

The first line may raise exceptions. This is most often the result of extensions failing to load properly. The second line should never raise any exceptions. Instead it should fail silently and pass on incomplete/erroneous HTML which will clue the user into the fact that they have bad Markdown.

The only way your proposal makes sense is if all extensions (including the long list of existing third party extensions) used the same custom exceptions. And based on past experience, that is not reasonable to expect. Therefore, I see no value in it.

But shouldn't the user catch unexpected exceptions on the convert line? Yes, absolutely; especially if they want to protect their server from crashing. But those exceptions would not be something that we are explicitly raising. They are unexpected after all. Therefore, a custom exception type has no value there.

der-gabe commented 2 years ago

First of all, Markdown actually raises very few exceptions at all, and all of the exceptions we do raise fit into the already existing basic types provided by Python.

That may well be so, but it does not seem to be documented which exceptions (if any) one might expect to be raised from any given call, which makes handling them harder.

A custom base exception class could help here, but so - of course - could adding to the docs.

Otherwise, users are kind of pushed in a direction of very broad exception clauses and silencing their linters about it.

(It is true, from what I have seen, that it is much more common for markdown to re-raise exceptions it encounters anyway ("along the way", as it were, such as here, here, here and here, in the core module) than it is for it to raise its own exceptions (e.g. here and here in the same file), but a custom class could be used in both cases.)

Second, the goal is that no exceptions are ever raised during parsing of the Markdown source. [...]

I understand this (now) and totally agree that it's the right way to go.

Yet, unexpected circumstances can and do happen and exceptions do get raised for them.

The only way your proposal makes sense is if all extensions (including the long list of existing third party extensions) used the same custom exceptions. And based on past experience, that is not reasonable to expect.

This is undoubtedly true and I hadn't thought about that. I suppose there is no way for markdown to catch exceptions raised from extensions and wrap them in its own custom class?

But shouldn't the user catch unexpected exceptions on the convert line? Yes, absolutely; especially if they want to protect their server from crashing. But those exceptions would not be something that we are explicitly raising. They are unexpected after all. Therefore, a custom exception type has no value there.

So if I understand you correctly, your deeper argument is that the user should not expect any particular exception to be raised by markdown, so it wouldn't make sense to have custom ones, as no special handling or recovery is possible in these rare cases.

Rather, users should treat all exceptions from markdown as unexpected and handle them in a generic way (only to avoid crashing the process/server). Is that right?

waylan commented 2 years ago

I just don't see the value. Every instance where we raise/reraise an exception, the exception type is exactly on-point for that case. For example, if the user passes in an unknown extension name, then they will get as AttributeError, if the user attempts to load some non-extension class as an extension they will get a TypeError. No surprises there. As a user myself, I would much prefer that to a generic MarkdownException which relays no information about what I did wrong. Yes, we could create a whole bunch of different custom exception types for each scenario, but most of them would only ever be used once.

I think the correct way to address this is with documentation. However, I think that should probably go in the (currently non-existent) API docs. See #1220 for details.

der-gabe commented 2 years ago

Personally, I don't think single-use custom exceptions are a bad thing, but I get where you're coming from.

Either way, though, documentation is key here. You're right about that.

Personally, I might add it to the Details section of the regular Library Reference, but whatever: as long as it's there, it can be found by search.

I had a quick glance at the other PR and if I understand correctly then that would imply adding which method may raise what exception to the relevant docstrings, which should then get auto-collected.

I'll keep an eye on that one. Feel free to close this PR, as you see fit.

Python-Markdown / markdown

RFC: Should `markdown` raise custom exceptions? #1295