hometown-fork / hometown

A supported fork of Mastodon that provides local posting and a wider range of content types.
GNU Affero General Public License v3.0

Allow optional Markdown and/or ReStructuredText interpolation #1274

Open fogti opened 1 year ago

fogti commented 1 year ago

Pitch

Web-Interface support for a choice (similar to the choice for post visibility or post federation) to render a post via a Markdown->HTML or ReStructuredText->HTML transformation (support for quotes, headings, footnotes and links would suffice for me). HTML sanitizing should happen as usual (to protect against XSS, mostly). Support for embedded HTML itself is not necessary.

Motivation

although I'm assuming that this can already be achieved using the REST API, it would be pretty useful to have some bare-bones rendering in the web client, e.g. for properly stylized quotes (which would also be more readable) and subheadings...

dariusk commented 1 year ago

I'm not clear on what you're asking -- you are talking about "rendering" but it seems like you really mean writing? (We already render incoming rich text.)

Is the idea to write markdown (or ReStructuredText, which I have never heard of until this post) in the compose box and have it federate out as formatted text (really embedded HTML) like the glitch-soc fork does?

fogti commented 1 year ago

> Is the idea to write markdown (or ReStructuredText, which I have never heard of until this post) in the compose box and have it federate out as formatted text (really embedded HTML) like the glitch-soc fork does?

yes

dariusk commented 1 year ago

I get asked to implement this a lot. I always have the following reservations. I also include a section at the end about moving forward in this conversation.

I'm worried about how markdown will get rendered on servers that don't support it.

The way most Markdown is federated, it is actually translated into HTML and federated as HTML, since that is what most servers will parse. So what happens is this:

Unlike the standard argument about markdown, where the failure case still carries a semblance of original semantic meaning, the post will be stripped of some semantic meaning upon receipt, which could result in communication misunderstandings due to a mismatch in formatting! Definitely something we don't want.
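To make the failure mode concrete, here is a minimal sketch of a receiving server that keeps only an allowlist of tags. The allowlist and the `sanitize` helper are hypothetical and chosen for illustration; they are not the real list or code of Mastodon or any other server.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist for illustration only; real servers use their own lists.
ALLOWED_TAGS = {"p", "br"}


class AllowlistSanitizer(HTMLParser):
    """Drop every tag outside ALLOWED_TAGS but keep the text inside it."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_TAGS:
            self.parts.append(f"<{tag}>")  # attributes are dropped for brevity

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS and tag != "br":
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # Re-escape text so stray angle brackets cannot become markup.
        self.parts.append(escape(data, quote=False))


def sanitize(html_text: str) -> str:
    parser = AllowlistSanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.parts)


# The sender struck out "love" and emphasized "hate".
sent = "<p>I <del>love</del> <em>hate</em> Mondays</p>"
print(sanitize(sent))  # the strikethrough and emphasis are gone
```

With this allowlist the reader sees "I love hate Mondays": the words survive, but the formatting that told them apart does not.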

Now you could say this is a bug with Mastodon that they should fix (Hometown already parses and renders basic stuff like <em>) but I can't guarantee they would. I can really only control what Hometown servers send and receive.

I am concerned about usability

If I support markdown I would want a markdown editor in the compose box. Like highlight and toggle for bold, etc. The way it is supported in servers that I am aware of is basically "learn markdown or deal with not being able to write it". There do exist markdown compose box widget libraries. The problem with these is even the simplest ones are a god damned nightmare to make usable and maintainable, and they have tons of edge cases of their own, which is why so many implementors just say "text is good enough and people need to learn markdown". Then there is the problem of where I would fit this menu on mobile.

Also would we want an optional preview pane? How would we do that? It's not impossible but it's a really big change to make.

This is assuming we do what is suggested in this issue and make a toggle to enable markdown posts "similar to the choice for post visibility or post federation". That's Yet Another Toggle, and in other issues people are already explaining that there are too many of those (see #907). If we go with the more common "well just parse markdown if someone writes markdown", then I wonder what happens if someone doesn't even know what markdown is, and they make a post where they intend to have, say, underscores and asterisks behave as normal text, like they are making ASCII art or something? This would result in a post that would be really confusing for the user.

These are solvable problems, but they require both a large initial engineering and design investment, and also are going to add a lot of surface area to the code for things to go wrong and that will require maintenance from version to version. Even if a kind contributor makes a pull request to Hometown with some or all of this stuff implemented, I am still on the hook to maintain these major changes forever.

Moving forward

I am very open to suggestions that would reduce or limit scope of the feature (for example, asking to support a small subset of markdown).

I am open to design suggestions. One idea I had that could make this better is to make it a "power user" feature that someone has to turn on, like the Advanced Web Interface. So maybe it's a checkbox in settings where a user says "allow me to use Markdown in my posts" and then we just... parse what they write, godspeed dear user, just like glitch-soc does.
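A minimal sketch of that opt-in idea, assuming a hypothetical per-user `markdown_enabled` setting and a toy subset that only understands `*emphasis*`. Neither the setting name nor the regex comes from Hometown or glitch-soc.

```python
import re
from html import escape

# Toy subset: *text* with no whitespace directly inside the asterisks.
EMPHASIS = re.compile(r"\*(\S(?:[^*]*\S)?)\*")


def render_post(text: str, markdown_enabled: bool) -> str:
    """Return the HTML for a post.

    markdown_enabled stands in for a hypothetical user preference;
    when it is off, asterisks are left exactly as typed.
    """
    safe = escape(text, quote=False)  # always escape before adding markup
    if markdown_enabled:
        safe = EMPHASIS.sub(r"<em>\1</em>", safe)
    return f"<p>{safe}</p>"


post = "I *really* like strawberries!"
print(render_post(post, markdown_enabled=True))
print(render_post(post, markdown_enabled=False))
```

Users who never tick the box keep today's behavior, so ASCII art with asterisks and underscores is untouched. Only users who opted in get their text parsed.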

I am open to federation suggestions. No amount of UI design would fix the semantic loss when these messages arrive at a Mastodon server. There are some interesting options around mediaType in ActivityPub -- we could federate raw markdown and then hopefully servers that don't read it would just render the markdown? That is still pretty damn weird though, like this is valid markdown that looks good and renders the numbers 1, 2, 3 in order on the list:

1. one
1. two
1. three

But then it would "degenerate gracefully" to three 1's in a row. Maybe an edge case, but I guarantee you an edge case that I will have users asking me to fix!
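The mismatch can be sketched in a few lines. The function names are hypothetical. The renumbering rule follows CommonMark's behavior of taking the start number from the first item and ignoring the rest; other Markdown flavors may differ.

```python
import re

# One ordered-list item: a number, a dot, whitespace, then the item text.
ITEM = re.compile(r"^(\d+)\.\s+(.*)$")


def as_rendered(source: str) -> list:
    """What a Markdown-aware server shows: items numbered sequentially."""
    matches = [ITEM.match(line) for line in source.splitlines()]
    matches = [m for m in matches if m]
    if not matches:
        return []
    start = int(matches[0].group(1))  # only the first number matters
    return [f"{start + i}. {m.group(2)}" for i, m in enumerate(matches)]


def as_plain_text(source: str) -> list:
    """What a server shows if it displays the raw source untouched."""
    return source.splitlines()


source = "1. one\n1. two\n1. three"
print(as_rendered(source))
print(as_plain_text(source))
```

The first call yields items numbered 1, 2, 3. The second keeps the three literal `1.` markers, which is what a reader on a non-Markdown server would see.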

ANYWAY. I hope to show with all the above that implementing Markdown, and rich text in general, is not actually a very easy thing. Write.as manages to do it but they pay the price by having their rich text posts essentially not rendered at all by Mastodon! You end up having to click a link through instead, nothing renders in feeds. And of course they have the advantage on the UI side of having a full blogging software interface. (We do render their posts nicely here on Hometown.)

aredridel commented 1 year ago

I can only agree; it's really difficult. I honestly think sending the raw markdown as plain text is one of the best options, with HTML in a separate field for clients that support it. It's bulky and bad, but preserves a plain text that's meaningful, if ugly, and uses HTML as the lingua franca of the web that it is. It leaves the nasty attack surface of sanitizing HTML, sadly, but that is unfortunately the stakes. It also lets the semantics be roughly fixed at posting time, whereas markdown interpretation is prone to divergence and evolution over time, and HTML is much more carefully backward compatible, if not always prettily rendered.

It's such a mess to do and do well.

WesleyAC commented 1 year ago

I personally support having an option buried in the user preferences somewhere that explains these problems and then just parses whatever text people type in as markdown and federates html (or if you feel like being fancy, adds a dropdown by the compose box for selecting text/markdown/html, like glitch-soc).

IMO federating markdown is actually even more of an interoperability nightmare than federating html, since different parsers handle edge cases quite differently, and there is no markdown specification. This isn't just a theoretical issue, it's something anyone who regularly tries to render the same text in multiple markdown parsers will run into. With html, a server at least has the ability to render it as is and display a message to the user if there's something unsupported (even if that's not the case presently), whereas if the ecosystem goes down the path of federating markdown, there's really no future path towards standardization.

I do appreciate that this is hard and all of the solutions are pretty bad, though IMO it's worth keeping in mind that the status quo of not allowing users to write semantic emphasis/lists/links etc is bad as well.

aschrijver commented 1 year ago

Just wanted to mention GoToSocial's markdown support, and - given multiple apps considering / supporting markdown - that a Fediverse Enhancement Proposal may be written, to ensure everyone does it more or less in similar ways. That would also be an incentive to adopt, and not look overly much at Mastodon as the elephant in the room. Mastodon may get the same incentive to adopt, once multiple apps are over the bridge.

dariusk commented 1 year ago

@aschrijver An FEP seems like the way to go!

I'd be curious to see how GoToSocial federates that stuff, as it's not mentioned in the document you link. Guess I'll poke at the code.

dariusk commented 1 year ago

Someone pointed out to me that the ActivityPub source property is likely the way to federate this stuff:

https://www.w3.org/TR/activitypub/#source-property

{
  "@context": ["https://www.w3.org/ns/activitystreams",
               {"@language": "en"}],
  "type": "Note",
  "id": "http://postparty.example/p/2415",
  "content": "<p>I <em>really</em> like strawberries!</p>",
  "source": {
    "content": "I *really* like strawberries!",
    "mediaType": "text/markdown"}
}
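
A minimal sketch of how a sending server might assemble such a Note, and how a receiver could recover the original markup. The helper names `build_note` and `original_markup` are hypothetical. Per the ActivityPub spec, `source` is there for provenance and later editing, so a receiver that ignores it simply shows `content`.

```python
import json


def build_note(note_id: str, html_content: str, markdown_source: str) -> dict:
    """Pair the rendered HTML with the Markdown it was derived from."""
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Note",
        "id": note_id,
        "content": html_content,  # what every server can display
        "source": {
            "content": markdown_source,  # what the author actually typed
            "mediaType": "text/markdown",
        },
    }


def original_markup(note: dict, media_type: str = "text/markdown"):
    """Return the source text if it is present in the given media type, else None."""
    source = note.get("source") or {}
    if source.get("mediaType") == media_type:
        return source.get("content")
    return None


note = build_note(
    "http://postparty.example/p/2415",
    "<p>I <em>really</em> like strawberries!</p>",
    "I *really* like strawberries!",
)
print(json.dumps(note, indent=2))
print(original_markup(note))
```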

fogti commented 1 year ago

the fact that the HTML needs to be sanitized is well-known (mostly to prevent XSS); and that part ("what if the other side mangles the HTML of the post") sounds out-of-scope (although it might be appropriate to do a FEP about it), as ActivityPub to me appears to be specifically designed to deal with the problem of "the other side can't possibly know all possible input formats and how to render them" by simply being like "well, thus, we always attach the HTML rendering of the source". If the other side mangles the thus-attached HTML, that is simply a problem of the other side (although I think there should be a set of HTML tags which are designated "safe" and won't be mangled by most implementations).

cblgh commented 1 year ago

great writeup on the intricacies @dariusk (in particular the loss of context!)

one alternative, which i have seen others go for, could be to not support markdown but instead gemtext. it is basically a tweaked subset of markdown, and seems like it could fit mastodon's posting context a lot better than markdown itself

marrus-sh commented 1 year ago

comments based on my experience with glitchsoc markdown, etc :—

nachtjasmin commented 11 months ago

> The problem of upstream Mastodon only supporting a limited subset of HTML is not something you’re going to get fixed, because Eugen wants the Mastodon user experience to be like Twitter in its plaintextiness. (I mean, anything could happen, but I’m not optimistic.) This definitely introduces problems with things like Markdown strikethroughs which don’t render as strikethroughs on vanilla Mastodon servers. It also causes problems in clients (like Toot!) which also don’t support additional HTML elements.

Just as a small note: With Mastodon v4.2-beta1 and newer versions, Mastodon supports a larger HTML subset (see https://github.com/mastodon/mastodon/pull/23913). So, at least this problem is solving itself in the near future.