element-hq / element-web

A glossy Matrix collaboration client for the web.
https://element.io
GNU Affero General Public License v3.0

Support MathJax for chat messages! #1945

Closed ghost closed 3 years ago

ghost commented 8 years ago

This is a feature request, which I'll admit is probably not too important for most users.

I think it would be really amazing if text in between $dollar signs$ could be rendered as inline math formulas using MathJax, similar to what you see on some of the stackexchange sites. It's also common to use $$double dollar signs$$ to denote full equation blocks.

This may be an easy addition, seeing as markdown is already supported. With names like "matrix" and "vector", there's bound to be at least a small subset of users who actually use this chat client to talk about math! It's hopefully at least worth considering, anyway.

ara4n commented 8 years ago

Thanks for the feature request - agreed this would be very cool. What typically goes between the dollar signs? MathML? TeX? ASCIImath?

ghost commented 8 years ago

I'm definitely most used to seeing TeX notation for inline math. As far as I know it's the most commonly used and versatile of those three - I've never found anything it couldn't write down. It's also the only one I've used though, so maybe I'm biased there.

jansol commented 8 years ago

As a CS student with a friend circle of massively nerdy people (math, physics, programming etc) I certainly would have some uses for this, although it's by no means critical.

+1 for TeX

kykc commented 8 years ago

As a DSP engineer I would much appreciate this feature, as sharing math with remote co-workers and friends has been rather painful for a long time. +1 for TeX. markdown-preview-plus for the Atom text editor supports something like this, btw.

saad440 commented 7 years ago

So... When are we going to have it? :) It will really come in handy in the physics chat. +1 for TeX

joelostblom commented 7 years ago

It could be worthwhile looking into Khan Academy's KaTeX as a speedy, lightweight alternative to MathJax.

uhoreg commented 7 years ago

As far as encoding the math in the message event goes, I think that it makes sense to send it as MathML (obviously in a "format": "org.matrix.custom.html" message). MathJax can be used to convert various formats (including LaTeX) into MathML, and it can be used to display the MathML in browsers that don't support MathML. An alternative to sending MathML would be to stick LaTeX math inside a special element. For example, MathJax uses a <script type="math/tex"> element to indicate math text, or it could be a <span> with a special class.
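To make the "LaTeX inside a special element" option concrete, here is a minimal sketch of what a sender might put in the event content. It is only an illustration: the attribute name `data-tex`, the `<code>` fallback and the helper names are assumptions invented for this sketch, not anything the thread or the Matrix spec has settled on. Only `msgtype`, `body`, `format` and `formatted_body` are existing message fields.

```typescript
// Hypothetical attribute carrying the raw TeX; not a standardised name.
const MATH_ATTR = "data-tex";

interface MathMessageContent {
    msgtype: "m.text";
    body: string;
    format: "org.matrix.custom.html";
    formatted_body: string;
}

// Escape text so it is safe inside HTML text and double-quoted attributes.
function escapeHtml(text: string): string {
    return text
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;");
}

// Build a message with one inline formula between two pieces of plain text.
function buildInlineMathMessage(before: string, tex: string, after: string): MathMessageContent {
    const escapedTex = escapeHtml(tex);
    return {
        msgtype: "m.text",
        // Plain-text fallback keeps the TeX source between dollar signs.
        body: before + "$" + tex + "$" + after,
        format: "org.matrix.custom.html",
        // Clients that do not know the attribute still show the source in <code>.
        formatted_body:
            escapeHtml(before) +
            "<span " + MATH_ATTR + '="' + escapedTex + '"><code>' + escapedTex + "</code></span>" +
            escapeHtml(after),
    };
}
```

A client that understands the attribute could replace the span's contents with rendered math, while any other client would just display the TeX source, assuming its sanitizer lets the `<code>` text through.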

Pros of sending MathML as opposed to sending the raw LaTeX wrapped in an element:

Cons of sending MathML:

This would all also depend on what can be used to do math on iOS and Android.

Note: MathML allows embedding the original LaTeX source via annotations

(BTW, a note about KaTeX: it is much faster than MathJax, but also doesn't support as much LaTeX as MathJax does. What we do on our site at work is that for each chunk of math, we first run it through KaTeX, and if it errors, then we run MathJax on it.)

geez0x1 commented 7 years ago

Robotics engineer here, yes please!

LaTeX as default would definitely be most sensible I reckon; I don't think any of my colleagues know MathML or ASCIImath. Every single one knows LaTeX.

quell- commented 7 years ago

As a CS researcher, I would highly appreciate this feature. Including math formulas in our daily conversations is very common among my colleagues. It would be nice if we could do it instantly in chat flow. +1 for TeX

Cadair commented 7 years ago

I would love to see this!

ara4n commented 7 years ago

I would too!!!! But we have no choice but to throw ourselves on the generosity and creativity of the internetz for this one...

Patches welcome!!!

MTRNord commented 7 years ago

Just throwing a possible dep: https://www.npmjs.com/package/react-mathjax Not sure yet how to use it in riot-web exactly but seems to be pretty simple.

hgustafsson commented 7 years ago

I made a proof of concept where the client renders math formulas between $ ... $ or $$ ... $$ using KaTeX, in https://github.com/hgustafsson/matrix-react-sdk/tree/hgustafsson/katex and https://github.com/hgustafsson/riot-web/tree/hgustafsson/katex

Ideally, there should be a room-specific option for turning this on and off (probably off by default), and for specifying different delimiters, but I don't know how to do that.

I was also playing with the idea that the sender generates an already formatted message (formatted_body) in html or mathml using KaTeX, but the riot-web receiver removed these tags in the html sanitizer of the function bodyToHtml.

Cadair commented 7 years ago

From an inter-client compatibility standpoint, transforming to HTML on the way out makes sense to me? If some (non-Riot) client B can render HTML, and Riot with KaTeX sends HTML, then client B will display the rendered HTML and not the LaTeX source.

t3chguy commented 7 years ago

@Cadair for consistency the HTML transform is always done by the sender

Cadair commented 7 years ago

@t3chguy sure, that makes sense. @hgustafsson was saying that that approach isn't working.

t3chguy commented 7 years ago

Yeah, currently the subset of allowed HTML is very strict. It may need to be loosened, or another format added to support this. Currently the format is a custom HTML one; adding another would be pretty simple.

hgustafsson commented 7 years ago

@Cadair @t3chguy I agree that this would be the best, but all markup was removed by the html sanitizer. I'm not confident enough with the security implications to change the allowed tags and attributes in sanitizeHtmlParams of HtmlUtils.js. (The same security implications would apply even if we make another message type/format for math which would allow more tags.)

The result of, for example, katex.renderToString("\\cos(x)") looks like

<span class="katex-display">
    <span class="katex">
        <span class="katex-mathml">
            <math><semantics><mrow><mi>cos</mi><mo>(</mo><mi>x</mi><mo>)</mo></mrow><annotation encoding="application/x-tex">\cos(x)</annotation></semantics></math>
        </span>
        <span class="katex-html" aria-hidden="true">
            <span class="strut" style="height: 0.75em;"></span>
            <span class="strut bottom" style="height: 1em; vertical-align: -0.25em;"></span>
            <span class="base">
                <span class="mop">cos</span>
                <span class="mopen">(</span>
                <span class="mord mathit">x</span>
                <span class="mclose">)</span>
            </span>
        </span>
    </span>
</span>

Unfortunately, all the work I did in this direction was when marked was still used for Markdown, so I could use ViktorQvarfordt/marked and write a custom rendering engine for math using KaTeX. Hopefully something similar can be done with commonmark, but I haven't looked at it.

t3chguy commented 7 years ago

Then make an alternate format that is rendered client-side: formatted_body carries the LaTeX source and each client renders it, assuming that the KaTeX implementation is sane and 100% consistent.

CsatiZoltan commented 6 years ago

Any progress on this?

RoyiAvital commented 6 years ago

Please don't use KaTeX. Use MathJax, as it is much more feature-rich.

uhoreg commented 6 years ago

In my experience, MathJax covers much more LaTeX than KaTeX does, but KaTeX still covers the vast majority of what most people need. However, KaTeX is much faster than MathJax (though I believe the MathJax developers have been working on improving speed lately). Also, MathJax would sometimes fail at random when we had a lot of math. So what we did was render math in KaTeX, and if KaTeX was unable to render it, fall back to MathJax.
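The "try the fast renderer first, fall back to the thorough one" approach described above can be sketched roughly as follows. This is not the actual code from that site: the renderers are passed in as plain functions so the sketch stays self-contained, and `renderWithFallback`, `MathRenderer` and `RenderResult` are names invented here. With the real libraries, the primary would presumably wrap `katex.renderToString`, which throws on input it cannot parse unless `throwOnError` is turned off.

```typescript
// A renderer takes TeX source and returns markup, or throws if it cannot cope.
type MathRenderer = (tex: string) => string;

interface RenderResult {
    output: string;
    // Which path produced the output; "source" means both renderers failed.
    renderer: "primary" | "fallback" | "source";
}

function renderWithFallback(tex: string, primary: MathRenderer, fallback: MathRenderer): RenderResult {
    try {
        return { output: primary(tex), renderer: "primary" };
    } catch {
        // The fast renderer could not handle this input; try the slower one.
    }
    try {
        return { output: fallback(tex), renderer: "fallback" };
    } catch {
        // Last resort: show the raw TeX source unchanged.
        return { output: tex, renderer: "source" };
    }
}
```

Returning which path was taken makes it easy to log how often the fallback is actually needed, which is the sort of data that could inform the KaTeX vs MathJax choice.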

jeanm commented 6 years ago

My experience matches @uhoreg's, and I would urge whoever ends up developing this to consider using KaTeX by default.

KaTeX is much faster, and does not have the annoying reflowing issues. MathJax's performance made it almost unusable on my old Android phone. KaTeX also seems to cover the vast majority of what people normally need (at least in physics and compsci). New commands are constantly being added as the project is very active.

uhoreg commented 6 years ago

<braindump> (or maybe \begin{braindump}) In order for this to happen, several tasks need to be done:

  1. decide on the message format for the math. The main options seem to be: LaTeX within some sort of container, MathML, or some pre-rendered format. Some considerations for this are:
    • how well the format is supported by different platforms (web, Android, iOS, GTK+, Qt, console, etc.), and how well the libraries work (in terms of bloat, speed, etc).
    • how well the format degrades for clients that don't support math
    • accessibility
    • how much of the format should be supported (e.g. for LaTeX, it probably makes sense to not support things like \newcommand. But should it support AMSMath? For MathML, it may make sense to just limit it to Presentation or Content MathML. etc.)
  2. decide how to input math. This may involve both a Markdown and a non-Markdown method. For Markdown, the obvious solution is to enclose math (possibly LaTeX) within some delimiters (though using the traditional $...$ notation may cause non-math to be accidentally matched, so it may be better to use \(...\) instead). AsciiMath might be another option for hand-typed math. For a user-friendly, glossy math entry, something like MathQuill might be worth looking at.
  3. decide which libraries to use. For web, the main contenders are MathJax and KaTeX. MathJax supports more of LaTeX, but KaTeX may be sufficient for most people and is more lightweight. MathJax also supports AsciiMath and MathML as input.
  4. decide if the math libraries should be lazy loaded.

This issue needs someone to work through all of these things and write out a proposal.

\end{braindump} (or </braindump>)

Evidlo commented 6 years ago

I think TeX would ideally be transmitted in its raw format using extensible events.

Cadair commented 6 years ago

While I agree, there is no need to block this on that proposal becoming reality.

hhassey commented 6 years ago

Any updates on LaTeX being supported?

Evidlo commented 6 years ago

Here's a demo which compares KaTeX and MathJax performance.

Also I think having a delimiter for inputting LaTeX outside of Markdown mode is unnecessary.

Mathquill looks really nice.

RoyiAvital commented 6 years ago

@Evidlo, this test is spread by KaTeX.
KaTeX is faster, yet in the real world does it make any difference? In my opinion, no.

I would choose MathJax over KaTeX any time, any day.

saad440 commented 6 years ago

Riot is already loading a huge pile of JavaScript. What difference is a tiny bit of MathJax going to make? 😃

saad440 commented 6 years ago

But even getting KaTeX will make me more than happy because we will have something.

geez0x1 commented 6 years ago

Riot needs to get faster, not slower!

On that note, in the aforementioned demo/benchmark KaTeX is over 10 times as fast for me, across three browsers. I'm unsure why, but it seems MathJax isn't very efficient. I have no idea how they compare in terms of features though.

Cadair commented 6 years ago

There is a lot more to this issue than a selection of MathJax vs KaTeX. The hardest things to sort out are how the messages will be encoded and how fallbacks will be provided. If done well, the selection of the actual JS library should be entirely down to the client implementation.

Evidlo commented 5 years ago

I thought the idea was to not block this while waiting for extensible events?

uhoreg commented 5 years ago

Regardless of whether it comes before or after extensible events, we still need to decide on how the math is transmitted. https://github.com/vector-im/riot-web/issues/1945#issuecomment-387925133 outlines what needs to be done.

inducer commented 5 years ago

I think extensible events might be too restrictive, as they limit the granularity of switching math typesetting on/off to that of a message. In practice, inline math is far more useful IMO. One might use a separate message type for 'may contain math' though.

As for encoding, a reasonable option might be 'an implementation-defined subset of AMSLatex', with fallback to showing the raw markup.

Evidlo commented 5 years ago

Regardless of whether it comes before or after extensible events, we still need to decide on how the math is transmitted. #1945 (comment) outlines what needs to be done.

  1. I say leave the TeX as-is for transmission (in some container). As for devices that don't support rendering, just display the TeX. The sort of people who will be using this are going to be familiar with reading TeX source anyways.

  2. The most obvious is $ ... $ with a button for enabling LaTeX, similar to the Markdown button. I don't think we should worry about fancy input options like MathQuill right now.

  3. I vote KaTeX. For those opposed, are you able to think of any declarations or commands off the top of your head which are not in this list? I couldn't.

  4. Since a TeX renderer is more heavyweight than markdown, it might not make sense to have it enabled all the time. This could be a checkbox under USER INTERFACE in the global Riot settings. Alternatively, you might enable KaTeX automatically if there is a TeX message on the current page.
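The "only load the renderer when a TeX message is on the page" idea from point 4 could look something like the sketch below. All names here (`lazyOnce`, `anyMessageHasMath`, `loadMathIfNeeded`) are hypothetical, and the math check is deliberately crude. In a real client the `load` callback would presumably return the promise from a dynamic `import()` of the math library; the sketch stays generic over whatever `load` returns so that it has no dependencies.

```typescript
// Wrap a loader so the (potentially expensive) load happens at most once.
function lazyOnce<T>(load: () => T): () => T {
    let loaded = false;
    let cached: T | undefined;
    return () => {
        if (!loaded) {
            cached = load();
            loaded = true;
        }
        return cached as T;
    };
}

// Deliberately crude check: does any message body contain a "$...$" pair?
function anyMessageHasMath(bodies: string[]): boolean {
    return bodies.some((body) => /\$[^$]+\$/.test(body));
}

// Trigger the load only if some message on the current page needs it.
function loadMathIfNeeded<T>(bodies: string[], getLibrary: () => T): T | undefined {
    return anyMessageHasMath(bodies) ? getLibrary() : undefined;
}
```

A settings checkbox, as suggested above, would simply be one more condition in front of `loadMathIfNeeded`.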

sylph1o commented 5 years ago

  3. I vote KaTeX. For those opposed, are you able to think of any declarations or commands off the top of your head which are not in this list? I couldn't.

This page lists unsupported LaTeX features in KaTeX. Just some relevant information to have here.

Edit: Hit Enter too soon.

saad440 commented 5 years ago

I think KaTeX is good enough. We won't need a chat client to provide the most comprehensive LaTeX support. We can temporarily switch to other methods, such as attaching documents, to discuss more complex ideas. I also agree with @Evidlo 's "leave the TeX as-is in a container" idea. Those who deal with math should be familiar enough to recognize it in case the client does not support it.

Cadair commented 5 years ago

I also agree that without extensible events the best thing to do is to leave the on-wire format as the raw tex source. I think with extensible events we could provide the maths in a couple of useful formats.

uhoreg commented 5 years ago

I've been slowly writing an MSC over the past week to answer question 1 in https://github.com/vector-im/riot-web/issues/1945#issuecomment-387925133. The result is https://github.com/matrix-org/matrix-doc/pull/1722.

thosgood commented 5 years ago

are there any updates on this?

uhoreg commented 5 years ago

Unless there is a secret cabal of mathematicians plotting somewhere, I'm not aware of anyone who has done any work on this recently. It's on my to-do list to look at this again after I'm done cross-signing, but that list is getting pretty long, so if anyone wants to jump on it, feel free to.

thosgood commented 5 years ago

I would love to be such a secret mathematician, but am not too sure what the current status is/what needs to be done. Does Matrix itself support displaying maths in messages? https://github.com/matrix-org/matrix-doc/pull/1722 seems to still be open.

uhoreg commented 5 years ago

OK, so what needs to be done is:

  1. consensus needs to be reached on matrix-org/matrix-doc#1722. The debate over LaTeX vs MathML is the main issue (most people seem OK with MathML, but further input is welcome). But further feedback on everything listed under "potential issues" would also be helpful.
  2. determine how to enter math (this is not necessarily blocked on # 1, as long as the input method can generate whatever format is needed), and implement it
    • is it possible to add stuff to the commonmark library so that markdown users can enter math?
    • is it possible to add a fancy GUI math editor to the rich text editor?
  3. determine how to display math (MathJax or KaTeX? If we're using MathML as the over-the-wire format, then we're stuck with MathJax, so this is somewhat blocked on # 1), and implement it
  4. bonus points if you can do # 2 and 3 in such a way that they're lazy-loaded, so that people who don't need math won't get their Riots bloated.

# 1 is the main part that I'm going to be looking at. Someone else could probably start looking at # 2 and 3 right away, even though matrix-org/matrix-doc#1722 isn't finalized, as the hard part (IMHO) is integrating everything into Riot, and it should not be too hard to switch formats if needed, if it is done in a way that's flexible. Though that's just a guess on my part.

thosgood commented 5 years ago

in regards to the first two points, it seems like converting from LaTeX syntax to MathML is reasonably simple (in particular, there are already a bunch of libraries online that do such things), so in terms of input, I would say that the standard "put maths between $ delimiters in LaTeX syntax" would be enough? I'm sure that a fancy GUI would be nice, but I feel like the target audience for this capability would mostly be happy without such a thing.

uhoreg commented 5 years ago

The standard "put LaTeX inside delimiters" would certainly work as a first cut. I'd be a bit cautious of just using plain $, as there is a chance of accidentally triggering math when it isn't intended, but there are ways to reduce that (e.g. only triggering if there are no spaces at the beginning/end of the math). A GUI, even a simple one, would be helpful for some users, as not all people who want to enter math are as experienced with LaTeX (e.g. student asking questions from an instructor), but could be added later.
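As a rough illustration of the "no spaces at the beginning/end of the math" heuristic, a matcher might look like the sketch below. The function name `findInlineMath` and the exact pattern are assumptions made for this example, not what any Matrix client actually does. It only handles inline `$...$`; display math with `$$...$$` and escaped dollar signs are out of scope here.

```typescript
// Find inline math spans delimited by single dollar signs.
// A span only counts if its content starts and ends with a non-space
// character and contains no "$", which keeps prices such as
// "$5 and $6" from being treated as math.
function findInlineMath(text: string): string[] {
    const pattern = /\$([^\s$](?:[^$]*[^\s$])?)\$/g;
    const found: string[] = [];
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(text)) !== null) {
        found.push(match[1]);
    }
    return found;
}
```

This reduces accidental matches rather than eliminating them: something like "$5-$6" would still be picked up, which is one argument for the \(...\) delimiters or an explicit toggle.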

dalcde commented 5 years ago

Using \( and \) instead of $ $ could be a good alternative.

Evidlo commented 5 years ago

@uhoreg Just have a LaTeX button next to the markdown button for enabling $ $. I'm not a fan of \( \).

thosgood commented 5 years ago

The standard "put LaTeX inside delimiters" would certainly work as a first cut. I'd be a bit cautious of just using plain $, as there is a chance of accidentally triggering math when it isn't intended, but there are ways to reduce that (e.g. only triggering if there are no spaces at the beginning/end of the math). A GUI, even a simple one, would be helpful for some users, as not all people who want to enter math are as experienced with LaTeX (e.g. student asking questions from an instructor), but could be added later.

Yeah, adding this in later seems like a good idea. I think it would be good to just get some sort of basic support first, because I really think it would attract quite a few people to Riot.

Edit: Having reread this thread, I just wanted to contribute to the MathML vs LaTeX encoding debate: I think it would be lovely to extend Matrix HTML support to cover Presentation MathML, but just wrapping the LaTeX-format input, sending that, and displaying it as a fallback seems like a perfectly workable idea. Every mathematician that I know already just types things like $\mathcal{C}(a,b)\xrightarrow{\sim}Y$ in emails/WhatsApp messages, so this would be an acceptable fallback. This also means that (as far as I can tell) the choice of KaTeX vs. MathJax becomes almost entirely arbitrary, since the user could simply be left to pick between the two.