deltachat / deltachat-core-rust

Delta Chat Rust Core library, used by Android/iOS/desktop apps, bindings and bots 📧
https://delta.chat/en/contribute

Integrate message parser into the core library #4613

Open link2xt opened 1 year ago

link2xt commented 1 year ago

Message parser is a crate used by Delta Chat Desktop to parse markdown-like formatting in messages (https://github.com/deltachat/message-parser). It is currently used by Delta Chat Desktop if the experimental setting "Render Markdown in Messages" is enabled.

Delta Chat Android currently uses regexp-based parsing to highlight URLs and DeltaLab is using a Java library to parse markdown.

To use the message parser on Android, we need a C API similar to dc_msg_get_text that returns the result of parsing the lightweight markup of the message. To simplify integration into the Android client and possibly use the same API in DeltaTouch, we need an API that returns HTML markup, so the API can have the following signature:

char*           dc_msg_get_html               (const dc_msg_t* msg);

Note that it is different from dc_get_msg_html() which returns the full message for display in the browser.

As a first step, it was decided to parse markdown from the text/plain part and not change the structure of the MIME message, so the only change on the UI side is to display HTML in the message bubble if dc_msg_get_html returns a non-empty string. In the future it may be possible to start sending HTML for display in classic MUAs and to integrate a WYSIWYG editor into the text field, similar to the one used in XMPP clients supporting XHTML-IM, Telegram etc., but this is outside the scope of this issue and not necessary for the first step.
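A minimal sketch of how such an export could look on the Rust side, to make the contract above concrete (non-empty string means "display HTML", empty string means "nothing to render"). Everything here is an assumption for illustration: `SketchMsg` stands in for the opaque `dc_msg_t`, `sketch_msg_get_html` and `sketch_str_free` are hypothetical names, and `render_html` only escapes the text instead of calling the message parser. A real export would also need the appropriate no-mangle attribute, which is omitted here.

```rust
use std::ffi::CString;
use std::os::raw::c_char;

/// Hypothetical stand-in for the opaque `dc_msg_t`.
pub struct SketchMsg {
    pub text: String,
}

/// Placeholder for the real parsing step: it only escapes the text.
fn render_html(text: &str) -> String {
    text.replace('&', "&amp;")
        .replace('<', "&lt;")
        .replace('>', "&gt;")
}

/// Returns a newly allocated, NUL-terminated string and never NULL.
/// An empty string means there is nothing to render.
///
/// # Safety
/// `msg` must be null or point to a valid `SketchMsg`.
pub unsafe extern "C" fn sketch_msg_get_html(msg: *const SketchMsg) -> *mut c_char {
    let html = match unsafe { msg.as_ref() } {
        Some(msg) => render_html(&msg.text),
        None => String::new(),
    };
    // Interior NUL bytes cannot be represented in a C string, so drop them.
    let html = html.replace('\0', "");
    CString::new(html).unwrap_or_default().into_raw()
}

/// Releases a string returned by `sketch_msg_get_html`.
///
/// # Safety
/// `s` must be null or a pointer obtained from `sketch_msg_get_html`
/// that has not been freed yet.
pub unsafe extern "C" fn sketch_str_free(s: *mut c_char) {
    if !s.is_null() {
        drop(unsafe { CString::from_raw(s) });
    }
}
```

The caller owns the returned string and has to hand it back to the matching free function, since memory allocated by `CString` must not be released with C's `free()`.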

(as discussed with @adbenitez and @Simon-Laux)

link2xt commented 1 year ago

We may also want to add a JSON-RPC API for desktop to avoid duplication of code; otherwise Desktop will ship with the message parser both compiled into the core library and as WASM.

link2xt commented 1 year ago

I opened an issue in the message-parser repository for adding an HTML output on the Rust side: https://github.com/deltachat/message-parser/issues/37

hpk42 commented 1 year ago

On Sun, Aug 06, 2023 at 16:17 -0700, link2xt wrote:

char* dc_msg_get_html (const dc_msg_t* msg);


Note that it is different from `dc_get_msg_html()` which returns the full message for display in the browser.

maybe worthwhile to give it a name like dc_msg_as_html to make it more distinct?

r10s commented 1 year ago

rendering HTML in the bubbles is not directly possible on android and ios.

android instead uses sth. like SpannableString, ios has NSAttributedString & co. in contrast to desktop, there is not much HTML outside webview - which, in turn, probably cannot be used for bubbles for performance reasons.

therefore, having a function that returns HTML that then needs to be parsed again into sth. else is not that helpful - and also a waste of performance if the HTML was just generated in core (the waste would also be there with an HTML-to-SpannableString/NSAttributedString function). let alone expectations when passing HTML to bubbles :)

instead, we need sth. more on-point, maybe already character-index based, a tree or list or so, maybe in JSON. sth. that can be converted easily, at best in a simple loop. generating HTML from that data should be straightforward then (generating HTML is easy :)
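The suggestion above can be sketched as a flat list of index-based spans that is turned into HTML in one loop. This is only an illustration under stated assumptions: `Span`, `SpanKind`, `escape_html` and `spans_to_html` are hypothetical names, not part of the core or of message-parser, and offsets are byte offsets into the UTF-8 text. Android and iOS string APIs index by UTF-16 code units, so a real API would have to pick its unit deliberately.

```rust
/// Kind of highlighting applied to a range (hypothetical set).
#[derive(Debug, Clone, PartialEq)]
pub enum SpanKind {
    Link,
    Email,
    Hashtag,
    BotCommand,
}

/// A range of the message text, as byte offsets (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub struct Span {
    pub start: usize,
    pub end: usize,
    pub kind: SpanKind,
}

/// Escapes the characters that are special in HTML text and attributes.
pub fn escape_html(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        match c {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            '\'' => out.push_str("&#39;"),
            _ => out.push(c),
        }
    }
    out
}

/// Renders text plus sorted, non-overlapping spans as HTML in a single loop.
/// URL scheme filtering is out of scope for this sketch.
pub fn spans_to_html(text: &str, spans: &[Span]) -> String {
    let mut out = String::new();
    let mut pos = 0;
    for span in spans {
        // Skip spans that overlap, are out of range or split a character.
        if span.start < pos
            || span.end < span.start
            || span.end > text.len()
            || !text.is_char_boundary(span.start)
            || !text.is_char_boundary(span.end)
        {
            continue;
        }
        // Plain text between the previous span and this one.
        out.push_str(&escape_html(&text[pos..span.start]));
        let inner = escape_html(&text[span.start..span.end]);
        let html = match span.kind {
            SpanKind::Link => format!("<a href=\"{0}\">{0}</a>", inner),
            SpanKind::Email => format!("<a href=\"mailto:{0}\">{0}</a>", inner),
            SpanKind::Hashtag => format!("<span class=\"hashtag\">{0}</span>", inner),
            SpanKind::BotCommand => format!("<span class=\"bot-command\">{0}</span>", inner),
        };
        out.push_str(&html);
        pos = span.end;
    }
    // Remaining plain text after the last span.
    out.push_str(&escape_html(&text[pos..]));
    out
}
```

The same loop shape would work for a client that applies native styles instead of emitting tags: the text between spans is appended unchanged and each span maps to one style call.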

Simon-Laux commented 1 year ago

It is currently used by Delta Chat Desktop if the experimental setting "Render Markdown in Messages" is enabled.

No, it is always used in desktop. There is a markdown mode that can be enabled, but the more important feature is parsing links, email addresses, hashtags, bot command suggestions, labeled links, and later stuff like mentions.

It is not only about markdown; the first step should not even be markdown, as stabilising it from its experimental state opens more questions.

Deltachat iOS is also using something regex based: https://github.com/deltachat/deltachat-ios/blob/d8f40df939ddf5a04d96ab90b94125fc8c5c4e0e/deltachat-ios/Chat/Views/DetectorType.swift#L37

Desktop does not want HTML output; it already works with JSON: it uses the JSON-serialised output of the message parser (which is a tree) to build React elements -> https://github.com/deltachat/deltachat-desktop/blob/5103652009ad4078c5c1878ec17014377e927794/src/renderer/components/message/MessageMarkdown.tsx#L30 Changing this to HTML makes no sense because A. it opens XSS sanitisation questions in core and B. it makes custom elements, styles and behaviour way more difficult. Doing that makes no sense if we already have a solution that works perfectly fine.

The HTML idea was that Delta Chat Android could easily display basic HTML in a label, because @adbenitez said that might be possible. My initial idea, based on talking with @Hocuri, was converting the output tree to spannable text. There might be issues with non-inline markdown such as code blocks, but we shouldn't start with or focus on markdown anyway.
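To illustrate the tree-to-spannable-text idea, here is a rough sketch that flattens an element tree into plain text plus a list of style ranges, which is roughly the input a SpannableString or attributed-string builder needs. The `Node`, `Style`, `StyleRange` and `Flattened` types are hypothetical and deliberately much smaller than whatever message-parser actually produces. Offsets are counted in UTF-16 code units, on the assumption that the consumer is an Android or iOS string API.

```rust
/// Hypothetical, simplified element tree.
#[derive(Debug, Clone, PartialEq)]
pub enum Node {
    Text(String),
    Link(String),
    Bold(Vec<Node>),
    Italic(Vec<Node>),
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Style {
    Link,
    Bold,
    Italic,
}

/// A styled range, with offsets in UTF-16 code units.
#[derive(Debug, Clone, PartialEq)]
pub struct StyleRange {
    pub start: usize,
    pub end: usize,
    pub style: Style,
}

/// Plain text plus the style ranges that apply to it.
#[derive(Debug, Default)]
pub struct Flattened {
    pub text: String,
    pub ranges: Vec<StyleRange>,
    utf16_len: usize,
}

impl Flattened {
    fn push_text(&mut self, s: &str) {
        self.text.push_str(s);
        self.utf16_len += s.encode_utf16().count();
    }

    fn push_nodes(&mut self, nodes: &[Node]) {
        for node in nodes {
            let start = self.utf16_len;
            let style = match node {
                Node::Text(s) => {
                    self.push_text(s);
                    None
                }
                Node::Link(url) => {
                    self.push_text(url);
                    Some(Style::Link)
                }
                Node::Bold(children) => {
                    self.push_nodes(children);
                    Some(Style::Bold)
                }
                Node::Italic(children) => {
                    self.push_nodes(children);
                    Some(Style::Italic)
                }
            };
            // Ranges of children are recorded before the range of their parent.
            if let Some(style) = style {
                let end = self.utf16_len;
                self.ranges.push(StyleRange { start, end, style });
            }
        }
    }
}

/// Walks the tree once and returns the text with its style ranges.
pub fn flatten(nodes: &[Node]) -> Flattened {
    let mut out = Flattened::default();
    out.push_nodes(nodes);
    out
}
```

Block-level elements such as code blocks do not fit this inline-range model directly, which matches the concern about non-inline markdown raised above.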

r10s commented 1 year ago

My initial idea based on talking with @Hocuri was converting the output tree to spannable text

that would also be the way i would think about that in the first place, for various reasons (less parsing, fewer problems, less code, no html-intermediate-format generation needed, performance)