TypeCellOS / BlockNote

A React Rich Text Editor that's block-based (Notion style) and extensible. Built on top of Prosemirror and Tiptap.
https://www.blocknotejs.org/
Mozilla Public License 2.0
6.02k stars, 396 forks

Handle Text Alignment when parsing HTML #636

Open leo-paz opened 4 months ago

leo-paz commented 4 months ago

Is your feature request related to a problem? Please describe.
When copying work over from a .docx document in a word processor, text alignment is never respected: everything becomes left-aligned. I'm assuming fixing this would involve changes to how HTML is parsed.

Describe the solution you'd like
It would be nice for the editor to respect the text alignment the content had in the original editor.

Additional context
[two images attached]

linxianxi commented 1 month ago

same here

matthewlipski commented 1 month ago

This is because, unfortunately, there isn't much consistency in how different editors represent text styles when copying to the clipboard. Things like bold and italic are no problem since they have corresponding HTML tags, but text color and alignment could be marked using e.g. data- attributes or inline styles. So basically we would have to add specific cases for various popular editors, which we haven't gotten around to yet.
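To make the inconsistency concrete, here is a minimal sketch of what reading alignment from pasted HTML might involve. It is not BlockNote's implementation: `readTextAlignment`, `fakeElement`, and the `data-text-alignment` attribute name are hypothetical and chosen only for illustration. The function only assumes an object with a DOM-style `getAttribute` method, so it runs without a browser.

```typescript
// Hypothetical sketch, not BlockNote's actual parsing code.
type Alignment = "left" | "center" | "right" | "justify";

// Minimal structural stand-in for a DOM Element.
interface ElementLike {
  getAttribute(name: string): string | null;
}

const ALIGNMENTS: readonly string[] = ["left", "center", "right", "justify"];

// Normalizes a raw value and rejects anything that is not a known alignment.
function toAlignment(value: string | null): Alignment | null {
  const v = (value ?? "").trim().toLowerCase();
  return ALIGNMENTS.indexOf(v) !== -1 ? (v as Alignment) : null;
}

function readTextAlignment(el: ElementLike): Alignment | null {
  // Case 1: a data attribute (the attribute name is an assumption).
  const fromData = toAlignment(el.getAttribute("data-text-alignment"));
  if (fromData) return fromData;

  // Case 2: an inline `text-align` declaration in the style attribute.
  const style = el.getAttribute("style") ?? "";
  const match = /(?:^|;)\s*text-align\s*:\s*([^;]+)/i.exec(style);
  if (match) {
    const fromStyle = toAlignment(match[1]);
    if (fromStyle) return fromStyle;
  }

  // Case 3: the legacy `align` attribute.
  return toAlignment(el.getAttribute("align"));
}

// Test helper: builds an ElementLike from a plain attribute map.
function fakeElement(attrs: Record<string, string>): ElementLike {
  return {
    getAttribute: (name) =>
      Object.prototype.hasOwnProperty.call(attrs, name) ? attrs[name] : null,
  };
}
```

Each source editor may need its own case here, which is the maintenance cost described above. Values the sketch does not recognize, such as `start` or anything followed by `!important`, fall through to `null` so the block keeps its default alignment.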

cmawhorter commented 2 weeks ago

is there a reason tiptap's generateHTML isn't used? from a quick look at the blocknote code, it seems like blocks are tiptap nodes, which are just prosemirror nodes.

i've been evaluating replacements for a very old custom tinymce editor and i was kinda surprised by the lack of support for html. but after finding this issue and re-reading the docs, it seems i misunderstood the goal of blocknote which is to be a literal rich-text editor, and not html content authoring/editing.

that also explains why the docs recommend persisting document json, which i was initially confused by.

that's a shame because blocknote is essentially drop-in if you already use one of the supported ui component libs.

this is my ignorance showing, but surely tiptap or pm has already solved this problem of transparently converting between html and blocks/nodes?

matthewlipski commented 1 week ago

So converting between HTML and JSON is indeed handled by TipTap/ProseMirror; the issue is that you still need to give it instructions for how to parse the HTML. When you're working within the same schema, this is pretty cut and dried, as you know how each node is represented in HTML as well as JSON. However, when inserting external HTML into your document, from Word or other editors/websites, you don't necessarily know how that information is represented.

For example, Google Docs shoves a whole bunch of inline styles into their HTML to make it match the actual document as closely as possible, whereas Notion doesn't include any information regarding things like text/bg colors in their HTML. So you kind of have to tell TipTap/PM how to parse certain nodes (or marks in this example) by accounting for these cases individually.
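As a rough illustration of "accounting for these cases individually", the sketch below tries a list of rules in order to recover a text color from a pasted element. It does not use the TipTap or ProseMirror APIs. The names `ColorRule`, `colorRules`, `readTextColor`, `stubElement`, and the `data-text-color` attribute are all hypothetical, and the inline-style rule only assumes that a source editor emits a plain `color` declaration.

```typescript
// Hypothetical sketch of per-source parse rules; not TipTap/ProseMirror API.
interface ElementLike {
  getAttribute(name: string): string | null;
}

// Reads one property from an inline style string. Splitting on ";" is a
// simplification that is adequate for simple values such as colors.
function inlineStyle(el: ElementLike, property: string): string | null {
  const style = el.getAttribute("style") ?? "";
  for (const decl of style.split(";")) {
    const idx = decl.indexOf(":");
    if (idx === -1) continue;
    if (decl.slice(0, idx).trim().toLowerCase() === property) {
      const value = decl.slice(idx + 1).trim();
      return value === "" ? null : value;
    }
  }
  return null;
}

interface ColorRule {
  name: string;
  read(el: ElementLike): string | null;
}

// Rules are tried in order; each one covers one way a source might encode color.
const colorRules: ColorRule[] = [
  // Assumed attribute name, standing in for an editor that uses data attributes.
  { name: "data attribute", read: (el) => el.getAttribute("data-text-color") },
  // Editors that inline their styling directly on the element.
  { name: "inline style", read: (el) => inlineStyle(el, "color") },
];

function readTextColor(
  el: ElementLike
): { color: string; matchedRule: string } | null {
  for (const rule of colorRules) {
    const color = rule.read(el);
    if (color) return { color, matchedRule: rule.name };
  }
  // No rule matched: the source HTML carries no color information at all.
  return null;
}

// Test helper: builds an ElementLike from a plain attribute map.
function stubElement(attrs: Record<string, string>): ElementLike {
  return {
    getAttribute: (name) =>
      Object.prototype.hasOwnProperty.call(attrs, name) ? attrs[name] : null,
  };
}
```

When a source omits the information entirely, as described for Notion above, no rule can recover it and the function returns `null`. Supporting another editor means appending another rule rather than changing the existing ones.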

YousefED commented 1 week ago

@cmawhorter you might be interested in the recently released server-util package for this as well: https://www.blocknotejs.org/docs/editor-api/server-processing

cmawhorter commented 1 week ago

you guys are on the ball. nice! i'm going to take a look at the server-util package and it looks promising. thanks for your work on it.