Closed zurfyx closed 1 year ago
@acywatson I wonder if you have some thoughts on this
I do - there’s already an issue about renderToString, which is basically exporting to HTML.
https://github.com/facebook/lexical/issues/542
There are some gotchas around it, but this was a highly demanded feature in Draft JS that never got implemented. I think we should try to do it, but maybe merge the issues?
Maybe we can use this issue? It’s a little more clear what’s being asked and includes the import piece
@zurfyx @acywatson I built something like this for copy + paste; it converts the selected content to HTML using the new exportHTML methods on LexicalNodes.
You can check out the code here, maybe this solves the use case?
Tricky thing for us, originally, was styles, I think. It's hard to render the styles, since the editor is using classes for theming.
I guess it depends on what needs to be exported- the general shape of the editor's contents (paragraphs, images, tables, etc) or an exact snapshot of what the user is viewing (inline styles, etc).
Either way, I think this can be solved using exportHTML and adding explicit inline styles if needed.
I think this can be solved by stripping out $convertSelectedLexicalContentToHtml from the @lexical/clipboard module and moving it to a new package, @lexical/html, that @lexical/clipboard consumes.
$convertSelectedLexicalContentToHtml and the related copy + paste HTML utilities can easily be repurposed for general exporting & importing HTML. And there's already a method on LexicalNodes, exportHtml(), that lets you define a special HTML output for a given node, which is especially useful for DecoratorNodes that use React for their rendering.
The following are proposed utility functions in the @lexical/html package:
$exportLexicalContentAsHtml(editor: Editor): string
$exportSelectedLexicalContentsAsHtml(editor: Editor, selection: Selection): string
$importHtmlAsLexicalContent(html: string, editor: Editor, selection: Selection): void
// This one I'm not sure about
$initializeEditorFromHTML(html: string): Editor
We could also add a .toHTML() method to the Editor that's similar to .toJSON(), but that might encourage bad practices if we want people to persist the editor state as JSON and not HTML.
What are people's thoughts?
I think the package structure makes sense. The actual APIs could possibly be refined/consolidated a bit - not sure we need them all.
I think the big question here is the story around styles. With a lot of rich text features, we just append classes from the theme in the node logic. Do we just spit the HTML out with those classes? How do we preserve the styling information so that what gets rendered to HTML is actually the same as what was seen in the editor? There are several ways to think about this, probably. It can get more complicated with atomic style libraries.
I guess it's true that the user can always override exportHtml and apply inline styles in there, but that might cause other problems (like, do I need to extend every built-in node to apply my own styles)?
Also, is it always the case that the HTML we produce for the clipboard would be the same as what we would want to display in some other context? For instance, there are some special cases currently written into exportHtml methods in some nodes to accommodate the "idiosyncrasies" of other editors that it might be pasted into.
I think we can definitely do something here, just wondering how you're thinking about all this.
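To make the classes-versus-inline-styles question concrete, here is a minimal sketch. Everything in it is hypothetical (the `sketchTheme` map, the class names, and `renderSpanSketch` are not Lexical APIs); it only illustrates that the same node could be rendered either with theme classes or with the declarations those classes stand for.

```typescript
// Hypothetical theme: class name -> the CSS declarations that class stands for.
const sketchTheme: Record<string, string> = {
  "editor-text-bold": "font-weight: bold",
  "editor-text-italic": "font-style: italic",
};

type StyleMode = "classes" | "inline";

// Render one span either with theme classes (the destination needs the
// stylesheet) or with inline styles (self-contained output).
// `text` is assumed to be HTML-escaped already.
function renderSpanSketch(
  text: string,
  classNames: string[],
  mode: StyleMode,
): string {
  if (classNames.length === 0) {
    return `<span>${text}</span>`;
  }
  if (mode === "classes") {
    return `<span class="${classNames.join(" ")}">${text}</span>`;
  }
  // Inline mode: look up each class and drop any the theme does not know.
  const declarations = classNames
    .map((name) => sketchTheme[name])
    .filter((css) => css !== undefined);
  return `<span style="${declarations.join("; ")}">${text}</span>`;
}
```

The inline mode only works if the theme can be expressed as a static class-to-declaration map, which is exactly where atomic style libraries would complicate things.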
Do we allow changing the default exportDOM (the same question applies to importDOM)? It's easy to do for custom nodes, but an issue for built-ins like lists, tables, code blocks etc.
E.g. some would want to export codeblock highlight nodes with inline styles instead of classes to preserve colors on destination surface; or completely opposite scenario - strip highlight nodes and export codeblock with plain text inside only
Regarding the API interface, as an idea: we could make selection optional for both:
$exportToHTML(editor: Editor, selection?: Selection): string
$importFromHTML(editor: Editor, selection?: Selection): void
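A minimal sketch of the optional-selection idea, using a toy node list rather than real Lexical types. `SketchNode`, `exportToHtmlSketch`, and the key-based selection are all assumptions made for illustration; the point is only that one function can serve both the whole-document and the selected-content case.

```typescript
// Hypothetical stand-in for an editor node; not a Lexical type.
interface SketchNode {
  key: string;
  tag: string;
  text: string; // assumed to be HTML-escaped already
}

// With no selection, export everything; otherwise export only the selected keys.
function exportToHtmlSketch(
  nodes: SketchNode[],
  selectedKeys?: string[],
): string {
  const included = selectedKeys
    ? nodes.filter((n) => selectedKeys.indexOf(n.key) !== -1)
    : nodes;
  return included.map((n) => `<${n.tag}>${n.text}</${n.tag}>`).join("");
}

const sketchDoc: SketchNode[] = [
  { key: "1", tag: "h1", text: "Title" },
  { key: "2", tag: "p", text: "Body" },
];

const wholeDocument = exportToHtmlSketch(sketchDoc);
const selectionOnly = exportToHtmlSketch(sketchDoc, ["2"]);
```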
@acywatson I'm not sure if we need to capture exact style information, unless I'm missing a use case.
I think the two main use cases for export/import HTML are:
I don't think people will export to HTML, render in the browser somehow, and expect to see the exact same thing on the screen- HTML is more of a generic vehicle for storage & translation of Lexical content & structure.
For cases where the styles tell a story (font-color, alignment, etc), or are semantically important, I'd expect them to be handled on a case-by-case basis in exportToHTML, and we can try to supply robust solutions for core nodes such as Links, Tables, etc. and allow the user to extend if needed.
@acywatson I'm not sure if we need to capture exact style information, unless I'm missing a use case.
Well, this is probably uncommon, but sending an HTML email is one such case.
Do we allow changing default exportDOM (it's the same question to importDOM). It's easy to do for custom nodes, but an issue for built-ins like lists, tables, code blocks etc. E.g. some would want to export codeblock highlight nodes with inline styles instead of classes to preserve colors on destination surface; or completely opposite scenario - strip highlight nodes and export codeblock with plain text inside only
This speaks to my point above and was probably one advantage of the DOMConversions configuration approach that preceded the current Node-based design with exportDOM. It would be nice if I didn't have to extend every node. At that point, writing your own traversal/rendering algorithm might start to look simpler.
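One way to picture a configuration-style approach is a registry of per-type exporters that callers can override without subclassing anything. The sketch below is an assumption-laden illustration: `ExportNode`, `Exporter`, `defaultExporters`, and `exportWithOverrides` are hypothetical names, not part of Lexical, and text is assumed to be escaped already.

```typescript
// Hypothetical serialized node shape; not a Lexical type.
interface ExportNode {
  type: string;
  text?: string;
  children?: ExportNode[];
}

type Exporter = (node: ExportNode, renderChildren: () => string) => string;

// Default output per node type.
const defaultExporters: Record<string, Exporter> = {
  code: (_node, renderChildren) => `<pre>${renderChildren()}</pre>`,
  "code-highlight": (node) => `<span class="token">${node.text ?? ""}</span>`,
};

// Render a tree, letting caller-supplied overrides win over the defaults.
// Node types with no exporter at all just render their children.
function exportWithOverrides(
  node: ExportNode,
  overrides: Record<string, Exporter> = {},
): string {
  const renderChildren = (): string =>
    (node.children ?? []).map((c) => exportWithOverrides(c, overrides)).join("");
  const exporter = overrides[node.type] ?? defaultExporters[node.type];
  return exporter ? exporter(node, renderChildren) : renderChildren();
}

const codeBlock: ExportNode = {
  type: "code",
  children: [{ type: "code-highlight", text: "let x" }],
};

// The "strip highlight nodes" scenario: keep plain text only.
const plainCode = exportWithOverrides(codeBlock, {
  "code-highlight": (node) => node.text ?? "",
});
```

The same override slot could instead emit inline color styles, which covers the opposite scenario without touching the built-in node classes.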
Just wondering if you guys have already looked over how Trix does it? I've worked with it a lot but never dug deeply into the code, but when combined with Rails' ActionText, it stores the content as HTML in the DB so it ends up looking like:
"<div class=\"trix-content\">\n Some example content goes in here.\n</div>\n"
This is pretty good as it allows us to control the styling of the outputted text from CSS via the .trix-content class whilst retaining most of the user's desired changes.
As @tylerjbainbridge said, I don't think the expectation is like-for-like input/output from WYSIWYG editors (albeit the name somewhat indicates that), so it would be great if we could retain font/text changes (colour, italic, bold, etc) but leave more complicated layout to CSS.
Perhaps an API for doing both (exact styles & basic DOM) would work for most people?
Perhaps an API for doing both (exact styles & basic DOM) would work for most people?
I suspect this is true. We can start with something and refine it as necessary. I like @fantactuka 's ideas around the simpler API, but I think the overall approach @tylerjbainbridge is using makes sense.
Perhaps an API for doing both (exact styles & basic DOM) would work for most people?
I suspect this is true. We can start with something and refine it as necessary. I like @fantactuka 's ideas around the simpler API, but I think the overall approach @tylerjbainbridge is using makes sense.
+1 on the simpler APIs. Let's move things around into these new packages/functions and we can always iterate on different styling options after.
I played a little bit with the @lexical/clipboard helper functions.
Here is my solution for inserting plain HTML into the Lexical editor:
import { useEffect } from "react";
import { $createParagraphNode, $createRangeSelection, $getRoot } from "lexical";
import { useLexicalComposerContext } from "@lexical/react/LexicalComposerContext";
import { $insertDataTransferForRichText } from "@lexical/clipboard";

function SetInitialHtmlValue({ value }) {
  const [editor] = useLexicalComposerContext();
  useEffect(() => {
    editor.update(() => {
      // Fake DataTransfer object which $insertDataTransferForRichText expects from clipboard
      // https://developer.mozilla.org/en-US/docs/Web/API/DataTransfer
      const fakeDataTransfer = {
        getData(type) {
          if (type === "text/html") {
            return value;
          }
          return "";
        },
      };
      const root = $getRoot();
      // we need an empty paragraph node to insert into
      const paragraphNode = $createParagraphNode();
      root.append(paragraphNode);
      // we need a selection to point out where to insert our HTML
      const selection = $createRangeSelection();
      selection.anchor.set(paragraphNode.getKey(), 0, "element");
      selection.focus.set(paragraphNode.getKey(), 0, "element");
      $insertDataTransferForRichText(fakeDataTransfer, selection, editor);
    });
  }, []);
  return null;
}

// somewhere inside <LexicalComposer/>
<SetInitialHtmlValue value="<h1>Hello</h1>"/>
And for reading HTML I am using this construction:
editor.getRootElement().innerHTML
I played a little bit with @lexical/clipboard helper functions.
Unfortunately, the code does not work for my case (which includes an img tag). Does anyone know a hack to convert an HTML string to editor state? I'm stuck in the middle of developing a rich text editor for my app.
What's your use case for storing and initializing editor state with HTML (instead of JSON)?
The official recommendation from the Lexical Team is to persist & initialize the editor via the .toJSON() method. You can read more on that here.
Nonetheless, be assured that we're working on HTML export and import now.
I don't think you need a hack - what node are you using to represent the img tag in the EditorState?
That node should define an importDOM method that specifies how to construct the Node from the img HTML.
Look at how this works in other nodes:
static importDOM(): DOMConversionMap | null {
  return {
    ol: (node: Node) => ({
      conversion: convertListNode,
      priority: 0,
    }),
    ul: (node: Node) => ({
      conversion: convertListNode,
      priority: 0,
    }),
  };
}
You need to define a similar method that returns a converter for img:
static importDOM(): DOMConversionMap | null {
  return {
    img: (node: Node) => ({
      conversion: convertImgNode,
      priority: 0,
    }),
  };
}
We do need better documentation around this.
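Until that documentation exists, the following sketch may help as a mental model of a conversion map with priorities. It is a simplified model under stated assumptions, not Lexical's actual implementation: `SketchElement`, `SketchConversion`, `SketchConversionMap`, and `resolveConversion` are hypothetical names, and the rule assumed here is that the converter with the highest priority wins when several are registered for the same tag.

```typescript
// Hypothetical stand-ins for a DOM element and a Lexical node.
interface SketchElement {
  tagName: string;
  attrs: Record<string, string>;
}
interface SketchLexicalNode {
  type: string;
  props: Record<string, string>;
}
interface SketchConversion {
  conversion: (el: SketchElement) => SketchLexicalNode | null;
  priority: number;
}
type SketchConversionMap = Record<
  string,
  (el: SketchElement) => SketchConversion | null
>;

// Ask every registered map for a converter for this tag and keep the one
// with the highest priority (assumed tie-break: first registered wins).
function resolveConversion(
  el: SketchElement,
  maps: SketchConversionMap[],
): SketchConversion | null {
  let best: SketchConversion | null = null;
  for (const map of maps) {
    const factory = map[el.tagName.toLowerCase()];
    const candidate = factory ? factory(el) : null;
    if (candidate && (best === null || candidate.priority > best.priority)) {
      best = candidate;
    }
  }
  return best;
}

// A map like the one an image node might return from importDOM.
const imageMap: SketchConversionMap = {
  img: () => ({
    conversion: (el) => ({ type: "image", props: { src: el.attrs["src"] ?? "" } }),
    priority: 0,
  }),
};
```

If no registered node type claims a tag, `resolveConversion` returns null, which corresponds to the element having no matching node type on import.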
@tylerjbainbridge
What's your use case for storing and initializing editor state with HTML (instead of JSON)?
I am replacing the editor library from slatejs to lexical. When using slatejs, I converted content to HTML string and saved it to my database. So my database already has a content column that contains HTML strings.
Reason
So I want to store content written in Lexical to my database as HTML strings.
@acywatson Thank you for the answer. I didn't know that. I copied ImageNode from lexical-playground. Maybe I'm missing something with it.
Yea, you just need to define an importDOM method on the class to tell Lexical that you want to use this node type (ImageNode) to represent img tags pasted as HTML:
export class ImageNode extends DecoratorNode {
  ...
  static importDOM(): DOMConversionMap | null {
    return {
      img: (node: Node) => ({
        conversion: (domNode: Node) => {
          const nodeName = domNode.nodeName.toLowerCase();
          let node = null;
          if (nodeName === 'img') {
            node = $createImageNode(domNode.src, domNode.alt, {MAX_WIDTH});
          }
          return {node};
        },
        priority: 0,
      }),
    };
  }
  ...
}
I would like to second the exportHTML use-case for emails. We are also storing email templates in our db.
To complicate matters, these emails often have dynamic fields, for which we use Handlebars. The complication is that we don't want to bother our clients with Handlebars syntax, so in the editor view our customers would, for instance, add a dynamic username field (node) and see 'username' (maybe with a background color to signify it is a dynamic field), but in the final email template it should read {{ user.name }}.
It would be good if I can somehow add an exportHTML function to a custom node where I can do that transform, and let that function be called when the top-level htmlExport is called. (Sorry if I am not yet familiar with the lexical design and terminology.)
I tried to make something like this with Draftjs a while back, but had to abandon it, I couldn't get it to work. One learning from that attempt I could mention here, is that it is no problem to store the json representation along with the exported html in the db. So an importHTML is less important for my use case.
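The dynamic-field idea can be sketched independently of any editor: one node, two renderings. The shapes and names below (`TemplateInline`, `renderForEditor`, `exportForTemplate`, the `dynamic-field` class) are hypothetical and only illustrate the transform described above.

```typescript
// Hypothetical inline node shapes for an email-template editor.
type TemplateInline =
  | { type: "text"; text: string }
  | { type: "field"; label: string; path: string };

function escapeHtmlSketch(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// What the customer sees while editing: a friendly label.
function renderForEditor(nodes: TemplateInline[]): string {
  return nodes
    .map((n) =>
      n.type === "field"
        ? `<span class="dynamic-field">${escapeHtmlSketch(n.label)}</span>`
        : escapeHtmlSketch(n.text),
    )
    .join("");
}

// What is stored as the email template: the Handlebars expression.
function exportForTemplate(nodes: TemplateInline[]): string {
  return nodes
    .map((n) =>
      n.type === "field" ? `{{ ${n.path} }}` : escapeHtmlSketch(n.text),
    )
    .join("");
}

const greeting: TemplateInline[] = [
  { type: "text", text: "Hello " },
  { type: "field", label: "username", path: "user.name" },
];
```

In an editor with per-node export hooks, the second function's field branch is the part that would live on the custom node.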
Until we have an official API, this is a workaround that works for me; sharing it in case someone else finds it useful. Most importantly, it only helps if you can live with the editor class names in the generated HTML.
import React from 'react'
import { EditorState } from 'lexical';
import _ from "lodash";
import LexicalComposer from "@lexical/react/LexicalComposer";
import RichTextPlugin from "@lexical/react/LexicalRichTextPlugin";
import ContentEditable from "@lexical/react/LexicalContentEditable";
import { HistoryPlugin } from "@lexical/react/LexicalHistoryPlugin";
import AutoFocusPlugin from "@lexical/react/LexicalAutoFocusPlugin";
import { HeadingNode, QuoteNode } from "@lexical/rich-text";
import { ListItemNode, ListNode } from "@lexical/list";
import { AutoLinkNode, LinkNode } from "@lexical/link";
import LinkPlugin from "@lexical/react/LexicalLinkPlugin";
import ListPlugin from "@lexical/react/LexicalListPlugin";
import theme from 'src/Components/LexicalEditor/theme';
import TreeViewPlugin from 'src/Components/LexicalEditor/plugins/TreeViewPlugin';
import OnChangePlugin from '@lexical/react/LexicalOnChangePlugin'
import ToolbarPlugin from "./plugins/ToolbarPlugin";
import ListMaxIndentLevelPlugin from "./plugins/ListMaxIndentLevelPlugin";
import AutoLinkPlugin from "./plugins/AutoLinkPlugin";
import './styles.css'
const editorConfig = {
// The editor theme
theme,
// Handling of errors during update
onError(error) {
throw error;
},
// Any custom nodes go here
nodes: [
HeadingNode,
ListNode,
ListItemNode,
QuoteNode,
AutoLinkNode,
LinkNode
]
};
function Placeholder() {
return <div className="editor-placeholder">Enter some rich text...</div>;
}
export interface ILexicalEditorProps {
debug?: boolean;
initialReadOnly?: boolean;
initialValue: string;
onChange?: (value: string) => void;
}
interface ILexicalEditorState { }
class LexicalEditor extends React.Component<ILexicalEditorProps, ILexicalEditorState> {
editorInnerRef: React.RefObject<HTMLDivElement> = React.createRef<HTMLDivElement>();
// eslint-disable-next-line
public getHTML = (): string => {
if (this.editorInnerRef && this.editorInnerRef.current) {
const editorInner = this.editorInnerRef.current;
const editorInput = _.first(editorInner.children)
return editorInput?.innerHTML || '';
}
return ''
}
private onChangeLocal = (editorState: EditorState) => {
this.props.onChange?.(JSON.stringify(editorState.toJSON()))
}
render() {
const { props } = this;
return (
<LexicalComposer initialConfig={{ ...editorConfig, readOnly: props.initialReadOnly }}>
<div className="editor-container">
{props.initialReadOnly ? null : <ToolbarPlugin />}
<div className="editor-inner" ref={this.editorInnerRef}>
<RichTextPlugin
initialEditorState={props.initialValue}
contentEditable={<ContentEditable className="editor-input" />}
placeholder={<Placeholder />}
/>
<HistoryPlugin />
<AutoFocusPlugin />
<ListPlugin />
<LinkPlugin />
<AutoLinkPlugin />
<ListMaxIndentLevelPlugin
maxDepth={7}
/>
<OnChangePlugin
onChange={this.onChangeLocal}
/>
{props.debug && <TreeViewPlugin />}
</div>
</div>
</LexicalComposer>
);
}
}
export default LexicalEditor;
Now in the parent you pass a ref and show the HTML like this:
<div
// @ts-ignore
dangerouslySetInnerHTML={{ __html: editorRef.current?.getHTML() }}
/>
Couldn't find a way to do this with functional components. Let me know if there is a better way to do this.
This has been merged and resolved (#2246) and website documentation is coming soon in #2249! Thanks for your patience everyone who has asked for this.
Well, this is probably uncommon, but sending an HTML email is one such case.
@acywatson I have been browsing the issues related to this use case, e.g. #3042 (not 1:1) and #2452 (again a convoluted example, but the element of generateNodesFromHtml(generateHtmlFromNodes) not faithfully reproducing the state is here).
Was there ever a robust solution to this? All I want to do is input HTML, allow edits, output HTML, and send the email. I'm not sure I understand why I have to do so much leg work or monkey patching to do it. Seems you really understand this use-case so figured I would ping you directly.
I addressed this elsewhere, but fundamentally the conversion between HTML and Lexical formats is lossy. Maybe I can think of some way to make it less lossy, but usually as soon as I do something like that, it breaks someone's use case, which is why we opted for maximum flexibility here. You can make Lexical export effectively any HTML you want by configuring exportDOM on the nodes.
Once again, I will see if we can find a better way to do this.
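A toy model can show why such a round trip loses information. This is a deliberately simplified sketch with hypothetical types (`HtmlBlock`, `EditorBlock`) and made-up mapping rules, not a description of what Lexical does: many tags collapse into a few node types on import, and each node type maps back to a single tag on export.

```typescript
// Hypothetical, heavily simplified model of an HTML block and an editor node.
interface HtmlBlock {
  tag: string;
  style?: string;
  text: string;
}
interface EditorBlock {
  type: "paragraph" | "heading";
  text: string;
}

// Import: many tags collapse to a few node types; unknown styles are dropped.
function importBlock(block: HtmlBlock): EditorBlock {
  const type = block.tag === "h1" ? "heading" : "paragraph";
  return { type, text: block.text };
}

// Export: each node type maps back to exactly one tag.
function exportBlock(node: EditorBlock): HtmlBlock {
  return { tag: node.type === "heading" ? "h1" : "p", text: node.text };
}

const originalBlock: HtmlBlock = { tag: "div", style: "color: red", text: "Sale!" };
const roundTripped = exportBlock(importBlock(originalBlock));
// roundTripped keeps the text, but the div became a p and the style is gone.
```

Making the node model richer (for example, storing the style) narrows the gap for one use case but changes the output for others, which is the trade-off described above.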
Hi, this is what I do for toHTML(), and it works well for me:
import { EditorState, SerializedLexicalNode, SerializedParagraphNode, SerializedTabNode, SerializedTextNode } from "lexical";
import { SerializedImageNode } from "./nodes/ImageNode";
import { SerializedLinkNode } from "@lexical/link";
import { SerializedCodeNode } from "@lexical/code";
import { SerializedHeadingNode, SerializedQuoteNode } from "@lexical/rich-text";
import { ListType, SerializedListItemNode, SerializedListNode } from "@lexical/list";
import { SerializedTableCellNode, SerializedTableNode, SerializedTableRowNode } from "@lexical/table";
import escapeText from "@/utils/escapeText";
async function toHTML(state: EditorState): Promise<string> {
const es = state.toJSON() // _nodeMap.get('root')
// if (!root) {
// return ''
// }
const root = es.root
let html = ''
for (let node of root.children) {
html += await dumpNode(node)
}
return html
}
async function dumpNode(node: SerializedLexicalNode): Promise<string> {
// console.log('node:', node)
switch (node.type) {
case 'paragraph':
return dumpParagraph(node as SerializedParagraphNode)
case 'text':
return dumpText(node as SerializedTextNode)
case 'tab':
return dumpTab(node as SerializedTabNode)
case 'code-highlight':
return dumpCodeText(node)
case 'linebreak':
return '<br />'
case 'list':
return dumpList(node as SerializedListNode)
case 'tablecell':
return dumpTableCell(node as SerializedTableCellNode)
case 'tablerow':
return dumpTableRow(node as SerializedTableRowNode)
case 'table':
return dumpTable(node as SerializedTableNode)
case 'quote':
return dumpQuote(node as SerializedQuoteNode)
case 'heading':
// tag
return dumpHeading(node as SerializedHeadingNode)
case 'code':
return dumpCode(node as SerializedCodeNode)
case 'autolink':
case 'link':
return dumpLink(node as SerializedLinkNode)
case 'image':
return dumpImage(node as SerializedImageNode)
}
return ''
}
async function dumpTableCell(node: SerializedTableCellNode): Promise<string> {
let content = ''
for (let child of node.children) {
content += await dumpNode(child)
}
if (node.headerState === 0) {
return `<td>${content}</td>\n`
}
return `<th>${content}</th>\n`
}
async function dumpTableRow(node: SerializedTableRowNode): Promise<string> {
let content = ''
for (let child of node.children) {
content += await dumpNode(child)
}
return `<tr>${content}</tr>\n`
}
async function dumpTable(node: SerializedTableNode): Promise<string> {
let content = ''
for (let child of node.children) {
content += await dumpNode(child)
}
return `<table>${content}</table>\n`
}
async function dumpListItem(node: SerializedListItemNode, listType: ListType): Promise<string> {
let text = ''
for (let child of node.children) {
text += await dumpNode(child)
}
if(listType === 'check') {
return `<li value="${node.value}" role="checkbox" aria-checked="${!!node.checked}" ${node.checked ? 'class="checked"' : ''}>${text}</li>\n`
}
return `<li value="${node.value}">${text}</li>`
}
async function dumpList(node: SerializedListNode): Promise<string> {
let text = ''
for (let child of node.children) {
text += await dumpListItem(child as SerializedListItemNode, node.listType)
}
switch(node.listType) {
case 'bullet':
return `<ul>${text}</ul>\n`
case 'check':
return `<ul>${text}</ul>\n`
case 'number':
return `<ol>${text}</ol>\n`
}
// unreachable
throw new Error('invalid listType: ' + node.listType)
}
async function dumpQuote(node: SerializedQuoteNode): Promise<string> {
let text = ''
for (let child of node.children) {
text += await dumpNode(child)
}
return `<blockquote>${text}</blockquote>\n`
}
async function dumpHeading(node: SerializedHeadingNode): Promise<string> {
let text = ''
for (let child of node.children) {
text += await dumpNode(child)
}
return `<${node.tag}>${text}</${node.tag}>\n`
}
//
async function dumpCodeText(node: any): Promise<string> {
const text = escapeText(node.text)
return `<span${node.highlightType ? ' class="' + 'token-' + node.highlightType + '"' : ''}>${text}</span>`
}
async function dumpImage(node: SerializedImageNode): Promise<string> {
let src = node.src
if (src.startsWith('data:image')) {
// we upload image to server
const resp = await fetch('/api/upload', {
method: 'POST',
credentials: "include",
body: JSON.stringify({src: src}),
})
const data = await resp.json();
if (data.code !== 200) {
throw new Error(data.message);
}
console.log('data.src:', data.data.src)
src = data.data.src
}
return `<img src="${src}" ${node.width ? 'width="'+node.width + '"' : ''} ${node.height ? 'height="' + node.height + '"' : ''} alt="${node.altText}" max-width="${node.maxWidth}">\n`
}
async function dumpVideo(node: any) {
}
async function dumpParagraph(node: SerializedParagraphNode): Promise<string> {
let html = ''
for (let child of node.children) {
html += await dumpNode(child)
}
if (node.format === '' && node.indent === 0) {
return '<p>' + html + '</p>\n'
}
let styles = ''
if (node.indent !== 0) {
styles += 'padding-inline-start: calc(40px);'
}
if (node.format !== '') {
styles += 'text-align: ' + node.format + ';'
}
return `<p style="${styles}">${html}</p>\n`
}
async function dumpLink(node: SerializedLinkNode): Promise<string> {
let text = ''
for (let child of node.children) {
text += await dumpNode(child)
}
return `<a href="${node.url}" target="_blank">${text}</a>`
}
// type: code
async function dumpCode(node: SerializedCodeNode): Promise<string> {
let code = ''
for (let child of node.children) {
code += await dumpNode(child)
}
return `<pre ${node.language ? 'data-highlight-language="' + node.language + '"': ''}>${code}</pre>`
}
async function dumpTab(node: SerializedTabNode): Promise<string> {
return '<span> </span>'
}
async function dumpText(node: SerializedTextNode): Promise<string> {
let html = escapeText(node.text),
attrs = []
// klass = []
const format = node.format
if (format & 0x01) {
// bold
attrs.push('strong')
}
if (format & 0x02) {
// italic
attrs.push('em')
}
if (format & 0x04) {
// strikethrough
attrs.push('s')
}
if (format & 0x08) {
// underline
attrs.push('u')
}
if (format & 0x10) {
// code
attrs.push('code')
}
if (format & 0x20) {
// subscript
attrs.push('sub')
}
if (format & 0x40) {
// superscript
attrs.push('sup')
}
if (format & 0x80) {
}
if (format & 0x100) {
}
if (format & 0x200) {
}
for (let attr of attrs) {
html = '<' + attr + '>' + html + '</' + attr + '>'
}
// if (klass.length > 0) {
// html = '<span class="' + klass.join(' ') + '">' + html + '</span>'
// }
return html
}
export default toHTML
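The chain of bit tests in dumpText can also be written as a lookup table. The sketch below reuses the bit values and tags from the code above; the names `FORMAT_TAGS` and `wrapWithFormatTags` are hypothetical, and the input text is assumed to be escaped already.

```typescript
// Bit values and tags copied from dumpText; order matters because each
// matching tag wraps the result of the previous ones.
const FORMAT_TAGS: Array<[number, string]> = [
  [0x01, "strong"], // bold
  [0x02, "em"],     // italic
  [0x04, "s"],      // strikethrough
  [0x08, "u"],      // underline
  [0x10, "code"],   // code
  [0x20, "sub"],    // subscript
  [0x40, "sup"],    // superscript
];

function wrapWithFormatTags(escapedText: string, format: number): string {
  let html = escapedText;
  FORMAT_TAGS.forEach(([bit, tag]) => {
    if (format & bit) {
      html = `<${tag}>${html}</${tag}>`;
    }
  });
  return html;
}

// Bold + italic: strong is applied first, so em ends up outermost.
const boldItalic = wrapWithFormatTags("hi", 0x01 | 0x02);
```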
This is escapeText:
// code from escape-html
const matchHtmlRegExp = /["'&<>]/;

function escapeText(text: string) {
  let str = "" + text;
  let match = matchHtmlRegExp.exec(str);
  // console.log('text:', text, 'match:', match)
  if (!match) {
    return str;
  }
  let escape;
  let html = "";
  let index = 0;
  let lastIndex = 0;
  for (index = match.index; index < str.length; index++) {
    switch (str.charCodeAt(index)) {
      case 34: // "
        escape = "&quot;";
        break;
      case 38: // &
        escape = "&amp;";
        break;
      case 39: // '
        escape = "&#39;";
        break;
      case 60: // <
        escape = "&lt;";
        break;
      case 62: // >
        escape = "&gt;";
        break;
      default:
        continue;
    }
    if (lastIndex !== index) {
      html += str.substring(lastIndex, index);
    }
    lastIndex = index + 1;
    html += escape;
  }
  return lastIndex !== index ? html + str.substring(lastIndex, index) : html;
}

export default escapeText
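For comparison, the same five-character escaping can be expressed with a single regex replace. This is a compact sketch, not the escape-html implementation; `HTML_ESCAPES` and `escapeTextCompact` are names introduced here for illustration.

```typescript
// The five characters escape-html handles, mapped to their entities.
const HTML_ESCAPES: Record<string, string> = {
  '"': "&quot;",
  "&": "&amp;",
  "'": "&#39;",
  "<": "&lt;",
  ">": "&gt;",
};

function escapeTextCompact(text: string): string {
  // Replace each special character with its entity; leave the rest untouched.
  return String(text).replace(/["'&<>]/g, (ch) => HTML_ESCAPES[ch] ?? ch);
}

const escapedSample = escapeTextCompact(`<a href="x">Tom & 'Jerry'</a>`);
```

The hand-rolled loop avoids a callback per match, which is presumably why the original library uses it; for typical editor content either form should be fine.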
Hi, This is what I do toHTML(), and it works well for me:
... export default toHTML
This is really similar to what we do internally.
If Lexical added the following API for Node:
toHtml(): string
I think that would be great.
There seems to be some good demand in utilities to import and export from HTML. This comes from developers who either store it as HTML on the backend already (coming from other libraries) or want it to render.
Leaving aside that HTML is not as effective as EditorState, I think we should explore the possibilities to enable such an API. In a way we already have this in @lexical/clipboard; it's just a matter of abstracting it in an ergonomic way that other developers can use, as well as the copy-paste functionality.