slab / quill

Quill is a modern WYSIWYG editor built for compatibility and extensibility
https://quilljs.com
BSD 3-Clause "New" or "Revised" License

Several issues with getSemanticHTML not preserving html represented in editor #4289

Open enzedonline opened 1 month ago

enzedonline commented 1 month ago

Quill documentation describes getSemanticHTML as:

Get the HTML representation of the editor contents. This method is useful for exporting the contents of the editor in a format that can be used in other applications.

It should be a usable HTML representation of the editor contents. This is a critical requirement for using Quill as a form widget.

What actually happens is that the HTML is not preserved for syntax, video or formula blocks, or for check lists. Of those, only the syntax block is recoverable (by reapplying highlight.js on render); the information needed for video and formula blocks is lost, while check lists require some wrangling in JavaScript on render.

Syntax block:

The syntax highlighting markup is stripped out. The code block is instead just wrapped by a <pre> tag.

Editor:

<div class="ql-code-block-container" spellcheck="false">
    <select class="ql-ui" contenteditable="false">
        ....
    </select>
    <div class="ql-code-block" data-language="python"><span class="ql-token hljs-keyword">def</span> <span
            class="ql-token hljs-title">is_absolute_url</span>(<span class="ql-token hljs-params">url</span>):</div>
    <div class="ql-code-block" data-language="python"> parsed_url = urlparse(url)</div>
    <div class="ql-code-block" data-language="python"> <span class="ql-token hljs-keyword">return</span> <span
            class="ql-token hljs-built_in">bool</span>(parsed_url.scheme <span class="ql-token hljs-keyword">and</span>
        parsed_url.netloc)</div>
</div>

getSemanticHTML:

<pre data-language="python">
def is_absolute_url(url):
    parsed_url = urlparse(url)
    return bool(parsed_url.scheme and parsed_url.netloc)
</pre>

At the very least, this should be <pre><code class="language-${data-language-value}">...</code></pre>; otherwise it is just rendered as plain text with whitespace preserved. But why strip out the formatting? It means highlight.js needs to be reapplied on each render.
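Until the output changes, the wrapping described above can be approximated by post-processing the exported string. The following is a minimal sketch, assuming the semantic output has exactly the <pre data-language="..."> shape shown above; wrapCodeBlocks is a hypothetical helper, not part of Quill.

```javascript
// Hypothetical helper (not part of Quill): rewrites
//   <pre data-language="X">...</pre>
// into
//   <pre><code class="language-X">...</code></pre>
// Assumes the <pre> carries only the data-language attribute,
// as in the output shown above.
function wrapCodeBlocks(html) {
  return html.replace(
    /<pre data-language="([^"]*)">([\s\S]*?)<\/pre>/g,
    (_match, language, body) =>
      `<pre><code class="language-${language}">${body}</code></pre>`
  );
}

// Usage: pass in the string returned by getSemanticHTML().
const sample = '<pre data-language="python">\nprint("hi")\n</pre>';
console.log(wrapCodeBlocks(sample));
```

This does not restore the highlighting itself; it only gives highlight.js a language-* class to work from when it is reapplied on render.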

Video block:

Iframes inserted by the Quill video block are stripped and replaced by a hyperlink.

Editor:

<iframe class="ql-video" frameborder="0" allowfullscreen="true" class="ql-iframe-align-right"
    height="270" width="542"
    src="https://www.youtube.com/embed/2o0zV4VOQ54?showinfo=0" 
></iframe>

getSemanticHTML:

<a href="https://www.youtube.com/embed/2o0zV4VOQ54?showinfo=0" target="_blank" rel="nofollow noopener">https://www.youtube.com/embed/2o0zV4VOQ54?showinfo=0</a>

Playground example.

The iframe needs to be preserved along with all attributes.

Formula block:

The katex markup is stripped out and replaced with a plain text span:

Editor:

<p>
    <span class="ql-formula" data-value="y=x^2">
        <span contenteditable="false">
            <span class="katex">
                <span class="katex-mathml">
                    <math xmlns="http://www.w3.org/1998/Math/MathML">
                        <semantics>
                            <mrow><mi>y</mi><mo>=</mo><msup><mi>x</mi><mn>2</mn></msup></mrow>
                            <annotation encoding="application/x-tex">y=x^2</annotation>
                        </semantics>
                    </math>
                </span>
                <span class="katex-html" aria-hidden="true">
                    <span class="base">
                        <span class="strut" style="height: 0.625em; vertical-align: -0.1944em;"></span>
                        <span class="mord mathnormal" style="margin-right: 0.0359em;">y</span>
                        <span class="mspace" style="margin-right: 0.2778em;"></span><span class="mrel">=</span>
                        <span class="mspace" style="margin-right: 0.2778em;"></span>
                    </span>
                    <span class="base">
                        <span class="strut" style="height: 0.8141em;"></span>
                        <span class="mord"><span class="mord mathnormal">x</span>
                        <span class="msupsub">
                            <span class="vlist-t">
                                <span class="vlist-r">
                                    <span class="vlist" style="height: 0.8141em;">
                                        <span class="" style="top: -3.063em; margin-right: 0.05em;">
                                            <span class="pstrut" style="height: 2.7em;"></span>
                                            <span class="sizing reset-size6 size3 mtight">
                                                <span class="mord mtight">2</span>
                                            </span>
                                        </span>
                                    </span>
                                </span>
                            </span>
                        </span>
                    </span>
                </span>
            </span>
        </span>
    </span>
</span> 
</p>

getSemanticHTML:

<p>
    <span>y=x^2</span> 
</p>

The KaTeX markup should be preserved. At the very least, the output should carry some identifier that this is a Quill formula block so that KaTeX can be applied on render (though that is not a favourable solution).
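One way to get such an identifier today is to subclass the formula blot and override its html() method, which Quill calls when it builds the semantic HTML. The following is a sketch only. It assumes that the formula format is available under the path 'formats/formula' and that the blot's value() returns an object with a formula key. registerMarkedFormula and escapeHtml are hypothetical helpers, and Quill is passed in as a parameter so the snippet does not depend on how it is loaded.

```javascript
// Hypothetical helper: escape text for use in HTML content and attributes.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical helper: replaces the built-in formula blot with a subclass
// whose semantic HTML keeps the ql-formula class and the TeX source.
// Assumption: 'formats/formula' is the registered path of the formula blot.
function registerMarkedFormula(Quill) {
  const Formula = Quill.import('formats/formula');

  class MarkedFormula extends Formula {
    html() {
      // Assumption: value() returns { formula: '<TeX source>' }.
      const { formula } = this.value();
      const tex = escapeHtml(formula);
      return `<span class="ql-formula" data-value="${tex}">${tex}</span>`;
    }
  }

  // Third argument suppresses the overwrite warning.
  Quill.register('formats/formula', MarkedFormula, true);
  return MarkedFormula;
}
```

On render, each .ql-formula element could then be handed to KaTeX, for example with katex.render(el.dataset.value, el). This still means KaTeX runs on every render, so it is the fallback described above rather than a way of preserving the markup.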

Check lists:

A single check list is converted to one unordered list per list item.

Editor:

<ol>
    <li data-list="unchecked"><span class="ql-ui" contenteditable="false"></span>one</li>
    <li data-list="checked"><span class="ql-ui" contenteditable="false"></span>two</li>
    <li data-list="unchecked"><span class="ql-ui" contenteditable="false"></span>three</li>
</ol>

getSemanticHTML:

<ul><li data-list="unchecked">one</li></ul>
<ul><li data-list="checked">two</li></ul>
<ul><li data-list="unchecked">three</li></ul>

This needs to be preserved as a single unordered list.
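As a stopgap, the split lists can be merged again after export. The following is a minimal sketch, assuming the output has exactly the shape shown above (flat, single-item <ul> elements whose <li> carries data-list="checked" or data-list="unchecked"); mergeCheckLists is a hypothetical helper, not part of Quill.

```javascript
// Hypothetical helper (not part of Quill): merges runs of adjacent
// single-item check lists back into one <ul>.
// Assumes flat items, i.e. no nested lists inside a check-list <li>.
function mergeCheckLists(html) {
  // One check-list item, up to its first closing </li>.
  const item = '<li data-list="(?:un)?checked">(?:(?!</li>)[\\s\\S])*</li>';
  const single = `<ul>${item}</ul>`;
  // Two or more single-item lists separated only by whitespace.
  const run = new RegExp(`${single}(?:\\s*${single})+`, 'g');

  return html.replace(run, (match) => {
    const items = match.match(new RegExp(item, 'g'));
    return `<ul>${items.join('')}</ul>`;
  });
}

// Usage: pass in the string returned by getSemanticHTML().
const exported =
  '<ul><li data-list="unchecked">one</li></ul>' +
  '<ul><li data-list="checked">two</li></ul>';
console.log(mergeCheckLists(exported));
```

The sketch only touches check-list items, so ordinary bullet lists are left alone. It cannot tell two genuinely separate check lists that happen to be adjacent from one list that was split, so those would be merged as well.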

raffaele-clevermind commented 1 month ago

I'm having problems too: when text-align is used in the list format, the alignment is removed in the semantic version.

Editor:

<ol>
  <li data-list="bullet" style="text-align: center;"><span class="ql-ui" contenteditable="false"></span>one</li>
  <li data-list="bullet" style="text-align: center;"><span class="ql-ui" contenteditable="false"></span>two</li>
  <li data-list="bullet" style="text-align: center;"><span class="ql-ui" contenteditable="false"></span>three</li>
</ol>

getSemanticHTML:

<ul>
  <li>one</li>
  <li>two</li>
  <li>three</li>
</ul>

This means the list alignment is not maintained if the HTML is exported for use somewhere else.

There is also a partial fix open at the moment: #4273

medi6 commented 3 weeks ago

Hi, you can temporarily fix the LI display using this Less code. But you definitely lose alignment...


        padding-left: 21px;
        li {
            >ol, >ul {
                padding-left: 42px;
            }
            padding-left: 21px;
            list-style-type: none;            
            &:before {
                display: inline-block;
                margin-left: -21px;
                margin-right: 4px;
                text-align: right;
                white-space: nowrap;
                width: 17px;
                content:'\2022';
            }            
        }        
    }  
    ul {
        li {
            &:before {
                content:'\2022';
            }
        }
    }
    .ms-pub-body>ol {
        counter-reset: ol1;
        >li {
            counter-increment: ol1;
            &:before {
                content:counter(ol1, decimal) '. '
            }            
            >ol {
                counter-reset: ol2;
                >li {
                    counter-increment: ol2;
                    &:before {
                        content:counter(ol2, lower-alpha) '. ';
                        margin-right: 2px;
                        width: 19px;                        
                    }            
                    >ol {
                        counter-reset: ol3;
                        >li {
                            counter-increment: ol3;
                            &:before {
                                content:counter(ol3, lower-roman) '. ';
                                margin-right: 2px;
                                width: 19px;  
                            }    
                            >ol {
                                counter-reset: ol4;
                                >li {
                                    counter-increment: ol4;
                                    &:before {
                                        content:counter(ol4, decimal) '. '
                                    }            
                                    >ol {
                                        counter-reset: ol5;
                                        >li {
                                            counter-increment: ol5;
                                            &:before {
                                                content:counter(ol5, lower-alpha) '. '
                                            }            
                                            >ol {
                                                counter-reset: ol6;
                                                >li {
                                                    counter-increment: ol6;
                                                    &:before {
                                                        content:counter(ol6, lower-roman) '. ';
                                                        margin-right: 2px;
                                                        width: 19px;  
                                                    }      
                                                    >ol {
                                                        counter-reset: ol7;
                                                        >li {
                                                            counter-increment: ol7;
                                                            &:before {
                                                                content:counter(ol7, decimal) '. '
                                                            }    
                                                            >ol {
                                                                counter-reset: ol8;
                                                                >li {
                                                                    counter-increment: ol8;
                                                                    &:before {
                                                                        content:counter(ol8, lower-alpha) '. '
                                                                    }            
                                                                    >ol {
                                                                        counter-reset: ol9;
                                                                        >li {
                                                                            counter-increment: ol9;
                                                                            &:before {
                                                                                content:counter(ol9, lower-roman) '. ';
                                                                                margin-right: 2px;
                                                                                width: 19px;  
                                                                            }            
                                                                        }
                                                                    }                                                                        
                                                                }
                                                            }                                                                      
                                                        }
                                                    }                                                            
                                                }
                                            }                                                
                                        }
                                    }                                        
                                }
                            }                                    
                        }
                    }
                }
            }
        }
    }

markuso commented 2 weeks ago

I had the same issue with the Video block. I tracked down the cause and fixed it locally rather than waiting for Quill to change how getSemanticHTML() generates its HTML. It is worth fixing in core, for sure, but in many cases we need certain things to work differently from the default anyway.

Below is what the current Video block class looks like in the Quill package, at the file location quill/formats/video.js.

class Video extends BlockEmbed {
  static blotName = 'video';
  static className = 'ql-video';
  static tagName = 'IFRAME';
  static create(value) {
    ...
  }
  static formats(domNode) {
    ...
  }
  static sanitize(url) {
    ...
  }
  static value(domNode) {
    return domNode.getAttribute('src');
  }
  format(name, value) {
    ...
  }
  html() {
    const {
      video
    } = this.value();
    return `<a href="${video}">${video}</a>`;
  }
}

You will notice that the class above has an instance method, html(), that returns a hyperlink rather than the actual video iframe markup. When the block is converted to semantic HTML, only the video's URL is used, to build that hyperlink.

To change this, I created my own class, named VideoBlock (you can name it anything), that extends the original Video class, and used it in my setup in place of the original. Below is the simple override of the html() method; you may not need to override anything else on that class unless you want to.

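import Quill from 'quill';

// Start from the built-in video blot so everything else is inherited.
const Video = Quill.import('formats/video');

class VideoBlock extends Video {
  // Return the iframe markup as-is instead of a hyperlink.
  html() {
    return this.domNode.outerHTML;
  }
}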

The return this.domNode.outerHTML; line above is what returns the block's actual HTML intact, rather than swapping it for a basic link for some strange reason. I normally write my own blot classes, even for built-in ones, since I usually need to override something about the behavior. For example, at times I need to allow the style attribute to be used rather than stripped out.
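For the override to take effect, the subclass has to be registered in place of the built-in format before the editor is created. The following is a minimal sketch of that step, assuming the 'formats/video' path and Quill's Quill.import and Quill.register calls. registerVideoBlock is a hypothetical wrapper that takes the Quill constructor as a parameter so the snippet does not depend on how Quill is loaded.

```javascript
// Hypothetical wrapper: defines the override and registers it over
// the built-in video format.
function registerVideoBlock(Quill) {
  // Assumption: the built-in video blot lives at 'formats/video'.
  const Video = Quill.import('formats/video');

  class VideoBlock extends Video {
    // Return the iframe markup as-is instead of a hyperlink.
    html() {
      return this.domNode.outerHTML;
    }
  }

  // Third argument suppresses the overwrite warning.
  Quill.register('formats/video', VideoBlock, true);
  return VideoBlock;
}
```

Calling registerVideoBlock(Quill) once, before new Quill(...), should be enough for getSemanticHTML() to pick up the override. Note that outerHTML returns the editor's markup verbatim, including Quill-specific classes such as ql-video.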

I hope this helps someone out there. I believe that the same thing may be applied to some of the other Quill blocks mentioned in this issue by @enzedonline.