wordpress-mobile / AztecEditor-Android

A reusable native Android rich text editor component.

HTML parsing inconsistency with <div> #434

Open 0nko opened 7 years ago

0nko commented 7 years ago

Parsing the following HTML produces incorrect output:

<h3><strong>Shipped:</strong></h3><br>
<div class="entry-content">
    <br>
    <ul>
        <li style="list-style: none"><br></li>
        <li>
            <strong>Something:</strong> Text Here (<a href="https://wordpress.com">#</a>).
        </li>
        <li style="list-style: none"><br></li>
        <li>
            <strong>Title:</strong> List Item <a href="https://wordpress.com">link</a>.
        </li>
        <li style="list-style: none"><br></li>
        <li>
            <strong>Some bold text:</strong> plain here, <a href="https://wordpress.com">and a link!</a>.
        </li>
        <li style="list-style: none"><br></li>
    </ul><br>
    <h3><strong>Another list:</strong></h3><br>
    <ul>
        <li style="list-style: none"><br>
        <br></li>
        <li>
            <strong>Another Item</strong>Another <a href="https://wordpress.com">link</a>
        </li>
        <li style="list-style: none"><br></li>
    </ul><br>
</div><br>
#yolo

Expected

The same as input.

Observed

<h3><b>Shipped:
<div class="entry-content"></div>
</b></h3>
<ul>
    <li style="list-style: none"></li>
    <li>            <b>Something:</b> Text Here (<a href="https://wordpress.com">#</a>).</li>
    <li style="list-style: none"></li>
    <li>            <b>Title:</b> List Item <a href="https://wordpress.com">link</a>.</li>
    <li style="list-style: none"></li>
    <li>            <b>Some bold text:</b> plain here, <a href="https://wordpress.com">and a link!</a>.</li>
    <li style="list-style: none"></li>
</ul>
<h3><b>Another list:</b></h3>
<ul>
    <li style="list-style: none"></li>
    <li>            <b>Another Item</b>Another <a href="https://wordpress.com">link</a></li>
    <li style="list-style: none"></li>
</ul>
#yolo

Reproduced

  1. Paste the HTML above into HTML mode
  2. Switch to visual mode
  3. Switch back to HTML mode
  4. Notice that the output is not the same as the input

Tested

Emulator on Android 7.1.1 with 1.0-beta.6

maxme commented 6 years ago

We should have a unit test to reproduce that one - cc @rachelmcr

rachelmcr commented 6 years ago

I can't reproduce the output in the original report. Instead, the div tags are removed entirely, with this output:

<h3><b>Shipped:</b></h3>
<br>
<br>
<ul>
 <li style="list-style: none"></li>
 <li><b>Something:</b> Text Here (<a href="https://wordpress.com">#</a>).</li>
 <li style="list-style: none"></li>
 <li><b>Title:</b> List Item <a href="https://wordpress.com">link</a>.</li>
 <li style="list-style: none"></li>
 <li><b>Some bold text:</b> plain here, <a href="https://wordpress.com">and a link!</a>.</li>
 <li style="list-style: none"></li>
</ul>
<br>
<h3><b>Another list:</b></h3>
<br>
<ul>
 <li style="list-style: none"></li>
 <li><b>Another Item</b>Another <a href="https://wordpress.com">link</a></li>
 <li style="list-style: none"></li>
</ul>
<br>
<br>
#yolo

While testing the issue I found two div parsing issues:

1) div tags are removed (as above) when a br tag precedes them.

Input: <br><div><br></div>
Output: <br><br>

2) div tags are moved inside the list item when the div contains a list.

Input: <div><ul><li>Unordered</li></ul></div>
Output: <ul><li><div>Unordered</div></li></ul>

I'll add unit tests for both of those cases.
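For anyone who wants to see the structural change in those two cases at a glance, here is a minimal sketch in Python (not Aztec's own test code, and it does not call the Aztec parser). It only compares the element nesting of the input and output strings quoted above. The names `PathCollector` and `element_paths` are hypothetical helpers made up for this illustration.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class PathCollector(HTMLParser):
    """Records the ancestor path of every element, in document order."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open elements
        self.paths = []   # one "a/b/c" path per element encountered

    def handle_starttag(self, tag, attrs):
        self.paths.append("/".join(self.stack + [tag]))
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; ignore stray closing tags.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


def element_paths(html):
    """Return the nesting path of each element in `html` (hypothetical helper)."""
    collector = PathCollector()
    collector.feed(html)
    collector.close()
    return collector.paths


# Input/output pairs copied from the two cases described above.
cases = [
    ("<br><div><br></div>", "<br><br>"),
    ("<div><ul><li>Unordered</li></ul></div>",
     "<ul><li><div>Unordered</div></li></ul>"),
]

for source, observed in cases:
    print(element_paths(source), "->", element_paths(observed))
```

In the first case the `div` path disappears entirely. In the second the same three elements survive, but `div` moves from the outermost position to the innermost. A real unit test would presumably feed the input through the editor's parser and assert that the paths are unchanged.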