wordpress-mobile / AztecEditor-Android

A reusable native Android rich text editor component.
Mozilla Public License 2.0
675 stars 112 forks source link

Gutenberg block plugin fix #666

Closed 0nko closed 6 years ago

0nko commented 6 years ago

Fixes #659.

I was curious how difficult it would be to parse GB comments as a block. I had to add a new plugin type IBlockSpanHandler because we only had one for inline spans.

There were a couple of issues along the way, like broken nesting because of extra <html> and <body> being inserted by the TagSoup parser. But it seems like this could be the right approach and the result looks promising.

gb_block

daniloercoli commented 6 years ago

I tried this PR with a simple GB paragraph, and then added a picture to it from Aztec, ending up on having 2 pictures added to the editor.

Steps to repro:

It added 2 pictures instead of one. The first picture is inside the <!-- /wp:paragraph --> GB comment. screen shot 2018-05-07 at 18 22 24

mzorz commented 6 years ago

Been able to repro the same issue as described in @daniloercoli 's comment here on different OS versions. Furthermore and to narrow down, this issue seems to happen only when you add an image at the end of the last paragraph. It doesn't fail like that when there’s more content and you try to insert at the end of the first paragraph. For example, using this code:

        var GBTEST =
                "<!-- wp:paragraph -->" +
                        "<p>This is a paragraph</p>" +
                        "<!-- /wp:paragraph -->" +
        "<!-- wp:paragraph -->" +
        "<p>This is a paragraph2</p>" +
        "<!-- /wp:paragraph -->"// +

if you place the cursor at the end of the first sentence and insert an image, checking the HTML editor shows this:

screen shot 2018-05-07 at 13 57 27

Then, inserting an image at the end of the last (second) paragraph makes it duplicate the image when converting to HTML:

screen shot 2018-05-07 at 13 59 33

Just inserting an image at the end of the last paragraph makes it duplicate.

Another thing observed is that placing the cursor there at the end is not remembered well when switching to/from HTML mode:

  1. open the app with these 2 paragraphs as shown in this comment
  2. place the cursor at the end of the first paragraph
  3. switch to HTML mode and observe the cursor is there at the end of the first paragraph
  4. switch back to visual mode, and observe the cursor place is preserved
  5. now place the cursor at the end of the second (last) sentence
  6. switch to HTML mode: observe the placement of cursor is not preserved

gutencursor

mzorz commented 6 years ago

Explanation of the double image issue

see description here above According to my tests this was happening as a span was being not set correctly because start and end spans for the GB comment and the included content need to start/end at the same nesting level. While the nesting level concept doesn't really exist in spans, it does matter that both the <!-- wp:paragraph> comment and the actual paragraph <p> are represented by IAztecBlockSpan types (namely both classes GutenbergCommentSpan and ParagraphSpan are subclasses of IAztecBlockSpan), and we have no way of knowing at any given position where 2 spans are set, which span is set first, as both are at the same position in the Spanned, but rather we have this information (nestingLevel) elsewhere. We were looking for firstOrNull and thus the start/end spans were not coming in order when inserting an image.

Fix

I managed to patch that by looking into the nesting level and only considering a match when the same nestingLevel is reached as well, but even with that patch (and unrelated to it), there are other multiple problems that the span-based approach doesn't handle well (or better put, is not designed to handle), involving the inability to pair and tie 2 opening [HTML|comment] and 2 closing [HTML|comment] tags together, which is ultimately what is causing errors when converting back to HTML and opening the content in Gutenberg. Here are a couple of examples of things that are wrong and will be quite difficult to keep up with:

Example 1

Putting the cursor at the beginning of the example text (if you switch to HTML you'll see the cursor is at the beginning of the text but after both the GB delimiter and the <p> opening tag) then inserting an image produces the following output HTML:

screen shot 2018-05-08 at 12 52 43

(the image is inserted after the starting GB delimiter, but before the opening <p> tag, thus making this paragraph block have 2 root tags which doesn't work well as of today.

Example 2

Inserting an image in the last paragraph, then switching to HTML, then deleting all traces of that image, leaving the HTML code "as it was originally". Then back to visual mode and insert the image again, produces the following output:

screen shot 2018-05-08 at 13 05 17

As you can see, the GB delimiter is now gone.

Thoughts

I think we should go with the "pre/post process and tie tags" approach Danilo made (https://github.com/wordpress-mobile/AztecEditor-Android/pull/665), here's why: handling Gutenberg comments as blocks seems like a good idea but in order to prevent breaking the blocks as per how Gutenberg handles them today we also need to tie them to the opening/closing tags, for example there can't be anything in between the closing </p></we:paragraph> tags, which is a specific case my first attempt was addressing successfully (https://github.com/wordpress-mobile/AztecEditor-Android/pull/664). By using spans only, we can't handle the possibility that the user places the cursor wherever they want and end up inserting stuff where they shouldn't.

Therefore, I think hiding the GB delimiters in the known last opening / closing HTML tag and then bloating them again once converting back from visual to html is IMO way safer.

hypest commented 6 years ago

I noticed that the double-image issue mentioned by @daniloercoli (and @mzorz ) is actually an "issue" present on develop too.

What happens is that when adding the image at text buffer end, the code appends the <img> outside the <p>, but inserts it normally inside the <p> if the insertion position is not at the end of text/buffer.

I think that we need tackle this root cause first, by making sure the <img> is always inserted inside the <p></p> since the buffer end position should be considered part of the <p></p> span.

daniloercoli commented 6 years ago

I noticed that the double-image issue mentioned by @daniloercoli (and @mzorz ) is actually an "issue" present on develop too.

That's weird, because today I wrote a test for this and there wasn't a problem running it on develop.

hypest commented 6 years ago

Opened a PR with a couple of fixes. Not sure if more issues are still existing though.

mzorz commented 6 years ago

I've been performing tests integrating with WPAndroid - we at least don’t seem to be breaking Gutenberg posts now, which is great! 🎉

Interesting findings here:

  1. there are Classic blocks appearing there where in Aztec I inserted new lines in the middle of an existing block, that was converted to <p></p> This is harmless IMO, and also something that seems to be new in Gutenberg web (didn’t observe this happening in previous versions of Gutenberg)

  2. opening a Post in Aztec and then just leaving Aztec will produce slight variations - none of them break GB per se, but produce a new version of the Post that WPAndroid then uploads.

While this is not the best UX, I think it's "compatible enough" for now. A note on this, once the post goes through Aztec once, it doesn't seem to update itself unpurposedly if you open and edit on Gutenberg web afterwards (and open back in WPAndroid again). So I think this is OK to bear with.

Some of the differences found:

screen shot 2018-05-10 at 11 15 51 screen shot 2018-05-10 at 11 17 07

That is, a space is trimmed (first case), and a special character is added in the second case. This latter case might be worth looking into and further follow up, but I don't think (as per my tests) it's going to happen on every post, I'd say it's probably not the highest priority.

Attaching the files with content before / after processing here so you can diff locally and see.

post-afterAztec.html.txt post-original.html.txt

(renamed to .txt so I could attach them here)

Looking forward to merge if you agree @daniloercoli @hypest

daniloercoli commented 6 years ago

Merging this one then :shipit:

Mario, when you have the time would you mind to open new tickets for the issues you're reported above?

mzorz commented 6 years ago
  1. opening a Post in Aztec and then just leaving Aztec will produce slight variations - none of them break GB per se, but produce a new version of the Post that WPAndroid then uploads.

Was going to track this one but @daniloercoli found the mother issue already tracked in #136 🙇