backdrop / backdrop-issues

Issue tracker for Backdrop core.
144 stars 40 forks source link

Fixed: Summary text exists, but is empty when using CKeditor5. #6423

Closed leeksoup closed 6 months ago

leeksoup commented 8 months ago

Description of the bug

As I have been updating content on my Backdrop site (which I upgraded from D7) I noticed that some of the nodes' Teaser / trimmed displays went blank. See the issue here at Backdrop Forum for more: https://forum.backdropcms.org/forum/how-fix-problem-trimmed-output

I had not changed the actual content of the fields in question.

Steps To Reproduce

To reproduce the behavior:

  1. Go to 'Execute PHP'
  2. Run this code:
    // Set $item to something in excess of 600 characters, e.g.
    $item = '<p>
    Fusce vulputate eleifend sapien. Donec vitae orci sed dolor rutrum auctor. Nunc interdum lacus sit amet orci. Nullam tincidunt adipiscing enim. Nulla neque dolor, sagittis eget, iaculis quis, molestie non, velit. Sed lectus. Phasellus blandit leo ut odio. Etiam feugiat lorem non metus. Aliquam erat volutpat. Nullam tincidunt adipiscing enim. Ut a nisl id ante tempus hendrerit. Vestibulum fringilla pede sit amet augue. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Suspendisse potenti. Sed augue ipsum, egestas nec, vestibulum et, malesuada adipiscing, dui.Aenean imperdiet. Fusce convallis metus id felis luctus adipiscing. Pellentesque libero tortor, tincidunt et, tincidunt eget, semper nec, quam. Ut id nisl quis enim dignissim sagittis. Duis leo.</p>';
    // 1 is the "Filtered HTML" format on my site, settings below
    dpm(text_summary($item, 1));

Actual behavior

The summary produces nothing, or more precisely: <p>&#13;</p>

Expected behavior

It should produce something like this:

<p>&#13;
    Fusce vulputate eleifend sapien. Donec vitae orci sed dolor rutrum auctor. Nunc interdum lacus sit amet orci. Nullam tincidunt adipiscing enim. Nulla neque dolor, sagittis eget, iaculis quis, molestie non, velit. Sed lectus. Phasellus blandit leo ut odio. Etiam feugiat lorem non metus. Aliquam erat volutpat. Nullam tincidunt adipiscing enim. Ut a nisl id ante tempus hendrerit. Vestibulum fringilla pede sit amet augue. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Suspendisse potenti.</p>

Additional information

dpm on "Filtered HTML" format:

... (Object) stdClass
format (String, 1 characters ) 1
name (String, 13 characters ) Filtered HTML
cache (Boolean) TRUE
status (String, 1 characters ) 1
weight (String, 2 characters ) -1
roles (Array, 4 elements)
filters (Array, 7 elements)
    filter_html_escape (Object) stdClass
    filter_url (Object) stdClass
    filter_html (Object) stdClass
    filter_autop (Object) stdClass
        status (Integer) 0
        weight (Integer) 2
        module (String, 6 characters ) filter
        settings (Array, 0 elements)
        name (String, 12 characters ) filter_autop
    filter_image_caption (Object) stdClass
    filter_image_align (Object) stdClass
    filter_htmlcorrector (Object) stdClass
editor (String, 9 characters ) ckeditor5
editor_settings (Array, 5 elements)

I narrowed down the bug to lines 469-471 in core/modules/field/modules/text/text.module

  if (isset($format->filters['filter_autop'])) {
   $line_breaks["\n"] = 1;
 }

If I comment these out, the summary function works correctly. So I tried disabling "Convert line breaks into HTML (i.e. <br> and <p>)" in my Filtered HTML format without any change in the behavior.

I realized that one problem is that, regardless of whether 'filter_autop' is enabled, the value of $format->filters['filter_autop'] will be "set" as far as PHP is concerned. Also, $format->filters['filter_autop']->status should be checked to see that its value is non-zero (i.e. "enabled").

(A possible second issue is that when using the CKE plugin or any editor plugin that saves line breaks in the HTML, filter_autop should not be enabled since the text editor will take care of converting line breaks.)

Add any other information that could help, such as:

Edited for clarity and to mark tags as code.

leeksoup commented 8 months ago

I was able to fix the bug with the following minor code change to line 469:

  // Newline only indicates a line break if line break converter
  // filter is enabled.
  if (isset($format->filters['filter_autop']) && ($format->filters['filter_autop']->status != 0)) {
    $line_breaks["\n"] = 1;
  }
indigoxela commented 8 months ago

@leeksoup many thanks for taking the time to report :+1:

The faulty output rings a bell, though.

Can you have a look at this existing issue, please? If it turns out that the other issue covers your problem, please provide feedback there. If not, we'll keep on discussing here.

leeksoup commented 8 months ago

@indigoxela sure. :)

No, that is a separate issue. I do see some cruft from markup, but that problem wasn't my concern here.

indigoxela commented 8 months ago

I do see some cruft from markup, but that problem wasn't my concern here.

OK, not your main problem, but still something to consider (but unrelated).

I played with your sample content, with filter_autop ("Convert line breaks into HTML") on and off, and it didn't make any difference for me. But I can confirm, that the automatic summary fails, if the text to truncate is just one long item - longer than the limit, but without any paragraph or line break (<p> <br> \n).

And that summary creation fails with both display settings (summary or trimmed / only trimmed). Also regardless of the length setting. And also if all filters are off.

As soon as I stray in some '<br>' or even just an plain newline (not markup) into that long text, summary creation works. Regardless of the format. What's relevant here is the text to summarize.

indigoxela commented 8 months ago

Function text_summary() scans the text for some "breakpoint character", but seems to fail for this (and similar) text, it seems). And it has to do with the way CKEditor5 (more precisely our integration) formats markup. Because when I remove the white space at the beginning (line break and indent), summaries work again.

That's why summaries work with content edited in TinyMCE, which doesn't do that reformatting / indentation of markup.

@quicksketch last dev meeting you mentioned, that it's stunning, how few reports we have re CKE5 integration. Well, here's one now. :stuck_out_tongue_winking_eye: The CKE5 markup formatter breaks text_summary() under circumstances. Injected newlines (from formatting) can prevent proper summary creation.

leeksoup commented 8 months ago

@indigoxela - I am not sure I understand. Are you saying that with existing code, filter_autop on/off doesn't make any difference? If so, yes, I agree. The code isn't properly checking if that filter is enabled. (i.e. isset($format->filters['filter_autop'] will return true) regardless of whether the "Convert line breaks" filter is on or off. Hence my suggested code fix.

But I can confirm, that the automatic summary fails, if the text to truncate is just one long item - longer than the limit, but without any paragraph or line break (<p> <br> \n).

And that summary creation fails with both display settings (summary or trimmed / only trimmed). Also regardless of the length setting. And also if all filters are off.

OK, great.

As soon as I stray in some '<br>' or even just an plain newline (not markup), summary creation works. Regardless of the format. What's relevant here is the text to summarize.

Drupal through D7 did look for sentences as well as paragraphs and break tags as summary boundaries, if the first paragraph is too long. The Backdrop code looks like it is supposed to do the same ... cf. line 479 of text.module

$break_points[] = array('. ' => 1, '! ' => 1, '? ' => 1, '。' => 0, '؟ ' => 1);

But if the text_summary function detects (even incorrectly) that the "Convert line breaks" filter is on, then it adds \n as a key to the $break_points array with a weight/priority (?) of 1. This appears to chop off the summary at the very first \n, right after the <p> tag at the beginning. In any case, it doesn't work, which (I think) you confirmed.

leeksoup commented 8 months ago

With my code fix and with filter_autop turned off on formats that have CKE5 enabled, text_summary() does work. It should probably be documented that filter_autop and CKE5 don't play nice together.

indigoxela commented 8 months ago

filter_autop ... If so, yes, I agree. The code isn't properly checking if that filter is enabled.

No, toggling that filter on/off works just fine, and the filter's completely unrelated to the summary problem. I can reproduce the summary problem on a vanilla install of the latest dev (no D7 migration).

Re the function text_summary() in core/modules/field/modules/text/text.module: that code's more or less identical to the D7 code. It's just that D7 didn't have to deal with reformatted markup - but our CKE5 integration introduces that.

That it accidentally works for you, if you just turn off filter_autop might need further recherche. In my testing environment it didn't make any difference (latest dev, PHP 8.1). That's why I consider filter_autop as unrelated here.

My recommendation would be to always turn off filter_autop with any WYSIWYG (also url filter) - these make no sense with an editor (any editor).

leeksoup commented 8 months ago

That it accidentally works for you, if you just turn off filter_autop might need further recherche. In my testing environment it didn't make any difference (latest dev, PHP 8.1). That's why I consider filter_autop as unrelated here.

No, that isn't correct. The existing code does not work for me with the filter_autop either on or off. Edit: But that is because there is a bug in the code that doesn't properly recognize when filter_autop is disabled.

BUT if I make the code change I showed in https://github.com/backdrop/backdrop-issues/issues/6423#issuecomment-2002100421 then text_summary() works correctly as long as filter_autop is off. Edit: Which actually makes sense since this filter setting tells text_summary() to treat \n as semantically significant even though it is not semantically significant in HTML.

Sorry if I was unclear.

My recommendation would be to always turn off filter_autop with any WYSIWYG (also url filter) - these make no sense with an editor (any editor).

Agreed. But with a migrated site (and I'm betting I'm not the only one) one might not immediately think of turning off that filter when enabling an editor if it's something that's just sat there for many years. So it would be nice to document it. :)

indigoxela commented 8 months ago

BUT if I make the code change I showed in https://github.com/backdrop/backdrop-issues/issues/6423#issuecomment-2002100421 then text_summary() works correctly as long as filter_autop is off.

I stand corrected. :see_no_evil: Right, that part in text_summary() needs fixing, too. Check for status, not existence. :+1:

Are you able to provide a PR for that fix? It doesn't solve the general problem with reformatted markup, but it fixes the problem if the filter's off.

leeksoup commented 8 months ago

Are you able to provide a PR for that fix? It doesn't solve the general problem with reformatted markup, but it fixes the problem if the filter's off.

Done!

indigoxela commented 8 months ago

Done!

That's great! We're almost there...

You set the PR against your own fork, but actually you should have filed against the main repo. Can you close the wrong one, then open one against our repo, please?

leeksoup commented 8 months ago

That's great! We're almost there...

You set the PR against your own fork, but actually you should have filed against the main repo. Can you close the wrong one, then open one against our repo, please?

Oops! I think I did it correctly this time?

indigoxela commented 8 months ago

I think I did it correctly this time?

Yes, everything's perfect now. :+1:

yorkshire-pudding commented 7 months ago

Tested and works.

indigoxela commented 7 months ago

I also can confirm that this fixes the problem with zero length automatic summaries, when CKEditor 5 is used (or markup's formatted for other reasons) and the autop filter's off.

@leeksoup I've left a small suggestion (comment) to simplify the condition a bit - for your kind consideration. :wink:

indigoxela commented 7 months ago

This looks good to go now, RTBC :+1:

quicksketch commented 6 months ago

Thanks @leeksoup, @indigoxela, and @yorkshire-pudding! I merged https://github.com/backdrop/backdrop/pull/4678 into 1.x and 1.27.x. This was a sneaky issue and great work pinpointing the logic mistake in the code!