Figure element breaks the paragraphs structure (empty artifact)

leancept commented 11 months ago

Storyblok's default rich-text rendering renders image captions as a title attribute, so I used your project to define my own image component using figure and figcaption, like so:

ImageWithCaption.astro

---
interface Props {
  src: string;
  alt?: string;
  title?: string;
}

const { src, alt, title } = Astro.props;
---
<p>
  <figure>
    <img src={src} alt={alt} loading="lazy" />
    { title && <figcaption>{title}</figcaption> }
  </figure>
</p>

and

article.astro

    <RichTextRenderer
      content={body}
      schema={{
        nodes: {
          image: ({ attrs }) => ({
            component: ImageWithCaption,
            props: attrs,
          })
        }
      }}
      {...storyblokEditable(blok)} />

Problem is, when the rich text content is rendered, the node that follows an image node doesn't get rendered, instead the text content is output as-is. This is typically a paragraph, so it can be worked around with CSS by giving it as the same appearance as a p tag:

If you add an empty paragraph under the image in Storyblok's rich-text editor, the text is rendered as a paragraph using a p tag – which is what's expected:

Not sure if this is a bug in this project or if my code is at fault.

leancept commented 11 months ago

This is the doc array for one of the nodes that doesn't render:

[
  {
    type: 'image',
    attrs: {
      id: 12166147,
      alt: 'Shipyard',
      src: 'https://a.storyblok.com/f/246957/1024x682/abv123/7-pic.jpg',
      title: 'lorem ipsum dolor site amet.',
      source: '',
      copyright: '',
      meta_data: {}
    }
  },
  {
    text: 'lorem ipsum dolor site amet',
    type: 'text'
  },
  { text: 'italicized word', type: 'text', marks: [ [Object] ] },
  { text: '.', type: 'text' }
]

As compared to one that renders as expected:

[
  {
    type: 'image',
    attrs: {
      id: 12166139,
      alt: 'Description',
      src: 'https://a.storyblok.com/f/246957/1024x704/943bc1f8e9/signs.jpg',
      title: 'This is the caption.',
      source: '',
      copyright: '',
      meta_data: {}
    }
  }
]
[
  {
    text: 'This is some paragraph text',
    type: 'text'
  }
]

So this offers a clue sense since the element where the text is rendered without a tag consists of two nodes: image and text. Only the image node renders correctly. The text doesn't render.

edvinasjurele commented 11 months ago

Hi @leancept, thank you for using the package, let's break it down where is the issue.

I have tried to do reproducible https://stackblitz.com/edit/github-w6ndeh?file=src%2Fpages%2Findex.astro,src%2Fstoryblok%2FRichText.astro,src%2Fcomponents%2FImageWithCaption.astro,src%2Fstoryblok%2FButton.astro

I have tried both ways - with package and with no package. Both times I got same additional  artifact.

So I went into other tool just to render this HTML:

<p>
      <figure>
        <img
          src="https://dummyimage.com/300x200/eee/aaa"
          alt="Shipyard"
          loading="lazy"
        >
        <figcaption>lorem ipsum dolor site amet.</figcaption>
      </figure>
 </p>

see https://jsfiddle.net/edvinasjurele/ztu617xv/ ❌ , and I can reproduce same issue outside of storyblok-rich-text-astro-renderer package:

This implies that having p > figure is not a good idea in general.

Conclusions

I am certain that the browser itself do not like p > figure structure, thus it tries to resolve it as it can and produces that extra artifact as a side effect.

Cure (partial) from consumer side

Looking into your ImageWithCaption.astro solution, it seems that  is redundant there, as you should actually think about image itself and not about is it in paragraph or not, because the package handles that for you.

Change ImageWithCaption.astro implementation to

---
interface Props {
  src: string;
  alt?: string;
  title?: string;
}

const { src, alt, title } = Astro.props;
---

<figure>
  <img src={src} alt={alt} loading="lazy" />
  { title && <figcaption>{title}</figcaption> }
</figure>

and that should improve things from consumer side a little bit. But it does solve the issue entirely.

Bug on the library side

I have tried to do the minimal repro: https://stackblitz.com/edit/github-rh6lpz-tueqzc?file=src%2Fcomponents%2FPictureWithCaption.astro,src%2Fstoryblok%2FRichText.astro and it indeed has same issue ⚠️ ❌

Screenshot 2023-10-31 at 14 45 52

This is something to address. The question is how?

edvinasjurele commented 11 months ago

I am highly doubtful if the package should handle the figure inside paragraphs situation, because essentially rich text is a bunch of headings/paragraphs and all elements inside it meant to be inline elements, meaning using block elements, like div or figure are semantically incorrect:

The problem you bump into is the same as p > div issue. You have to match the semantics correctly in HTML file.

Here is what I would suggest:

Leave rich text built-in image node as simple inline img, because p > img situation is fine.

However, in case that does not work for you and you still need figure in your HTML, then you should do one of the following:

Change text node resolver to not be p element (and leave image implemented with caption), meaning that would f.e. turn p > figure cases into f.e. div > figure or span > figure. But this ruins the paragraphs, thus might not be the best idea to go with
[highly recommend] Use custom blocks for HTML block elements. Every inline block is rendered via StoryblokComponent meaning it is a block rendered in isolation and is not considered to be inside the paragraphs even though it is include in the same Storyblok rich text field. You can create imageWithCaption block in the block library and whenever you want to have the figure, just use the inline block as follows:

Let me know if that works for you @leancept ? Other than that, I would be up to close this issue 🙏

leancept commented 11 months ago

Thank you for a very detailed response and debugging.

I put the p tag in the component for debugging. It's not related to the issue. Block-level inside block-level doesn't seem to be the issue. In fact, it doesn't matter what the component contains; the issue is still there.

ImageWithCaption.astro

---
interface Props {
  src: string;
  alt?: string;
  title?: string;
}

const { src, alt, title } = Astro.props;
---
<em>{src}</em>

Renders as:

It should render as

<p>
  <em>...</em>
</p>
<p>
  lorem ipsum...
</p>

I haven't had a chance to dig deeper since I posted this issue. The renderer parses the content array recursively. But for some reason, when nodes are nested and one is a custom component, rendering stops. That's as far as I've come, anyway.

leancept commented 11 months ago

As an addition, noted em is an inline element. But you'd get the same with a block-level element such as p:

edvinasjurele commented 11 months ago

p > p is also a problem (you cannot nest p elements), so you may not use p inside your components implementations at all, because package renders all stuff to be in paragraphs always by default.

However I do not see an issue with this structure:

you can always add a class to change behaviour of your element: <img class="inline-block"> or similar tactics to get the visuals you need.

leancept commented 11 months ago

The structure is correct HTML, true. However, I made it to show that the rendering issue isn't dependent on the markup of the Astro component I made.

The problem with having unwrapped text following every image is that it's not styled.

The output I need is for text following an image is wrapped in a p.

So instead of what I get now:

<p>...</p>
<figure>...</figure>
Text
<p>...</p>

I'd like:

<p>...</p>
<figure>...</figure>
<p>Text</p>
<p>...</p>

This issue seems triggered when a text node follows an image node in the SB editor in my setup where I use a custom Astro component for images.

When text follows an image in the SB editor, it groups them in the doc array.

Simply pressing enter to insert an empty paragraph after every image in the SB editor fixes it.

leancept commented 11 months ago

This is what's going on here: https://html.spec.whatwg.org/multipage/grouping-content.html#the-p-element

Tag omission in text/html: A p element's end tag can be omitted if the p element is immediately followed by an address, article, aside, blockquote, details, div, dl, fieldset, figcaption, figure...

There are three solutions:

Transform the doc array so that all images, resulting in figure, are at the root level
Wrap the fieldset in a tag that isn't considered an implicit  – not pretty, but it's an option.
Do what you proposed and use Storyblok bloks and matching Astro components.

I want to avoid SB-specific solutions as much as possible to ease a future content migration so number 3 isn't ideal.

Anyway, thank you for the help debugging. I'll figure out a solution. Shame it's so difficult to render Storyblok content as accessible HTML5. The SB RTE should support figure as a block level element, just like how it supports paragraphs and blockquotes.

NordSecurity / storyblok-rich-text-astro-renderer

Figure element breaks the paragraphs structure (empty <p></p> artifact) #83

Conclusions

Cure (partial) from consumer side

Bug on the library side

Here is what I would suggest: