Using paragraph in tbl-cap or fig-cap cell options do not work and nothing prevents it

cderv commented 10 months ago

````markdown
---
title: "Example"
format: html
keep-md: true
---

```{r}
#| label: tbl-test
#| tbl-cap: |
#|    Big table 
#|    
#|    Really small

knitr::kable(
  mtcars[1:6, 1:3]
)
```
````


I don't think this should be supported, but right now nothing prevents it, so rendering produces a broken document.

![image](https://github.com/quarto-dev/quarto-cli/assets/6791940/5394a089-3aac-40b2-b733-778626def004)

The intermediate Markdown is:

````markdown
---
title: "Example"
format: html
keep-md: true
---

::: {#tbl-test .cell tbl-cap='Big table

Really small'}

```{.r .cell-code}
knitr::kable(
  mtcars[1:6, 1:3]
)
```

::: {.cell-output-display}

|                  |  mpg| cyl| disp|
|:-----------------|----:|---:|----:|
|Mazda RX4         | 21.0|   6|  160|
|Mazda RX4 Wag     | 21.0|   6|  160|
|Datsun 710        | 22.8|   4|  108|
|Hornet 4 Drive    | 21.4|   6|  258|
|Hornet Sportabout | 18.7|   8|  360|
|Valiant           | 18.1|   6|  225|

:::
:::
````



See how the newline in the YAML value breaks the fenced div attribute line, producing unsupported syntax.
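To make the failure mode concrete, here is a minimal Python sketch of what happens when a multi-paragraph caption is pasted verbatim into a single-quoted div attribute. The function `naive_cell_div_open` is hypothetical and only imitates the shape of the output above; it is not how knitr or Quarto actually build the div.

```python
def naive_cell_div_open(label: str, caption: str) -> str:
    """Hypothetical helper: paste the caption verbatim into the attribute."""
    return f"::: {{#{label} .cell tbl-cap='{caption}'}}"


# Roughly what a YAML `|` block scalar with a blank line yields.
caption = "Big table\n\nReally small\n"

opening = naive_cell_div_open("tbl-test", caption.strip())
print(opening)

# The opening fence of the div now spans three lines instead of one,
# so the attribute block is no longer on a single line.
print(len(opening.splitlines()))  # 3
```

The opening fence of a fenced div is expected to fit on one line, which is why the embedded blank line is a problem.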

cderv commented 10 months ago

The same happens for figures, by the way:

````markdown
---
title: "Example"
format: html
keep-md: true
---

```{r}
#| label: fig-cap-margin
#| fig-cap: |
#|   MPG vs horsepower, colored by transmission.
#|   
#|   Something else
library(ggplot2)
mtcars2 <- mtcars
mtcars2$am <- factor(
  mtcars$am, labels = c('automatic', 'manual')
)
ggplot(mtcars2, aes(hp, mpg, color = am)) +
  geom_point() +
  geom_smooth(formula = y ~ x, method = "loess") +
  theme(legend.position = 'bottom')
```
````


![image](https://github.com/quarto-dev/quarto-cli/assets/6791940/5418f14f-06da-40fe-b1d6-2b58c792792e)

````markdown
---
title: "Example"
format: html
keep-md: true
---

::: {.cell}

```{.r .cell-code}
library(ggplot2)
mtcars2 <- mtcars
mtcars2$am <- factor(
  mtcars$am, labels = c('automatic', 'manual')
)
ggplot(mtcars2, aes(hp, mpg, color = am)) +
  geom_point() +
  geom_smooth(formula = y ~ x, method = "loess") +
  theme(legend.position = 'bottom')
```

::: {.cell-output-display}
![MPG vs horsepower, colored by transmission.

Something else](index_files/figure-html/fig-cap-margin-1.png){#fig-cap-margin width=672}
:::
:::
````



I believe we should avoid producing such intermediate Markdown in both cases.

cscheid commented 10 months ago

One trick Rich and I started using is to support base64-encoded strings in computationally generated content. For example, you can include Markdown in HTML tables by using `<span data-qmd-base64="yourbase64contenthere"></span>`.

We could consider doing something similar here.

The main design complication is that I'd like to avoid adding a `tbl-cap-base64` alongside `tbl-cap` (and similarly for all other such attributes; it would be a big mess to implement).

We could, for example, support the following syntax:

```markdown
::: {#tbl-test .cell tbl-cap='__quarto__base64__contents__:yourbase64contentshere'}
```

Then, when we read those attributes in Quarto (and we already have to parse those to Markdown anyway), we can detect the `__quarto__base64__contents__:` string prefix and do the right thing.

All we need to do is come up with one such string, like `__quarto__base64__contents__`, that:

- will never be the start of a human-generated attribute
- is clear enough for readers of the intermediate Markdown to identify what's going on
- only contains safe ASCII characters (to avoid encoding issues)
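As a rough illustration of the prefix idea, the Python sketch below encodes a multi-paragraph caption into a single-line attribute value and decodes it again on read. The helper names `encode_attr` and `decode_attr` are hypothetical, and the prefix is only the candidate string proposed above; none of this is existing Quarto API.

```python
import base64

# Candidate marker from the proposal above; the final string is undecided.
PREFIX = "__quarto__base64__contents__:"


def encode_attr(value: str) -> str:
    """Hypothetical writer side: wrap a value so it fits on one line."""
    payload = base64.b64encode(value.encode("utf-8")).decode("ascii")
    return PREFIX + payload


def decode_attr(value: str) -> str:
    """Hypothetical reader side: decode only values carrying the prefix."""
    if value.startswith(PREFIX):
        return base64.b64decode(value[len(PREFIX):]).decode("utf-8")
    return value  # plain, human-written attribute: leave untouched


caption = "Big table\n\nReally small"
encoded = encode_attr(caption)

print(f"::: {{#tbl-test .cell tbl-cap='{encoded}'}}")
print(decode_attr(encoded) == caption)
```

Because the encoded payload contains no newlines or quote characters, the attribute stays on one line, and values without the prefix pass through unchanged.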

cderv commented 10 months ago

Oh, interesting!

Though at first I was thinking that we do not really support multi-paragraph captions. Do we?

How does this usually work in Markdown syntax? Only the last paragraph below a table is taken into account as the caption.

For now, `tbl-cap` ends up on the `.cell` div because knitr forwards it there without processing it.

We could just disallow newlines and paste the strings together if `tbl-cap` or `fig-cap` is seen as a string with newlines. 🤷‍♂️
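A minimal sketch of that simpler option, in Python for illustration only: collapse every run of whitespace, including blank lines, into a single space before the caption is written as an attribute. The function name `flatten_caption` is hypothetical.

```python
import re


def flatten_caption(caption: str) -> str:
    """Join whitespace runs (including blank lines) into single spaces."""
    return re.sub(r"\s+", " ", caption).strip()


# Value of the `tbl-cap` block scalar from the example above.
print(flatten_caption("Big table \n\nReally small\n"))  # Big table Really small
```

This loses the paragraph break, which is acceptable only if multi-paragraph captions are not meant to be supported.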

cscheid commented 10 months ago

> Though at first I was thinking that we do not really support multi-paragraph captions. Do we?

It's complicated (surprise!).

I don't think there's anything stopping a caption from having multiple paragraphs in a `FloatRefTarget` ("float"), but we don't have a lot of ways to end up with multiple paragraphs there. The div syntax for floats requires a caption to be a single paragraph: `src/resources/filters/quarto-pre/parsefiguredivs.lua:193`, and `refCaptionFromDiv` is:

```lua
function refCaptionFromDiv(el)
  local last = el.content[#el.content]
  if last and last.t == "Para" and #el.content > 1 then
    return last
  else
    return nil
  end
end
```
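To show what this rule implies for a div that ends in two paragraphs, here is a small Python model of the same check. The `Block` class is a hypothetical stand-in for Pandoc block elements, not part of any real API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Block:
    """Hypothetical stand-in for a Pandoc block element."""
    t: str          # block type tag, e.g. "Para" or "Table"
    text: str = ""


def ref_caption_from_div(content: List[Block]) -> Optional[Block]:
    """Mirror of the Lua check: the caption is the last block, if it is a
    Para and the div holds more than one block."""
    last = content[-1] if content else None
    if last is not None and last.t == "Para" and len(content) > 1:
        return last
    return None


div = [Block("Table"), Block("Para", "Big table"), Block("Para", "Really small")]
caption = ref_caption_from_div(div)
print(caption.text)  # Really small
```

Only the final paragraph is picked up as the caption; under this rule the preceding paragraph is treated as ordinary div content.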

But there are other ways for captions to show up. The most direct one is a Lua filter that constructs a `FloatRefTarget` explicitly. In that case, the caption can be pretty much anything, and we don't currently check the type of content (it can even be a `Blocks({...})` with two paragraphs, for example).

I wouldn't be surprised if there were other ways they can sneak in.

quarto-dev / quarto-cli

Using paragraph in tbl-cap or fig-cap cell options do not work and nothing prevents it #7656