quarto-dev / quarto-cli

Open-source scientific and technical publishing system built on Pandoc.
https://quarto.org

Indented code cells in enumerated lists render with fenced div indicators that shouldn't be visible on the output #7551

Open mine-cetinkaya-rundel opened 7 months ago

mine-cetinkaya-rundel commented 7 months ago

Bug description

Including computational plot or table output in enumerated lists requires the code to be indented in the source in order to benefit from automatic numbering. However, this results in fenced div indicators appearing in the output when they shouldn't. See the screenshot below.

Screenshot 2023-11-12 at 2 01 23 PM

Steps to reproduce

````markdown
---
format: html
---

1. Item 1 text.

    ```{r}
    plot(cars)
    ```

1. Item 2 text.

    ```{r}
    cars |> head() |> knitr::kable()
    ```
````

Expected behavior

I did not expect `::: {.cell}` and `::: {.cell-output-display}` to appear in the output.

Actual behavior

No response

Your environment

Quarto check output

Quarto 1.4.489
[✓] Checking versions of quarto binary dependencies...
      Pandoc version 3.1.8: OK
      Dart Sass version 1.55.0: OK
      Deno version 1.33.4: OK
[✓] Checking versions of quarto dependencies......OK
[✓] Checking Quarto installation......OK
      Version: 1.4.489
      Path: /Applications/quarto/bin

[✓] Checking tools....................OK
      TinyTeX: v2023.08
      Chromium: (not installed)

[✓] Checking LaTeX....................OK
      Using: TinyTex
      Path: /Users/mine/Library/TinyTeX/bin/universal-darwin
      Version: 2023

[✓] Checking basic markdown render....OK

[✓] Checking Python 3 installation....OK
      Version: 3.9.6
      Path: /Library/Developer/CommandLineTools/usr/bin/python3
      Jupyter: 5.3.0
      Kernels: python3

[✓] Checking Jupyter engine render....OK

[✓] Checking R installation...........OK
      Version: 4.3.1
      Path: /Library/Frameworks/R.framework/Resources
      LibPaths:
        - /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
      knitr: 1.45
      rmarkdown: 2.25

[✓] Checking Knitr engine render......OK
mcanouil commented 7 months ago

That's because your indentation is not correct. Nested elements should be aligned with the first character of the list item's text. This is Pandoc's definition. knitr/rmarkdown, for a reason I don't know, adopted the rule that list item content should be indented with four spaces. Either of the following will work properly:

````markdown
---
format: html
---

1.  Item 1 text.

    ```{r}
    plot(cars)
    ```

1.  Item 2 text.

    ```{r}
    cars |> head() |> knitr::kable()
    ```
````

or

````markdown
---
format: html
---

1. Item 1 text.

   ```{r}
   plot(cars)
   ```

1. Item 2 text.

   ```{r}
   cars |> head() |> knitr::kable()
   ```
````

cderv commented 7 months ago

> knitr/rmarkdown, for a reason I don't know, adopted the rule that list item content should be indented with four spaces.

I'm not sure what is surprising about knitr's behavior here. The code cell is indented, and thus the output will have the same indentation.

So it will produce this Markdown:

````markdown
---
format: html
keep-md: true
---

1. Item 1 text.

    ::: {.cell}

    ```{.r .cell-code}
    plot(cars)
    ```

    ::: {.cell-output-display}
    ![](index_files/figure-html/unnamed-chunk-1-1.png){width=672}
    :::
    :::

1. Item 2 text.

    ::: {.cell}

    ```{.r .cell-code}
    cars |> head() |> knitr::kable()
    ```

    ::: {.cell-output-display}

    | speed| dist|
    |-----:|----:|
    |     4|    2|
    |     4|   10|
    |     7|    4|
    |     7|   22|
    |     8|   16|
    |     9|   10|

    :::
    :::
````



This Markdown won't be parsed as expected by Pandoc because the content under each list item is indented too much. As a result, Pandoc does not parse the fenced divs and returns them as-is in the output.

This is indeed related to the indentation of block elements under a list item. From Pandoc's Manual: https://pandoc.org/MANUAL.html#block-content-in-list-items

> A list item may contain multiple paragraphs and other block-level content. **However, subsequent paragraphs must be preceded by a blank line and indented to line up with the first non-space content after the list marker**.
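To make the alignment rule concrete, here is a minimal Python sketch. It is not part of Pandoc or Quarto, and the helper names `content_column` and `is_aligned` are hypothetical. It only computes the column where a list item's text starts and compares it with the indentation of a following block. It does not reproduce Pandoc's parser, and it assumes spaces rather than tabs.

```python
import re

# An ordered ("1.", "2)") or bullet ("-", "*", "+") marker plus the spaces after it.
LIST_MARKER = re.compile(r"^( *)(\d+[.)]|[-*+])( +)")


def content_column(item_line):
    """Return the 0-based column of the first non-space text after the list marker."""
    match = LIST_MARKER.match(item_line)
    if match is None:
        raise ValueError(f"not a list item: {item_line!r}")
    return match.end()


def is_aligned(item_line, block_line):
    """True when block_line is indented exactly to the item's content column."""
    indent = len(block_line) - len(block_line.lstrip(" "))
    return indent == content_column(item_line)


print(content_column("1. Item 1 text."))                   # 3
print(content_column("1.  Item 1 text."))                  # 4
print(is_aligned("1. Item 1 text.", "    ::: {.cell}"))    # False
print(is_aligned("1.  Item 1 text.", "    ::: {.cell}"))   # True
print(is_aligned("1. Item 1 text.", "   ::: {.cell}"))     # True
```

With `1. Item`, the content column is 3, so a block indented four spaces sits one column too deep. That is the situation in the reproduction above.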
mcanouil commented 7 months ago

@cderv knitr/rmarkdown does not require (or did not require) the content to be aligned with the first character of the item. Unless there was a change in knitr regarding lists, the code shown works as expected in rmarkdown. I got bitten by this when I first tried Quarto back in April 2022.

cderv commented 7 months ago

Oh OK, I understand. You need to compare with similar content, though. rmarkdown + knitr does not produce fenced divs around output. The fenced divs are what create the parsing problem: Pandoc does not parse them as such at this indentation. So with regular code chunks in the R Markdown world it works, because Pandoc seems to handle the content correctly when there is no fenced div (🤷‍♂️). Quarto introduces a fenced div for each computation output, so the parsing issue always happens!

Example:

````markdown
---
output: 
  html_document:
    keep_md: true
---

1. Item 1 text.

    ```{r, echo = FALSE}
    res <- knitr::knit_child(text = c(
      "::: {.div}", 
      "```{r}",
      "plot(cars)",
      "```",
      ":::"), quiet = TRUE)
    knitr::asis_output(unlist(res))
    ```

1. Item 2 text.

    ```{r}
    cars |> head() |> knitr::kable()
    ```
````

The first chunk will produce a `::: {.div}` in the output, which is not parsed:

![image](https://github.com/quarto-dev/quarto-cli/assets/6791940/bcb54185-fb5f-462a-a9ad-18fb588ceb92)

That is, unless the indentation is correct:

````markdown
1.  Item 1 text.

    ```{r, echo = FALSE}
    res <- knitr::knit_child(text = c(
      "::: {.div}", 
      "```{r}",
      "plot(cars)",
      "```",
      ":::"), quiet = TRUE)
    knitr::asis_output(unlist(res))
    ```
````

This is Pandoc behavior, as shown in this bare pandoc example:

1. The first item does not have the correct indentation, which leads to `::: {.mydiv}` not being parsed.
2. The second one has the same indentation for the list item text and the subsequent paragraph, which leads to `::: {.mydiv}` being parsed to `<div class="mydiv">`.

````powershell 
> pandoc -t html
1. following paragraph is indented 4 spaces, which is one more that this first one

    ::: {.mydiv}
    content
    :::

1.  This first line is now one space more to be align with the 4 spaces below

    ::: {.mydiv}
    content
    :::
^Z
<ol type="1">
<li><p>following paragraph is indented 4 spaces, which is one more that
this first one</p>
<p>::: {.mydiv} content :::</p></li>
<li><p>This first line is now one space more to be align with the 4
spaces below</p>
<div class="mydiv">
<p>content</p>
</div></li>
</ol>
````

I don't really know how we can work around this, other than pre-processing the file before Pandoc parses it, but modifying the user's file would probably be error-prone.

mcanouil commented 7 months ago

At the time I asked about this, JJ answered basically: sub elements have to be aligned on the first character. Note that I am not aware of markdown linters that detect this.
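As a rough illustration of what such a check could look like, here is a minimal Python sketch. It is not an existing linter. The function name `find_misaligned_blocks` is hypothetical and the heuristic is an assumption: it flags code fence and fenced div openers whose indentation differs from the content column of the most recent list item. It assumes space indentation and does not handle tabs, nested fences of different lengths, or other block types.

```python
import re

# An ordered ("1.", "2)") or bullet ("-", "*", "+") marker followed by text.
LIST_ITEM = re.compile(r"^ *(?:\d+[.)]|[-*+]) +(?=\S)")
# A code fence line (opening or closing), capturing its indentation.
CODE_FENCE = re.compile(r"^( *)(?:`{3,}|~{3,})")
# A fenced div opener: colons followed by attributes (closers have none).
DIV_OPENER = re.compile(r"^( *):{3,} *\S")


def find_misaligned_blocks(markdown):
    """Return (line_number, indent, expected_indent) for each code fence or
    fenced div opener not aligned with the preceding list item's content."""
    issues = []
    expected = None   # content column of the most recent list item, if any
    in_code = False   # True while inside a fenced code block
    for number, line in enumerate(markdown.splitlines(), start=1):
        fence = CODE_FENCE.match(line)
        if in_code:
            if fence:          # closing fence: leave the code block
                in_code = False
            continue
        item = LIST_ITEM.match(line)
        if item:
            expected = item.end()
            continue
        if line.strip() and not line.startswith(" "):
            expected = None    # unindented text ends the list
        opener = fence or DIV_OPENER.match(line)
        if fence:
            in_code = True
        if opener and expected is not None and len(opener.group(1)) != expected:
            issues.append((number, len(opener.group(1)), expected))
    return issues


TICKS = "`" * 3  # built at runtime so this example contains no literal fence
sample = "\n".join([
    "1. Item 1 text.",
    "",
    "    " + TICKS + "{r}",
    "    plot(cars)",
    "    " + TICKS,
    "",
    "1.  Item 2 text.",
    "",
    "    ::: {.mydiv}",
    "    content",
    "    :::",
])
print(find_misaligned_blocks(sample))  # [(3, 4, 3)]
```

The first item is flagged because its chunk is indented four spaces while the item text starts at column 3. The second item passes because its marker is padded so that the text starts at column 4.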

cderv commented 7 months ago

> Note that I am not aware of markdown linters that detect this.

The Visual Editor would modify the source file to align properly in this case:

````markdown
---
format: html
---

1. Item 1 text.

    ```{r}
    plot(cars)
    ```

1. Item 2 text.

    ```{r}
    cars |> head() |> knitr::kable()
    ```
````

It would add one more space before the item text, i.e. `1.  Item`.

Not a linter, but it helps with this indentation challenge.

> At the time I asked about this, JJ answered basically: sub elements have to be aligned on the first character.

Yes, that is Pandoc's rule. It just seems to be more forgiving with other types of blocks, still doing the right thing. Not sure why... 🤷‍♂️

Some Pandoc parsing tests:

````powershell
> pandoc -t html
1. code

    ```r
    plot(cars)
    ```

2. blockquotes

    > Citation

3. Fenced div

    ::: mydiv
    content
    :::
^Z
````

This renders as:

1. code, with `plot(cars)` as a code block
2. blockquotes, with "Citation" as a blockquote
3. Fenced div, with `::: mydiv content :::` as a plain paragraph

mcanouil commented 7 months ago

I think it is mostly a matter of teaching the rule rather than automatically fixing it.

mine-cetinkaya-rundel commented 7 months ago

Agreed, this is a "this used to work in rmarkdown (well, bookdown), it gives an unexpected result in Quarto" issue. The fix is clear to me now but I imagine others converting from bookdown to Quarto might stumble upon it too. Not sure what the solution for that is, other than maybe emphasizing things in docs (I can think about how to do that too).

cderv commented 7 months ago

> Not sure what the solution for that is, other than maybe emphasizing things in docs (I can think about how to do that too).

I think we need to do that, yes. Not sure what the best way is (in text, or a special callout in some places...).

Maybe this is a good topic for a blog post, e.g. "How markup matters and how the visual editor can help"? 🤔

mine-cetinkaya-rundel commented 7 months ago

Blog post sounds good to me! I'd be happy to work on that so feel free to assign this to me.

mcanouil commented 7 months ago

The following page on Markdown lists could be improved, as even the examples in it are not quite standard and it does not highlight Pandoc's nesting rule: https://quarto.org/docs/authoring/markdown-basics.html#lists