quarto-dev / quarto-cli

Open-source scientific and technical publishing system built on Pandoc.
https://quarto.org

H3 In `=html` Does Not Appear In TOC #9738

Closed jasonmm closed 4 months ago

jasonmm commented 4 months ago

Bug description

I am using Clay to create Quarto files. It produces this .qmd file, which I render from the command line with `quarto render --to html`. In the browser, the resulting HTML looks like the screenshot below, where the bottom level-3 heading does not appear in the table of contents alongside the other level-3 headings. (Screenshot: quarto-toc)

Steps to reproduce

### test.qmd

````markdown
---
format:
  html: {toc: true, toc-depth: 4, theme: cosmo, toc-expand: 4, output-file: test.html}
code-block-background: true
include-in-header: {text: '<link rel = "icon" href = "data:," />'}

---
<style></style><style>.printedClojure .sourceCode {
  background-color: transparent;
  border-style: none;
}
</style><style>.clay-limit-image-width .clay-image {max-width: 100%}
</style>
<script src="test_files/md-default0.js" type="text/javascript"></script><script src="test_files/md-default1.js" type="text/javascript"></script>

::: {.sourceClojure}
```clojure
(ns test
  (:require
    [scicloj.kindly.v4.kind :as kind]))
```
:::

# One Hash

## Two Hashes

### Three Hashes As Clj Comment

### Three Hashes In `kind/md`

```{=html}
<div><h3>An H3 In `kind/hiccup`</h3></div>
<div style="background-color:grey;height:2px;width:100%;"></div>
<div></div>
```
````

### index.qmd

```markdown
---
format:
  html: {toc: false}
---

# Notebooks
```

### _quarto.yml

```yaml
format:
  html: {toc: true, toc-depth: 4, theme: cosmo, toc-expand: 4}
  revealjs: {theme: solarized, navigation-mode: vertical, transition: slide, background-transition: fade, incremental: true}
project: {type: book}
book:
  title: Notebooks
  chapters: [index.qmd, test.qmd]
```


I place the above files in a directory and run `quarto render --to html`.

### Expected behavior

I expected all level-3 headings to appear in the table of contents.

### Actual behavior

A level-3 heading inside an `=html` block does not appear in the table of contents.

### Your environment

_No response_

### Quarto check output

```bash
Quarto 1.5.10
[✓] Checking versions of quarto binary dependencies...
      Pandoc version 3.1.11: OK
      Dart Sass version 1.69.5: OK
      Deno version 1.37.2: OK
[✓] Checking versions of quarto dependencies......OK
[✓] Checking Quarto installation......OK
      Version: 1.5.10
      Path: /Applications/quarto/bin

[✓] Checking tools....................OK
      TinyTeX: (not installed)
      Chromium: (not installed)

[✓] Checking LaTeX....................OK
      Tex:  (not detected)

[✓] Checking basic markdown render....OK

[✓] Checking Python 3 installation....OK
      Version: 3.11.6
      Path: /usr/local/opt/python@3.11/bin/python3.11
      Jupyter: (None)

      Jupyter is not available in this Python installation.
      Install with python3 -m pip install jupyter

[✓] Checking R installation...........(None)

      Unable to locate an installed version of R.
      Install R from https://cloud.r-project.org/
```

cscheid commented 4 months ago

This isn't a bug.

Quarto only considers top-level headers for the table of contents, and you have an h3 inside a div.

In addition, you should be using Markdown for headers that you expect to appear in the table of contents.

jasonmm commented 4 months ago

Thanks for the reply, I think I understand. If the H3 was outside the DIV, but still inside the {=html}, then it would appear in the table of contents?

> you should be using Markdown for headers that you expect to appear in the table of contents.

The .qmd file is generated by Clay. I do not have full control over its creation.

mcanouil commented 4 months ago

I don't think so. From what I recall, the HTML scaffolding generated from `### title` is not just `<h3>title</h3>`. You can inspect the produced HTML.

jasonmm commented 4 months ago

It looks like the HTML created by Quarto from the .qmd's `### Three Hashes As Clj Comment` is

<section id="three-hashes-as-clj-comment" class="level3" data-number="2.1.1">
<h3 data-number="2.1.1" class="anchored" data-anchor-id="three-hashes-as-clj-comment"><span class="header-section-number">2.1.1</span> Three Hashes As Clj Comment</h3>
</section>

After removing the surrounding DIV from inside the {=html} block (as suggested previously), the heading in the .qmd's {=html} block still comes out as an H3, but it sits inside the previous heading's SECTION tag. It looks like this in the HTML produced by Quarto:

<section id="three-hashes-in-kindmd" class="level3" data-number="2.1.2">
<h3 data-number="2.1.2" class="anchored" data-anchor-id="three-hashes-in-kindmd"><span class="header-section-number">2.1.2</span> Three Hashes In <code>kind/md</code></h3>
<h3 class="anchored">An H3 In `kind/hiccup`</h3>
<div style="background-color:grey;height:2px;width:100%;"></div>
<div></div>

</section>
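
The difference between the two snippets above can also be checked programmatically. As a rough sketch (not part of Quarto; `HeadingLister` and `list_headings` are hypothetical helpers written for this note), the following Python uses only the standard library's `html.parser` to list every heading in a rendered page together with its `data-anchor-id` attribute. It assumes, based solely on the output quoted in this thread, that headings generated from Markdown carry that attribute while headings passed through as raw HTML do not.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingLister(HTMLParser):
    """Collect every heading with its text and data-anchor-id (if present)."""

    def __init__(self):
        super().__init__()
        self.headings = []    # finished headings, in document order
        self._current = None  # heading being read right now, or None

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._current = {
                "tag": tag,
                "anchor": dict(attrs).get("data-anchor-id"),
                "text": "",
            }

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if self._current is not None and tag == self._current["tag"]:
            # Collapse runs of whitespace left over from nested tags.
            self._current["text"] = " ".join(self._current["text"].split())
            self.headings.append(self._current)
            self._current = None


def list_headings(html_text):
    """Return a list of {'tag', 'anchor', 'text'} dicts for all headings."""
    parser = HeadingLister()
    parser.feed(html_text)
    parser.close()
    return parser.headings


# Fragment copied from the rendered output quoted in this thread.
SAMPLE_HTML = """
<section id="three-hashes-in-kindmd" class="level3" data-number="2.1.2">
<h3 data-number="2.1.2" class="anchored" data-anchor-id="three-hashes-in-kindmd"><span class="header-section-number">2.1.2</span> Three Hashes In <code>kind/md</code></h3>
<h3 class="anchored">An H3 In `kind/hiccup`</h3>
</section>
"""

for heading in list_headings(SAMPLE_HTML):
    # Assumption from this thread only: no data-anchor-id means raw HTML.
    origin = "Markdown heading" if heading["anchor"] else "raw HTML heading"
    print(f'{heading["tag"]}: {heading["text"]!r} ({origin})')
```

Running the same function over the full `test.html` should make it easy to spot which headings were not produced from Markdown, although the attribute check is a heuristic and may not hold for other Quarto versions or formats.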

jasonmm commented 4 months ago

This leads me to believe that Quarto only creates table of contents entries from Markdown headers (`#`, `###`, etc.) in the .qmd file. Would that be accurate?

cderv commented 4 months ago

I believe this is all related to Pandoc's behavior and how it computes the TOC, as well as how Quarto calls Pandoc.

The section part you see is because Quarto calls pandoc with --section-divs (https://pandoc.org/MANUAL.html#option--section-divs)

The TOC is then generated when the `--toc` flag is set. AFAIK it is built from the Header nodes (and their ids) that Pandoc creates in the AST when it reads the Markdown. Raw HTML is not parsed into a Header node in the AST.
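
One way to see the distinction is to look at Pandoc's JSON AST, as produced by `pandoc -f markdown -t json`. The sketch below is only an illustration: `SAMPLE_AST` is a hand-written fragment in the shape of that JSON rather than captured output, and `header_nodes` and `raw_blocks` are hypothetical helper names. It shows that a Markdown heading is a `Header` node carrying a level and an id, while the `<h3>` from a `{=html}` block is just an opaque string inside a `RawBlock`.

```python
import json

# Hand-written fragment in the shape of `pandoc -f markdown -t json` output for
# a "### SubSub" heading followed by a raw {=html} block (an assumption for
# illustration, not captured from a real run).
SAMPLE_AST = json.loads(r"""
{
  "pandoc-api-version": [1, 23, 1],
  "meta": {},
  "blocks": [
    {"t": "Header",
     "c": [3, ["subsub", [], []], [{"t": "Str", "c": "SubSub"}]]},
    {"t": "RawBlock",
     "c": ["html", "<h3 id=\"hiccup\">An H3 In `kind/hiccup`</h3>"]}
  ]
}
""")


def inline_text(inlines):
    """Flatten the simple inline nodes of a heading into plain text."""
    parts = []
    for node in inlines:
        if node["t"] == "Str":
            parts.append(node["c"])
        elif node["t"] == "Space":
            parts.append(" ")
        elif node["t"] == "Code":
            parts.append(node["c"][1])  # "c" is [attributes, code text]
    return "".join(parts)


def header_nodes(ast):
    """Top-level Header nodes as (level, id, text) tuples."""
    found = []
    for block in ast["blocks"]:
        if block["t"] == "Header":
            level, attr, inlines = block["c"]
            found.append((level, attr[0], inline_text(inlines)))
    return found


def raw_blocks(ast):
    """Top-level RawBlock nodes as (format, text) tuples."""
    return [tuple(b["c"]) for b in ast["blocks"] if b["t"] == "RawBlock"]


print("Header nodes:", header_nodes(SAMPLE_AST))
print("Raw blocks:  ", raw_blocks(SAMPLE_AST))
```

To try this on a real document, the output of `quarto pandoc -f markdown -t json test.qmd` could be passed to `json.loads` in place of the sample.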

You can observe all this by calling bare pandoc directly and looking at the results:

````text
> quarto pandoc --to html --toc -s --section-divs
# Hello

## Sub

### SubSub

```{=html}
<h3 id="hiccup">An H3 In `kind/hiccup`</h3>
```
````

This will print a full HTML document to the console. The relevant part is:

- Pandoc's parser won't treat the `<h3>` in raw HTML as a header to which section divs apply. This means it is placed inside the `<section>` for SubSub:
    ```html
    <section id="subsub" class="level3">
    <h3>SubSub</h3>
    <h3 id="hiccup">An H3 In `kind/hiccup`</h3>
    </section>
    ```

Hope this helps clarify things.

> The .qmd file is generated by Clay. I do not have full control over its creation.

Quarto works on .qmd files, which follow a specific syntax based on Pandoc Markdown, and some features depend directly on the document structure Pandoc parses from that syntax. For your use case, you would need a dedicated reader that processes the Clay-specific output before passing it to the Markdown reader. Quarto does not accept custom readers, but you could probably find a way to convert your Clay-rendered document into a Pandoc Markdown document that works with Quarto. I'm not sure it would work, but it might.
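
As one possible shape for such a conversion step, here is a minimal Python sketch that rewrites a `{=html}` block starting with a heading into a Markdown heading before the file is handed to Quarto. Everything in it is an assumption made for illustration: `promote_raw_headings` is a hypothetical helper, and the regular expression only handles a heading on the first line of the block, optionally wrapped in a bare `<div>`, as in the test.qmd above. It has not been checked against other Clay output, and whether it fits into a Clay workflow is an open question.

```python
import re

FENCE = "`" * 3  # three backticks, spelled this way so the sample nests cleanly

# A {=html} fenced block whose first line is a heading, optionally wrapped in
# a bare <div>...</div>. Anything after that line is captured as "rest".
RAW_HEADING_BLOCK = re.compile(
    r"^" + FENCE + r"\{=html\}\n"
    r"(?:<div>)?<h(?P<level>[1-6])[^>\n]*>(?P<title>[^\n]*?)</h(?P=level)>(?:</div>)?\n"
    r"(?P<rest>.*?)"
    r"^" + FENCE + r"[ \t]*$",
    re.MULTILINE | re.DOTALL,
)


def promote_raw_headings(qmd_text):
    """Rewrite matching raw-HTML headings as Markdown headings."""

    def to_markdown(match):
        hashes = "#" * int(match.group("level"))
        heading = f"{hashes} {match.group('title').strip()}"
        rest = match.group("rest")
        if rest.strip():
            # Keep the remaining raw HTML in its own {=html} block.
            return f"{heading}\n\n{FENCE}{{=html}}\n{rest}{FENCE}"
        return heading

    return RAW_HEADING_BLOCK.sub(to_markdown, qmd_text)


# The tail of test.qmd from this issue, rebuilt line by line.
sample = "\n".join([
    "### Three Hashes In `kind/md`",
    "",
    FENCE + "{=html}",
    "<div><h3>An H3 In `kind/hiccup`</h3></div>",
    '<div style="background-color:grey;height:2px;width:100%;"></div>',
    "<div></div>",
    FENCE,
    "",
])

print(promote_raw_headings(sample))
```

Because the heading becomes ordinary Markdown, Pandoc should then parse it as a Header node like the other level-3 headings. A regex is a blunt tool for HTML, so anything beyond this simple pattern would likely need a real parser.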

jasonmm commented 4 months ago

All the answers have been clear. I definitely have a better understanding of how Quarto works and what it expects as input. Thank you!