quarto-dev / quarto-cli

Open-source scientific and technical publishing system built on Pandoc.
https://quarto.org

Adding label to callout in filter fails to generate correct xref #11362

Open matthew-brett opened 5 days ago

matthew-brett commented 5 days ago

Bug description

When I add an xref label in a filter, Quarto detects it correctly for sections but not for callouts.

See: https://github.com/matthew-brett/filter-notes-no-xrefs

In this repository, I am trying, through filtering, to generate the same output as I would get from the following direct.qmd input (also in the repository):

---
title: Direct version of page with note
---

## My direct header {#sec-my-direct-section}

See @sec-my-direct-section

::: {#nte-my-direct-note .callout-note}
## My direct heading
:::

See @nte-my-direct-note.

I am manipulating the following input .qmd document using a couple of trivial (in my case) Panflute Pandoc JSON filters:

---
title: Filtered version of page with note
---

## My header mark-for-section-target

See @sec-my-filtered-section

::: {.callout-note}
## My filtered heading
:::

See @nte-my-filtered-note.

My filters replace the AST for ## My header mark-for-section-target with ## My filtered header {#sec-my-filtered-section} - and this works correctly to generate the section xref target. But my filters also replace the AST for the callout note with:

::: {#nte-my-filtered-note .callout-note}
## My filtered heading
:::

and this fails - both in generating correct HTML output for the callout note, and in generating the xref for the callout note. See below for the steps to reproduce, and more detail.
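For illustration, here is a minimal sketch of what such a callout filter might do. It is written against the raw Pandoc JSON AST using only the standard library, not Panflute, and it is not the code from the repository. The names label_callouts and main, and the default identifier, are assumptions made for the example. It gives the same identifier to every unlabelled callout note, so it only suits the single-callout document above.

```python
import json
import sys


def label_callouts(node, ident="nte-my-filtered-note"):
    """Walk a Pandoc JSON AST fragment and set `ident` on callout-note Divs.

    A Div is encoded as {"t": "Div", "c": [[id, classes, keyvals], blocks]}.
    Only Divs that carry the callout-note class and have an empty id change.
    """
    if isinstance(node, list):
        for item in node:
            label_callouts(item, ident)
    elif isinstance(node, dict):
        if node.get("t") == "Div":
            attr = node["c"][0]
            if "callout-note" in attr[1] and not attr[0]:
                attr[0] = ident
        # Recurse into nested content, e.g. Divs inside Divs.
        for value in node.values():
            label_callouts(value, ident)
    return node


def main():
    # Pandoc JSON filters read the document from stdin and write it to stdout.
    doc = json.load(sys.stdin)
    label_callouts(doc["blocks"])
    json.dump(doc, sys.stdout)
```

Calling main() at the end of the script would make this usable as a JSON filter. The point of the sketch is only that the filter output is an ordinary Div with an identifier and the callout-note class, the same shape as the direct version.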

Steps to reproduce

See: https://github.com/matthew-brett/filter-notes-no-xrefs.

I believe my filters are correct, and I have explicitly specified these filters should run before quarto, although I believe this is the default. The filters are add_section_target.py and add_callout_target.py in the repository.

To reproduce, clone the repository. Running quarto render direct.qmd generates the expected HTML output without warnings. Running quarto render filtered.qmd gives this warning:

WARNING (/Users/mb312/dev_trees/quarto-cli/src/resources/filters/./crossref/refs.lua:127) Unable to resolve crossref @nte-my-filtered-note

This shows that the callout note filter output was not interpreted correctly, as the output HTML confirms. Running quarto render direct.qmd --to markdown generates this (as expected):

---
title: Direct version of page with note
toc-title: Table of contents
---

## My direct header {#sec-my-direct-section}

See [Section 1](#sec-my-direct-section){.quarto-xref}

::: {#nte-my-direct-note}
> **Note 1: My direct heading**
:::

See [Note 1](#nte-my-direct-note){.quarto-xref}.

However, quarto render filtered.qmd --to markdown generates this:

---
title: Filtered version of page with note
toc-title: Table of contents
---

## My filtered header {#sec-my-filtered-section}

See [Section 1](#sec-my-filtered-section){.quarto-xref}

<div>

> **My filtered heading**

</div>

See **?@nte-my-filtered-note**.

Expected behavior

See above. The filtered output should be equivalent to the direct output shown above, and adding the xref label to the callout via the filter should generate the xref target, as it does for the direct version.

Actual behavior

See above.

Your environment

Quarto check output

```
Quarto 99.9.9
[✓] Checking environment information...
      Quarto cache location: /Users/mb312/Library/Caches/quarto
[✓] Checking versions of quarto binary dependencies...
      Pandoc version 3.4.0: OK
      Dart Sass version 1.70.0: OK
      Deno version 1.46.3: OK
      Typst version 0.11.0: OK
[✓] Checking versions of quarto dependencies......OK
[✓] Checking Quarto installation......OK
      Version: 99.9.9
      commit: adc7186f0ee20ed125588cd603330beb2e76bc71
      Path: /Users/mb312/dev_trees/quarto-cli/package/dist/bin
[✓] Checking tools....................OK
      TinyTeX: (not installed)
      Chromium: (not installed)
[✓] Checking LaTeX....................OK
      Using: Installation From Path
      Path: /Library/TeX/texbin
      Version: 2024
[✓] Checking basic markdown render....OK
[✓] Checking Python 3 installation....OK
      Version: 3.10.14
      Path: /Users/mb312/.virtualenvs/resampling-with/bin/python3
      Jupyter: 5.3.0
      Kernels: python3, ir, xibabel, exercises
[✓] Checking Jupyter engine render....OK
[✓] Checking R installation...........OK
      Version: 4.4.1
      Path: /opt/homebrew/Cellar/r/4.4.1/lib/R
      LibPaths:
        - /Users/mb312/Library/R/arm64/4.4/library
        - /opt/homebrew/lib/R/4.4/site-library
        - /opt/homebrew/Cellar/r/4.4.1/lib/R/library
      knitr: 1.47
      rmarkdown: 2.28
[✓] Checking Knitr engine render......OK
```

matthew-brett commented 4 days ago

Yes, it looks like the callout blocks have been comprehensively pre-processed before they reach the filters. For example, even if I add a class to a callout note in source, it is stripped before it reaches the filter. That is:

::: {.callout-note .my-class}
:::

becomes:

::::: {__quarto_custom="true" __quarto_custom_type="Callout" __quarto_custom_context="Block" __quarto_custom_id="1"}
::: {__quarto_custom_scaffold="true"}
:::

::: {__quarto_custom_scaffold="true"}
:::
:::::

by the time the filter gets it.

Is there a way to filter before Quarto does this kind of preprocessing?
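As a rough sketch, a JSON filter that runs after this preprocessing could at least recognise the wrapper Divs by the attributes shown above. The helper name quarto_custom_type is hypothetical, and the check relies only on the __quarto_custom and __quarto_custom_type key-value pairs visible in that output. It is not a documented Quarto API.

```python
def quarto_custom_type(node):
    """Return the Quarto custom node type for a wrapper Div, else None.

    `node` is one element of a Pandoc JSON AST. A Div is encoded as
    {"t": "Div", "c": [[id, classes, keyvals], blocks]}, where keyvals is a
    list of [key, value] pairs.
    """
    if not isinstance(node, dict) or node.get("t") != "Div":
        return None
    keyvals = dict(node["c"][0][2])
    if keyvals.get("__quarto_custom") == "true":
        return keyvals.get("__quarto_custom_type")
    return None


# Example: the outer Div from the output above, with its contents omitted.
wrapper = {
    "t": "Div",
    "c": [
        ["", [], [
            ["__quarto_custom", "true"],
            ["__quarto_custom_type", "Callout"],
            ["__quarto_custom_context", "Block"],
            ["__quarto_custom_id", "1"],
        ]],
        [],
    ],
}
print(quarto_custom_type(wrapper))
```

This only identifies the node. It does not recover the original classes, which are no longer present on the Div at this stage.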

mcanouil commented 4 days ago

Yes, there is a way to change the order:

(It's a bit hidden for now, but this documentation gap is tracked in another issue.)

matthew-brett commented 4 days ago

Aha - yes - thanks - specifying pre-ast does cause the filters to run on the unprocessed AST.

But I noticed that, after its own processing, Quarto discards any extra classes or attributes on the callout div. For this input:

::: {.callout-note .nb-end foo="bar"}
## Another callout
:::

I get the Markdown equivalent of this output when running in a post-quarto filter:

::::: {__quarto_custom="true" __quarto_custom_type="Callout" __quarto_custom_context="Block" __quarto_custom_id="2"}
::: {__quarto_custom_scaffold="true"}
Another callout
:::

::: {__quarto_custom_scaffold="true"}
:::
:::::

This is an issue because I want to mark some callout blocks with extra classes, so I can use them in a post-quarto filter. Is there any way of preserving these classes and / or attributes?

Just FYI, although the documentation above says that any extension other than .lua implies a JSON filter, in fact I had to specify type: json for my .py filter, as in:

filters:
  - at: pre-ast
    path: add_section_target.py
    type: json
  - at: pre-ast
    path: add_callout_target.py
    type: json

Without the explicit type: json I get:

ERROR (/Users/mb312/dev_trees/quarto-cli/src/resources/filters/./common/wrapped-filter.lua:200) add_section_target.py:3: syntax error near 'panflute'

cscheid commented 3 days ago

I have an in-progress PR open that will address this by making it possible to handle custom nodes in JSON filters (https://github.com/quarto-dev/quarto-cli/pull/11241), but the changes were too big to land in 1.6.