quarto-dev / quarto-cli

Open-source scientific and technical publishing system built on Pandoc.
https://quarto.org

JSON filters broken for .Rmd files, after `quarto` #7215

Closed matthew-brett closed 1 year ago

matthew-brett commented 1 year ago

Bug description

I am using Panflute JSON filters for my project.

Installing the latest quarto-cli from Git broke my build. Quarto raises an error of the form:

Could not run /Users/mb312/dev_trees/test-quarto-2/a_filter as a JSON filter.
Please make sure the file exists and is executable.

After some tedious trial and error, I have tracked this down to the following combination of circumstances.

The error appears to require all three circumstances.

The build works correctly with quarto-cli version 1.4.51, but fails for current main branch and latest release 1.4.412.

Steps to reproduce

git clone https://github.com/matthew-brett/test-quarto-2
cd test-quarto-2
quarto render

Expected behavior

Book builds without error.

Actual behavior

I get the following output:

$ quarto render
[1/2] index.qmd

processing file: index.qmd
1/2 [unnamed-chunk-1]
2/2                  
output file: index.knit.md

[2/2] intro.Rmd

processing file: intro.Rmd
1/3                  
2/3 [unnamed-chunk-1]
3/3                  
output file: intro.knit.md

Traceback (most recent call last):
  File "/Users/mb312/dev_trees/test-quarto-2/a_filter", line 19, in <module>
    main()
  File "/Users/mb312/dev_trees/test-quarto-2/a_filter", line 13, in main
    return run_filter(action,
  File "/Users/mb312/dev_trees/panflute/panflute/io.py", line 227, in run_filter
    return run_filters([action], *args, **kwargs)
  File "/Users/mb312/dev_trees/panflute/panflute/io.py", line 200, in run_filters
    doc = load(input_stream=input_stream)
  File "/Users/mb312/dev_trees/panflute/panflute/io.py", line 58, in load
    doc = json.load(input_stream, object_hook=from_json)
  File "/opt/homebrew/Cellar/python@3.10/3.10.13/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/__init__.py", line 293, in load
    return loads(fp.read(),
  File "/opt/homebrew/Cellar/python@3.10/3.10.13/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/__init__.py", line 359, in loads
    return cls(**kw).decode(s)
  File "/opt/homebrew/Cellar/python@3.10/3.10.13/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/opt/homebrew/Cellar/python@3.10/3.10.13/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
  File "/Users/mb312/dev_trees/panflute/panflute/elements.py", line 1431, in from_json
    return _res_func[tag](c)
  File "/Users/mb312/dev_trees/panflute/panflute/elements.py", line 1378, in <lambda>
    'RawInline': lambda c: RawInline(
  File "/Users/mb312/dev_trees/panflute/panflute/elements.py", line 787, in __init__
    self.format = check_group(format, RAW_FORMATS)
  File "/Users/mb312/dev_trees/panflute/panflute/utils.py", line 78, in check_group
    raise TypeError(msg)
TypeError: element str not in group {'docx', 'tikiwiki', 'jats', 'ipynb', 'muse', 'markdown_strict', 'rst', 'org', 'man', 'icml', 'odt', 'vimwiki', 'markdown_github', 'html', 'gfm', 'dokuwiki', 'rtf', 'json', 'native', 'textile', 'noteref', 't2t', 'commonmark', 'opml', 'creole', 'fb2', 'opendocument', 'epub', 'markdown_phpextra', 'mediawiki', 'docbook', 'haddock', 'twiki', 'latex', 'openxml', 'context', 'markdown', 'markdown_mmd', 'tex'}
FATAL (/Users/mb312/dev_trees/quarto-cli/src/resources/filters/./common/wrapped-filter.lua:129) An error occurred:
Could not run /Users/mb312/dev_trees/test-quarto-2/a_filter as a JSON filter.
Please make sure the file exists and is executable.

Did you intend 'a_filter' as a Lua filter in an extension?
If so, make sure you've spelled the name of the extension correctly.

The original Pandoc error follows below.
Error running filter /Users/mb312/dev_trees/test-quarto-2/a_filter:
Filter returned error status 1
Error running filter /Users/mb312/dev_trees/quarto-cli/src/resources/filters/main.lua:
..._trees/quarto-cli/src/resources/filters/./common/log.lua:30: attempt to call a nil value (global 'crash_with_stack_trace')
stack traceback:
    ...rees/quarto-cli/src/resources/filters/./common/error.lua:14: in function 'fail'
    ...to-cli/src/resources/filters/./common/wrapped-filter.lua:129: in field 'Pandoc'
    ...s/quarto-cli/src/resources/filters/./ast/customnodes.lua:72: in function 'run_emulated_filter'
    .../quarto-cli/src/resources/filters/./ast/runemulation.lua:40: in local 'callback'
    .../quarto-cli/src/resources/filters/./ast/runemulation.lua:54: in upvalue 'run_emulated_filter_chain'
    .../quarto-cli/src/resources/filters/./ast/runemulation.lua:89: in function <.../quarto-cli/src/resources/filters/./ast/runemulation.lua:86>
ERROR: Error
    at renderFiles (file:///Users/mb312/dev_trees/quarto-cli/src/command/render/render-files.ts:334:23)
    at eventLoopTick (ext:core/01_core.js:181:11)
    at async renderProject (file:///Users/mb312/dev_trees/quarto-cli/src/command/render/project.ts:266:23)
    at async Command.fn (file:///Users/mb312/dev_trees/quarto-cli/src/command/render/cmd.ts:212:26)
    at async Command.execute (file:///Users/mb312/dev_trees/quarto-cli/src/vendor/deno.land/x/cliffy@v0.25.4/command/command.ts:1790:7)
    at async quarto (file:///Users/mb312/dev_trees/quarto-cli/src/quarto.ts:135:3)
    at async file:///Users/mb312/dev_trees/quarto-cli/src/quarto.ts:167:5

Your environment

macOS Sonoma 14.0 on M2.

Quarto check output

$ quarto check
Quarto 99.9.9

[✓] Checking versions of quarto binary dependencies...
      Pandoc version 3.1.8: OK
      Dart Sass version 1.55.0: OK
      Deno version 1.33.4: OK

[✓] Checking versions of quarto dependencies......OK

[✓] Checking Quarto installation......OK
      Version: 99.9.9
      Path: /Users/mb312/dev_trees/quarto-cli/package/dist/bin

[✓] Checking tools....................OK
      TinyTeX: (not installed)
      Chromium: (not installed)

[✓] Checking LaTeX....................OK
      Using: Installation From Path
      Path: /Library/TeX/texbin
      Version: 2022

[✓] Checking basic markdown render....OK

[✓] Checking Python 3 installation....OK
      Version: 3.10.13
      Path: /Users/mb312/.virtualenvs/resampling-with/bin/python3
      Jupyter: 5.3.0
      Kernels: python3, ir

[✓] Checking Jupyter engine render....OK

[✓] Checking R installation...........OK
      Version: 4.3.1
      Path: /opt/homebrew/Cellar/r/4.3.1/lib/R
      LibPaths:
        - /Users/mb312/Library/R/arm64/4.3/library
        - /opt/homebrew/lib/R/4.3/site-library
        - /opt/homebrew/Cellar/r/4.3.1/lib/R/library
      knitr: 1.44
      rmarkdown: 2.24

[✓] Checking Knitr engine render......OK

cscheid commented 1 year ago

Replacing your a_filter filter with a pass-through filter makes this render succeed:

#!/usr/bin/env python3
# Pass-through JSON filter: copy Pandoc's JSON AST from stdin to stdout unchanged.
import sys
s = sys.stdin.read()
sys.stdout.write(s)

As you can see from the error message:

...
  File "/Users/mb312/dev_trees/panflute/panflute/elements.py", line 1378, in <lambda>
    'RawInline': lambda c: RawInline(
  File "/Users/mb312/dev_trees/panflute/panflute/elements.py", line 787, in __init__
    self.format = check_group(format, RAW_FORMATS)
  File "/Users/mb312/dev_trees/panflute/panflute/utils.py", line 78, in check_group
    raise TypeError(msg)
TypeError: element str not in group {'docx', 'tikiwiki', 'jats', 'ipynb', 'muse', 'markdown_strict', 'rst', 'org', 'man', 'icml', 'odt', 'vimwiki', 'markdown_github', 'html', 'gfm', 'dokuwiki', 'rtf', 'json', 'native', 'textile', 'noteref', 't2t', 'commonmark', 'opml', 'creole', 'fb2', 'opendocument', 'epub', 'markdown_phpextra', 'mediawiki', 'docbook', 'haddock', 'twiki', 'latex', 'openxml', 'context', 'markdown', 'markdown_mmd', 'tex'}
FATAL (/Users/mb312/dev_trees/quarto-cli/src/resources/filters/./common/wrapped-filter.lua:129) An error occurred:
Could not run /Users/mb312/dev_trees/test-quarto-2/a_filter as a JSON filter.

The error is happening inside Panflute. This is a Panflute issue: it checks RawInline elements against a fixed set of valid formats, but Pandoc is emitting a format that is not in that set.

Specifically, Panflute is not expecting to see a typst raw block. I believe any RawBlock format string is valid, and that Panflute should not be checking for a set of strings. The RawBlock behavior in Pandoc is writer-specific, but most writers ignore RawBlocks of different formats.

This points to something goofy happening in Quarto: the typst render fixup is running for non-typst formats. I'll fix that right away and you should be able to use your Panflute filter again.

With that said, I think this is a bug that needs reporting on Panflute. The reason you're seeing it is that between quarto 1.4.15 and the current version, Pandoc added typst support, and Panflute appears to 1) not have caught up yet and 2) handle raw blocks in a way that I would describe as wrong.
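
To make the distinction concrete, here is a minimal sketch of the two behaviors. It is not Panflute's actual code: `KNOWN_FORMATS`, `strict_format` and `permissive_format` are hypothetical names, the format set is a small subset of the one in the traceback, and the error text only loosely mirrors the message shown there.

```python
# Hypothetical sketch, NOT Panflute's implementation.
# KNOWN_FORMATS is an illustrative subset of the set shown in the traceback.
KNOWN_FORMATS = {"html", "latex", "tex", "markdown", "docx"}


def strict_format(fmt):
    """Reject any raw format that is not in a fixed allow-list."""
    if fmt not in KNOWN_FORMATS:
        raise TypeError(
            f"element {type(fmt).__name__} not in group {sorted(KNOWN_FORMATS)}"
        )
    return fmt


def permissive_format(fmt):
    """Accept any string; unknown formats are left for the writer to ignore."""
    if not isinstance(fmt, str):
        raise TypeError(f"raw format must be a string, got {type(fmt).__name__}")
    return fmt


if __name__ == "__main__":
    print(permissive_format("typst"))
    try:
        strict_format("typst")
    except TypeError as err:
        print("strict check failed:", err)
```

With an allow-list, every new Pandoc output format breaks existing filters until the list is updated. Accepting any string avoids that, at the cost of not catching misspelled format names.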

cscheid commented 1 year ago

Here's a pure-pandoc repro that shows this isn't a quarto bug:

$ cat bad.md
A `raw typst inline`{=typst}.
$ pandoc bad.md --filter a_filter
Traceback (most recent call last):
  File "/Users/cscheid/Desktop/daily-log/2023/10/12/test-quarto-2/./a_filter", line 19, in <module>
    main()
  File "/Users/cscheid/Desktop/daily-log/2023/10/12/test-quarto-2/./a_filter", line 13, in main
    return run_filter(action,
  File "/Users/cscheid/virtualenvs/homebrew-python3/lib/python3.10/site-packages/panflute/io.py", line 227, in run_filter
    return run_filters([action], *args, **kwargs)
  File "/Users/cscheid/virtualenvs/homebrew-python3/lib/python3.10/site-packages/panflute/io.py", line 200, in run_filters
    doc = load(input_stream=input_stream)
  File "/Users/cscheid/virtualenvs/homebrew-python3/lib/python3.10/site-packages/panflute/io.py", line 58, in load
    doc = json.load(input_stream, object_hook=from_json)
  File "/opt/homebrew/Cellar/python@3.10/3.10.12_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/__init__.py", line 293, in load
    return loads(fp.read(),
  File "/opt/homebrew/Cellar/python@3.10/3.10.12_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/__init__.py", line 359, in loads
    return cls(**kw).decode(s)
  File "/opt/homebrew/Cellar/python@3.10/3.10.12_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/opt/homebrew/Cellar/python@3.10/3.10.12_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
  File "/Users/cscheid/virtualenvs/homebrew-python3/lib/python3.10/site-packages/panflute/elements.py", line 1431, in from_json
    return _res_func[tag](c)
  File "/Users/cscheid/virtualenvs/homebrew-python3/lib/python3.10/site-packages/panflute/elements.py", line 1378, in <lambda>
    'RawInline': lambda c: RawInline(
  File "/Users/cscheid/virtualenvs/homebrew-python3/lib/python3.10/site-packages/panflute/elements.py", line 787, in __init__
    self.format = check_group(format, RAW_FORMATS)
  File "/Users/cscheid/virtualenvs/homebrew-python3/lib/python3.10/site-packages/panflute/utils.py", line 78, in check_group
    raise TypeError(msg)
TypeError: element str not in group {'twiki', 'noteref', 'html', 'docx', 'markdown_strict', 'dokuwiki', 'markdown_github', 'creole', 'epub', 'mediawiki', 'openxml', 'vimwiki', 'opml', 'context', 'markdown_mmd', 'jats', 'muse', 'gfm', 'fb2', 'odt', 't2t', 'docbook', 'json', 'opendocument', 'tikiwiki', 'rst', 'latex', 'markdown_phpextra', 'icml', 'native', 'man', 'haddock', 'tex', 'org', 'rtf', 'commonmark', 'ipynb', 'markdown', 'textile'}
Error running filter a_filter:
Filter returned error status 1
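
To check which raw formats a document actually contains without going through Panflute, something like the following standard-library sketch may help. It assumes the usual Pandoc JSON layout, in which a raw element is serialized as `{"t": "RawInline", "c": [format, text]}`. The function name `raw_formats` and the script name used below are made up for illustration.

```python
#!/usr/bin/env python3
"""List the formats of all raw elements in a Pandoc JSON AST file."""
import json
import sys


def raw_formats(node):
    """Yield (element type, format) for each RawInline/RawBlock under node."""
    if isinstance(node, dict):
        if node.get("t") in ("RawInline", "RawBlock"):
            # Assumed layout: "c" holds [format, text].
            yield node["t"], node["c"][0]
        for value in node.values():
            yield from raw_formats(value)
    elif isinstance(node, list):
        for item in node:
            yield from raw_formats(item)


# Only runs when a path to a JSON AST file is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], encoding="utf-8") as handle:
        document = json.load(handle)
    for kind, fmt in raw_formats(document):
        print(f"{kind}: {fmt}")
```

Saved as, say, `list_raw_formats.py`, it can be run on the output of `pandoc bad.md -t json -o bad.json` with `python3 list_raw_formats.py bad.json`. For the `bad.md` above it should report a single RawInline with format `typst`.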

matthew-brett commented 1 year ago

Thanks - that's very helpful. For the bug report to Panflute - would you consider doing that? If I were to do it, I couldn't do much better than copy-pasting your text here - I don't have deep experience with JSON filters.

matthew-brett commented 1 year ago

@cscheid - just checking - would you consider submitting the bug report to Panflute? It will be much more likely to succeed if you do. Otherwise, I'll do my best, and refer back here - but I suppose either way you'll probably end up in the conversation.

cscheid commented 1 year ago

@matthew-brett I can't make it a priority unfortunately, so if this is time sensitive for you, then you're the one who'll have to do it.