Closed dmitryperets closed 6 months ago
That's a good point. I think a way to solve this is to make the default behaviour emit a warning. Meanwhile, we can provide an environment variable to force an error if it's set.
The latest commit should address this issue. You can install the latest version from git if you want to test it immediately.
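For illustration, the warn-by-default behaviour with an opt-in error could look roughly like the sketch below. This is not the code from the commit: the variable name `INCLUDE_NOT_FOUND_ERROR` and the helper `read_include` are hypothetical, chosen only to show the idea.

```python
import os
import sys

# Hypothetical variable name, used for illustration only.
STRICT_ENV_VAR = "INCLUDE_NOT_FOUND_ERROR"


def read_include(path, env=None):
    """Return the file's text, or None if it is missing and strict mode is off."""
    env = os.environ if env is None else env
    if os.path.isfile(path):
        with open(path, encoding="utf-8") as fh:
            return fh.read()
    # Strict mode: any value other than unset, empty or "0" turns the warning into an error.
    if env.get(STRICT_ENV_VAR, "0") not in ("", "0"):
        raise IOError(f"File not found: {path}")
    # Default mode: report the problem on stderr and let rendering continue.
    print(f"[WARNING] file not found: {path}", file=sys.stderr)
    return None
```

With this shape, a document that merely mentions `!include file-doesnt-exist` still renders by default, while a CI job can set the variable to make a missing file fail the build.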
Hi @DCsunset,
With the latest commit, I get this error:
Traceback (most recent call last):
File "/usr/local/bin/pandoc-include", line 8, in <module>
sys.exit(main())
File "/usr/local/lib/python3.10/dist-packages/pandoc_include/main.py", line 408, in main
return pf.run_filter(action, doc=doc)
File "/usr/local/lib/python3.10/dist-packages/panflute/io.py", line 227, in run_filter
return run_filters([action], *args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/panflute/io.py", line 208, in run_filters
doc = doc.walk(action, doc=doc, stop_if=stop_if)
File "/usr/local/lib/python3.10/dist-packages/panflute/base.py", line 264, in walk
child = child.walk(action, doc, stop_if)
File "/usr/local/lib/python3.10/dist-packages/panflute/base.py", line 264, in walk
child = child.walk(action, doc, stop_if)
File "/usr/local/lib/python3.10/dist-packages/panflute/containers.py", line 152, in walk
ans = [(k, v) for k, v in ans if v != []]
File "/usr/local/lib/python3.10/dist-packages/panflute/containers.py", line 152, in <listcomp>
ans = [(k, v) for k, v in ans if v != []]
File "/usr/local/lib/python3.10/dist-packages/panflute/containers.py", line 151, in <genexpr>
ans = ((k, v.walk(action, doc, stop_if)) for k, v in self.items())
File "/usr/local/lib/python3.10/dist-packages/panflute/base.py", line 264, in walk
child = child.walk(action, doc, stop_if)
File "/usr/local/lib/python3.10/dist-packages/panflute/containers.py", line 86, in walk
ans = list(chain.from_iterable(ans))
File "/usr/local/lib/python3.10/dist-packages/panflute/containers.py", line 84, in <genexpr>
ans = ((item,) if type(item) is not list else item for item in ans)
File "/usr/local/lib/python3.10/dist-packages/panflute/containers.py", line 82, in <genexpr>
ans = (item.walk(action, doc, stop_if) for item in self)
File "/usr/local/lib/python3.10/dist-packages/panflute/base.py", line 272, in walk
altered = action(self, doc)
File "/usr/local/lib/python3.10/dist-packages/pandoc_include/main.py", line 238, in action
options = parseOptions(doc)
File "/usr/local/lib/python3.10/dist-packages/pandoc_include/config.py", line 110, in parseOptions
if options["process-path"] is None:
KeyError: 'process-path'
Error running filter pandoc-include:
Filter returned error status 1
Note that I am running it all in docker, and this is how I installed your latest fix (might have done it wrong?):
RUN pip3 install --force-reinstall git+https://github.com/DCsunset/pandoc-include.git#egg=pandoc-include
Let's ignore the problem with `process-path`; maybe that's something with my environment... I've moved to my local machine (no docker), and I still get these failures:
(local_pandoc) dperets@dperets-mac wptest % cat test-include.md
!include filters.md
(local_pandoc) dperets@dperets-mac wptest % cat filters.md
### pandoc-include
!include <file-doesnt-exist>
$include <file-doesnt-exist>
```
!include <file>
$include <file>
```
Result:
(local_pandoc) dperets@dperets-mac wptest % pandoc test-include.md --filter pandoc-include -o test-include.html
[INFO] including file 'filters.md'... ok
Traceback (most recent call last):
File "/Users/dperets/git/wptest/local_pandoc/bin/pandoc-include", line 8, in <module>
sys.exit(main())
^^^^^^
File "/Users/dperets/git/wptest/local_pandoc/lib/python3.11/site-packages/pandoc_include/main.py", line 408, in main
return pf.run_filter(action, doc=doc)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dperets/git/wptest/local_pandoc/lib/python3.11/site-packages/panflute/io.py", line 227, in run_filter
return run_filters([action], *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dperets/git/wptest/local_pandoc/lib/python3.11/site-packages/panflute/io.py", line 208, in run_filters
doc = doc.walk(action, doc=doc, stop_if=stop_if)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dperets/git/wptest/local_pandoc/lib/python3.11/site-packages/panflute/base.py", line 264, in walk
child = child.walk(action, doc, stop_if)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dperets/git/wptest/local_pandoc/lib/python3.11/site-packages/panflute/containers.py", line 86, in walk
ans = list(chain.from_iterable(ans))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dperets/git/wptest/local_pandoc/lib/python3.11/site-packages/panflute/containers.py", line 84, in <genexpr>
ans = ((item,) if type(item) is not list else item for item in ans)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dperets/git/wptest/local_pandoc/lib/python3.11/site-packages/panflute/containers.py", line 82, in <genexpr>
ans = (item.walk(action, doc, stop_if) for item in self)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dperets/git/wptest/local_pandoc/lib/python3.11/site-packages/panflute/base.py", line 272, in walk
altered = action(self, doc)
^^^^^^^^^^^^^^^^^
File "/Users/dperets/git/wptest/local_pandoc/lib/python3.11/site-packages/pandoc_include/main.py", line 248, in action
includeType, name, config = is_include_line(elem)
^^^^^^^^^^^^^^^^^^^^^
File "/Users/dperets/git/wptest/local_pandoc/lib/python3.11/site-packages/pandoc_include/main.py", line 87, in is_include_line
includeType, name, config = extract_info(rawString)
^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dperets/git/wptest/local_pandoc/lib/python3.11/site-packages/pandoc_include/main.py", line 60, in extract_info
raise ValueError(f"Unable to extract info from include line {rawString}")
ValueError: Unable to extract info from include line !include `<file-doesnt-exist>`{=html}
Error running filter pandoc-include:
Filter returned error status 1
Traceback (most recent call last):
File "/Users/dperets/git/wptest/local_pandoc/bin/pandoc-include", line 8, in <module>
sys.exit(main())
^^^^^^
File "/Users/dperets/git/wptest/local_pandoc/lib/python3.11/site-packages/pandoc_include/main.py", line 408, in main
return pf.run_filter(action, doc=doc)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dperets/git/wptest/local_pandoc/lib/python3.11/site-packages/panflute/io.py", line 227, in run_filter
return run_filters([action], *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dperets/git/wptest/local_pandoc/lib/python3.11/site-packages/panflute/io.py", line 208, in run_filters
doc = doc.walk(action, doc=doc, stop_if=stop_if)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dperets/git/wptest/local_pandoc/lib/python3.11/site-packages/panflute/base.py", line 264, in walk
child = child.walk(action, doc, stop_if)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dperets/git/wptest/local_pandoc/lib/python3.11/site-packages/panflute/containers.py", line 86, in walk
ans = list(chain.from_iterable(ans))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dperets/git/wptest/local_pandoc/lib/python3.11/site-packages/panflute/containers.py", line 84, in <genexpr>
ans = ((item,) if type(item) is not list else item for item in ans)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dperets/git/wptest/local_pandoc/lib/python3.11/site-packages/panflute/containers.py", line 82, in <genexpr>
ans = (item.walk(action, doc, stop_if) for item in self)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dperets/git/wptest/local_pandoc/lib/python3.11/site-packages/panflute/base.py", line 272, in walk
altered = action(self, doc)
^^^^^^^^^^^^^^^^^
File "/Users/dperets/git/wptest/local_pandoc/lib/python3.11/site-packages/pandoc_include/main.py", line 329, in action
new_doc = pf.convert_text(
^^^^^^^^^^^^^^^^
File "/Users/dperets/git/wptest/local_pandoc/lib/python3.11/site-packages/panflute/tools.py", line 487, in convert_text
out = inner_convert_text(text, in_fmt, out_fmt, extra_args, pandoc_path=pandoc_path)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dperets/git/wptest/local_pandoc/lib/python3.11/site-packages/panflute/tools.py", line 510, in inner_convert_text
out = run_pandoc(text, args, pandoc_path=pandoc_path)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dperets/git/wptest/local_pandoc/lib/python3.11/site-packages/panflute/tools.py", line 408, in run_pandoc
raise IOError('')
OSError
Error running filter pandoc-include:
Filter returned error status 1
Since you can run it locally, could you try running the test in this repo? You just need to clone it and run `make` in the test directory. I'm not sure why it still shows an exception, but maybe it is an issue with the version you installed.
Also, I've fixed the previous KeyError in case the application state is corrupted for some reason.
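The thread does not show how the KeyError was fixed. One common way to make such a lookup robust is to fall back to a default when the key is absent, as in this minimal sketch. `DEFAULT_OPTIONS` and `get_option` are hypothetical names and are not taken from `pandoc_include/config.py`.

```python
# Hypothetical defaults; the real option names and values live in the project itself.
DEFAULT_OPTIONS = {"process-path": None}


def get_option(options, key):
    """Look up an option, falling back to a default instead of raising KeyError."""
    return options.get(key, DEFAULT_OPTIONS.get(key))
```

Compared with `options["process-path"]`, this returns `None` for a missing key, so a check such as `if value is None:` keeps working even when the stored state is incomplete.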
@DCsunset I managed to run it, but I found two remaining issues with non-existing files:
First - it doesn't like the "< >". So this crashes:
!include <file-doesnt-exist>
File "/Users/dperets/git/wptest/local_pandoc/lib/python3.11/site-packages/pandoc_include/main.py", line 60, in extract_info
raise ValueError(f"Unable to extract info from include line {rawString}")
ValueError: Unable to extract info from include line !include `<file-doesnt-exist>`{=html}
... but this works fine:
!include file-doesnt-exist
Second - an include inside a fenced code block still has the original issue, that is, it raises an IOError exception instead of emitting a warning:
```
!include file-doesnt-exist
```
File "/Users/dperets/git/wptest/local_pandoc/lib/python3.11/site-packages/pandoc_include/main.py", line 379, in action
raise IOError(f"File not found: {name}")
OSError: File not found: file-doesnt-exist
Note: pandoc-include 1.2.0 handles both these issues successfully.
The first case is caused by a failed regex match. Do you have any idea why it fails? @studerluk
I have fixed the second case in the latest commit.
Looking at the error message, I assume this happens because pandoc is treating the filename placeholder as HTML (note the added `{=html}`):
Raw include line from error message: !include `<file-doesnt-exist>`{=html}
The regex pattern expects the include line to end after the filename, which is no longer the case once `{=html}` is appended.
Off the top of my head, I see two possible solutions that might resolve the issue:
... `{=html}`, to the regex pattern for them to be ignored.
In the meantime, @dmitryperets, you can try putting the placeholder in code ticks to force pandoc not to treat it as an HTML tag: !include `<file-doesnt-exist>`
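To make the failure mode concrete, here is a small sketch with a deliberately simplified pattern. It is not the filter's actual regex: `STRICT`, `TOLERANT` and `include_name` are hypothetical names, and the pattern only models "prefix, optionally backtick-quoted name, end of line".

```python
import re

# Simplified stand-in: the line must end right after the (optionally quoted) name.
STRICT = re.compile(r"^[!$]include\s+(`?)([^`{}]+)\1$")
# Same pattern, but tolerating one trailing {...} attribute such as {=html}.
TOLERANT = re.compile(r"^[!$]include\s+(`?)([^`{}]+)\1(?:\{[^}]*\})?$")


def include_name(line, pattern):
    """Return the included file name, or None if the line does not match."""
    m = pattern.match(line)
    return m.group(2) if m else None


raw = "!include `<file-doesnt-exist>`{=html}"
print(include_name(raw, STRICT))    # no match, because of the trailing attribute
print(include_name(raw, TOLERANT))  # the name between the backticks
```

The strict variant rejects the line that pandoc produced, which corresponds to the `ValueError` above, while the tolerant variant skips the attribute and still rejects other malformed input.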
@studerluk Thanks for looking into it. I think we can probably adopt your second approach and ignore everything in the curly braces after the quote (we would still be able to catch other input errors).
I don't yet fully understand the regex, so I can't make the change myself. Do you know what `\9` means in the regex? You are also welcome to submit a PR to fix it.
@studerluk Actually, I found a better way to solve this. The extra tags are added because it uses extended Markdown syntax. For file includes we don't want them, so it's better to use `markdown_strict`, which prevents such tags from being added.
@dmitryperets The latest commit should fix all the above issues. Feel free to try it again.
> I don't yet fully understand the regex, so I can't make the change myself. Do you know what `\9` means in the regex?

`\9` references the 9th capture group of the regex. In this case `` ([\`\'\"]) ``
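A backreference like this makes the closing quote match whatever character opened it. The sketch below shows the same idea with a single group, so the reference is `\1` rather than `\9`. `QUOTED` and `unquote` are hypothetical names, not part of the filter.

```python
import re

# Group 1 captures the opening quote; \1 requires the same character to close it.
# In the filter's pattern the quote group happens to be the 9th one, hence \9.
QUOTED = re.compile(r"([`'\"])(.+?)\1")


def unquote(text):
    """Return the text between matching quotes, or None if the quotes differ."""
    m = QUOTED.fullmatch(text)
    return m.group(2) if m else None
```

For example, `unquote("'a.md'")` yields `a.md`, whereas a string that opens with `"` and closes with `'` is rejected.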
@DCsunset I confirm that the latest version successfully passes all my tests. Thanks!
The fix is included in v1.3.0. Closing it now.
When attempting to include a file that doesn't exist, version 1.2.0 would simply print a warning: https://github.com/DCsunset/pandoc-include/blob/a51b798b6567c53860eb681642c750c885731815/pandoc_include/main.py#L194
But version 1.2.1 throws an IOError, essentially aborting pandoc: https://github.com/DCsunset/pandoc-include/blob/140101310616d517f18e9ee97b1e489cd6daea08/pandoc_include/main.py#L261
That's too harsh, in my opinion. For example, I was trying to render a file which is an "instruction" on how to use pandoc-include itself. So there would be a line there, like
And this line is now failing the whole rendering. I believe a warning was more appropriate.