htdebeer / pandocomatic

Automate the use of pandoc
https://heerdebeer.org/Software/markdown/pandocomatic/
GNU General Public License v3.0

Running the same paru filter in Windows fails compared to Linux/macOS #83

Closed iandol closed 4 years ago

iandol commented 4 years ago

Hi Huub, hope you've kept well over the last few challenging months!

I normally use macOS or Linux, but I was recently asked about using my pandocomatic workflow on Windows. I've been converting my filters over (making sure they have the .rb suffix so Windows and Pandoc know they are scripts), but I've hit a curious problem with pandocomatic. I have WSL (so I can run Linux under Windows) and have installed Pandoc, Ruby 2.7.1 and pandocomatic/paru there. I have also installed the Windows versions of Ruby 2.7.1 and Pandoc separately via scoop, a Windows package manager. Pandoc works directly in either environment, as does Ruby. My pandoc data directory is symlinked to the correct place for both Windows (C:\Users\cog\AppData\Roaming\pandoc) and Linux (/home/cog/.local/share/pandoc), so the pandocomatic.yaml and filter files themselves are identical.

---
title: Simple Test3
author: Joanna Doe
pandocomatic_:
  use-template: test2
  pandoc:
    verbose: true
---

# Title
Here is some text.

filters/noop.rb

#!/usr/bin/env ruby

require 'paru/filter'

Paru::Filter.run do
  stop!
end

pandocomatic.yaml

  test2:
    pandoc:
      from: markdown
      to: html5
      standalone: true
      self-contained: true
      mathjax: true
      filter:
        - filters/noop.rb
    metadata:
      lang: 'EN-GB'

If I run a simple test.md file via WSL everything is fine:

PS C:\Users\cog> wsl pandocomatic ./test.md
[INFO] Running filter /home/cog/.local/share/pandoc/filters/noop.rb
[INFO] Completed filter /home/cog/.local/share/pandoc/filters/noop.rb in 3 ms
Pandocomatic needed 0.3 seconds to convert '/mnt/c/Users/cog/test.md' to 'test.html'.

But if I run exactly the same file from Windows PowerShell:

PS C:\Users\cog> pandocomatic.bat ./test.md
C:/Users/cog/scoop/apps/ruby/2.7.1-1/lib/ruby/2.7.0/psych.rb:456:in `parse': (<unknown>): did not find expected ',' or '}' while parsing a flow mapping at line 1 column 1 (Psych::SyntaxError)
        from C:/Users/cog/scoop/apps/ruby/2.7.1-1/lib/ruby/2.7.0/psych.rb:456:in `parse_stream'
        from C:/Users/cog/scoop/apps/ruby/2.7.1-1/lib/ruby/2.7.0/psych.rb:390:in `parse'
        from C:/Users/cog/scoop/apps/ruby/2.7.1-1/lib/ruby/2.7.0/psych.rb:277:in `load'
        from C:/Users/cog/scoop/persist/ruby/gems/gems/paru-0.4.0.2/lib/paru/filter/metadata.rb:52:in `initialize'
        from C:/Users/cog/scoop/persist/ruby/gems/gems/paru-0.4.0.2/lib/paru/filter.rb:272:in `new'
        from C:/Users/cog/scoop/persist/ruby/gems/gems/paru-0.4.0.2/lib/paru/filter.rb:272:in `filter'
        from C:/Users/cog/scoop/persist/ruby/gems/gems/paru-0.4.0.2/lib/paru/filter.rb:244:in `run'
        from C:/Users/cog/AppData/Roaming/pandoc/filters/noop.rb:5:in `<main>'
Error running filter C:/Users/cog/AppData/Roaming/pandoc/filters/noop.rb:
Filter returned error status 1
Error running pandoc => error while running:

"pandoc"        --verbose --from="markdown" --to="html5" --standalone --self-contained --mathjax --filter="C:/Users/cog/AppData/Roaming/pandoc/filters/noop.rb"

Pandoc responded with:

C:/Users/cog/scoop/apps/ruby/2.7.1-1/lib/ruby/2.7.0/psych.rb:456:in `parse': (<unknown>): did not find expected ',' or '}' while parsing a flow mapping at line 1 column 1 (Psych::SyntaxError)
        from C:/Users/cog/scoop/apps/ruby/2.7.1-1/lib/ruby/2.7.0/psych.rb:456:in `parse_stream'
        from C:/Users/cog/scoop/apps/ruby/2.7.1-1/lib/ruby/2.7.0/psych.rb:390:in `parse'
        from C:/Users/cog/scoop/apps/ruby/2.7.1-1/lib/ruby/2.7.0/psych.rb:277:in `load'
        from C:/Users/cog/scoop/persist/ruby/gems/gems/paru-0.4.0.2/lib/paru/filter/metadata.rb:52:in `initialize'
        from C:/Users/cog/scoop/persist/ruby/gems/gems/paru-0.4.0.2/lib/paru/filter.rb:272:in `new'
        from C:/Users/cog/scoop/persist/ruby/gems/gems/paru-0.4.0.2/lib/paru/filter.rb:272:in `filter'
        from C:/Users/cog/scoop/persist/ruby/gems/gems/paru-0.4.0.2/lib/paru/filter.rb:244:in `run'
        from C:/Users/cog/AppData/Roaming/pandoc/filters/noop.rb:5:in `<main>'
Error running filter C:/Users/cog/AppData/Roaming/pandoc/filters/noop.rb:
Filter returned error status 1

This error normally appears when the metadata is incorrect, but in this case it is read fine under Linux. I've tried changing between LF and CRLF line endings, and that had no effect.
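For reference, the wording of the message matches what Psych reports when the text it is given opens a flow mapping (`{ ... }`) that is never properly closed. The minimal Ruby sketch below reproduces that class of error using a hypothetical, deliberately broken input. The constant `BROKEN_FLOW_MAPPING` and the helper `yaml_syntax_error` are illustrative names only, and nothing here establishes what text the filter actually received on the failing machine.

```ruby
require 'yaml'

# Hypothetical broken input: a flow mapping that is opened but never closed.
BROKEN_FLOW_MAPPING = '{title: Simple Test3'

# Returns the Psych::SyntaxError raised while parsing +text+,
# or nil if the text parses without a syntax error.
def yaml_syntax_error(text)
  YAML.safe_load(text)
  nil
rescue Psych::SyntaxError => e
  e
end

error = yaml_syntax_error(BROKEN_FLOW_MAPPING)
puts error.message if error
```

Comparing the printed message with the one in the stack trace can help decide whether the parser is choking on the shape of its input rather than on the metadata as written in test.md.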

Do you have any idea what could cause this error?

htdebeer commented 4 years ago

Yes. Apparently, YAML does not allow TAB characters for indentation. If you replace the TAB characters with two spaces in your test input file, the issue should be resolved.
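A quick way to check for this is sketched below in Ruby, assuming the metadata text is available as a string. The helper `tab_indented_lines` and the two sample strings are hypothetical and not part of pandocomatic or paru. The sketch lists the lines whose leading whitespace contains a TAB and shows that Psych rejects such input.

```ruby
require 'yaml'

# Returns the 1-based numbers of the lines whose leading whitespace contains a TAB.
def tab_indented_lines(text)
  text.each_line.with_index(1)
      .select { |line, _number| line.match?(/\A *\t/) }
      .map { |_line, number| number }
end

# Hypothetical samples: the same mapping indented with a TAB and with two spaces.
WITH_TABS   = "pandocomatic_:\n\tuse-template: test2\n"
WITH_SPACES = "pandocomatic_:\n  use-template: test2\n"

p tab_indented_lines(WITH_TABS)
p tab_indented_lines(WITH_SPACES)

begin
  YAML.safe_load(WITH_TABS)
rescue Psych::SyntaxError => e
  puts "rejected: #{e.message}"
end
```

An empty list for the real test.md would rule tabs out as the cause.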

Note that this is a pandocomatic issue, not a paru issue.

iandol commented 4 years ago

Hi, I always try to ensure my YAML has spaces not tabs, and it does in this case (whitespace visualised in VS Code, copied from the post above, also this is the case for the original test.md file):

[Image: Screenshot 2020-06-24 at 17 35 39]

Yes, sorry to file this in the wrong project, the stacktrace shows paru and I didn't focus on the call to metadata...

htdebeer commented 4 years ago

When I tried to reproduce your issue on MS Windows, I used tabs and got the issue. Changing to spaces resolved that and I did not look any further.

When I tried your actual example, I had to change the bottom --- to ... to get the YAML block recognized at all. That changed, I got the following error:

Pandoc responded with:

[INFO] Running filter C:/Users/.../noop.rb
Error running filter C:/Users/.../noop.rb
Could not find executable C:/Users/.../noop.rb

However, when I run the underlying pandoc command directly, that is, not via pandocomatic, it finds and runs "noop.rb" just fine. I tried to resolve this by changing to an absolute path and by using backslashes as the path separator, but to no avail. I tried in both cmd and PowerShell, to the same effect. I also tried without a template, adding the filter in the test input file instead, but no luck there either.

As I am not a Windows user, I am not sure how to proceed to get pandocomatic to run this filter on Windows to reproduce your issue.

htdebeer commented 4 years ago

Sorry, another bit of carelessness on my end: I named the filter noob.rb while trying to run noop.rb :-)

Unfortunately, now that it can find the filter, I do not get any error anymore running your scenario!

It works with filters in external and internal templates, in both cmd and powershell.

My environment:

tool          version
ruby          2.6.6p146
pandoc        2.9.2.1
pandocomatic  0.2.7.1
paru          0.4.0.2

iandol commented 4 years ago

Huub, thanks so much for trying to reproduce. I'm away from my Windows machine now, so I will only be able to test the YAML separator --- when I get back to it. My Pandoc, pandocomatic and paru versions are identical to yours, and I'd be surprised if this was a Ruby issue, so the likely candidate is something about my path or environment.

How did you install Pandoc and Ruby, using standard installers or a package manager like scoop?

htdebeer commented 4 years ago

I used the default Windows pandoc installer and the RubyInstaller for Ruby, then installed pandocomatic via gem. The only other differences are that I used pandocomatic's --config option to point to a config file containing your template, and that the noop.rb script was in the same directory as the input markdown file. So maybe something is going awry with the default pandoc / pandocomatic configuration directory.

iandol commented 4 years ago

Was the RubyInstaller package you used with or without MSYS2 (the system to build native gems in Windows)? It appears https://scoop.sh does fetch the same RubyInstaller but uses the small 7zip package without MSYS2, so perhaps this causes the incompatibility.

Anyway, it seems this has nothing to do with pandocomatic/paru, but with the pandoc data dir and/or the Ruby package, so I'll go ahead and close. I will update here when/if I solve this on my machine. Thanks for your helpful input!

htdebeer commented 4 years ago

I used the recommended RubyInstaller package that included the build tools. It would be frustrating if a standard library module, which Psych is, did not work without these build tools.
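One way to compare two Ruby installations on this point is to print what each reports about its YAML stack. The Ruby sketch below uses only standard-library constants. The helper name `yaml_environment` and the choice of fields are assumptions about what might differ between the machines, not a diagnosis.

```ruby
require 'yaml'
require 'rbconfig'

# Collects facts that may differ between two Ruby installations.
def yaml_environment
  {
    'ruby'              => RUBY_VERSION,
    'platform'          => RUBY_PLATFORM,
    'host_os'           => RbConfig::CONFIG['host_os'],
    'psych'             => Psych::VERSION,
    'libyaml'           => Psych::LIBYAML_VERSION,
    'external_encoding' => Encoding.default_external.name
  }
end

yaml_environment.each { |key, value| puts format('%-18s %s', key, value) }
```

Running this under both the working and the failing Ruby and diffing the output would show whether Psych, libyaml, or the default encoding differs between them.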

iandol commented 4 years ago

Well, I cannot seem to solve this. I found another Windows machine to test, which had Ruby and Pandoc installed via standard installers, and I couldn't reproduce. I tried removing them (i.e. no msys2) and using scoop to install, and couldn't reproduce. No difference if I use symlinks to pandoc data dir or same-folder files.

Back on my broken machine, here are the source files (LF line endings, but trying CRLF made no difference):

test.zip

Still failing, I then installed msys2, reinstalled Ruby, updated all gems, and reinstalled paru and pandocomatic. Still failing. I installed the standard installers: still failing. If I comment out the filter: .\noop.rb line then pandocomatic works fine; it is only when pandocomatic calls a filter that this fails. I have validated the YAML (both the config file and the markdown metadata) and there are no errors. Something is causing psych to fail selectively. I searched for that Ruby error, but most reported cases were solved by correctly formatting the YAML, which is not the issue here...
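The metadata validation mentioned above can be repeated with the same parser that raises in the stack trace. The Ruby sketch below is one possible approach: it extracts a leading YAML block (closed by either `---` or `...`) and loads it with Psych. The `FRONT_MATTER` pattern and the helper `check_front_matter` are hypothetical and much simpler than pandoc's own metadata block rules. `SAMPLE` reproduces the test document from the first post.

```ruby
require 'yaml'

# Matches a leading YAML block opened by '---' and closed by '---' or '...'.
# A simplified assumption, not pandoc's full metadata block grammar.
FRONT_MATTER = /\A---[ \t]*\r?\n(.*?)\r?\n(?:---|\.\.\.)[ \t]*(?:\r?\n|\z)/m

# Returns [:ok, data], [:missing, nil], or [:invalid, message] for +text+.
def check_front_matter(text)
  match = FRONT_MATTER.match(text)
  return [:missing, nil] if match.nil?

  [:ok, YAML.safe_load(match[1])]
rescue Psych::SyntaxError => e
  [:invalid, e.message]
end

SAMPLE = <<~MARKDOWN
  ---
  title: Simple Test3
  author: Joanna Doe
  pandocomatic_:
    use-template: test2
    pandoc:
      verbose: true
  ---

  # Title
  Here is some text.
MARKDOWN

status, data = check_front_matter(SAMPLE)
puts status
puts data['pandocomatic_']['use-template'] if status == :ok
```

If this reports `:ok` under the same Ruby that fails inside the filter, the front matter as written is unlikely to be what Psych rejects there.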

I don't have time to investigate this further, it will remain a mystery... Thanks Huub!