msprev / panzer

pandoc + styles
BSD 3-Clause "New" or "Revised" License

Panzer doesn't run on Windows #20

Closed mickley closed 8 years ago

mickley commented 9 years ago

Using latest version on Win7x64 + Python 3.5.

When running panzer, I get

failed to create process

I've added the python directories to the path, but this doesn't seem to help. Any ideas?

msprev commented 9 years ago

Hmm, strange. I don't have Windows and haven't tested it myself, but other users report that it works ok.

Do you want to try this alternative install method below? I recently changed it to the simpler pip3 method, but maybe this is what is causing problems for you.

        git clone https://github.com/msprev/panzer
        cd panzer
        python3 setup.py install

Let me know if this doesn't work.

mickley commented 9 years ago

That seems to have fixed it, thanks.

Using the pip3 install, "panzer" does not work, but "panzer-script.py" does

mickley commented 9 years ago

I'm still having problems on Windows, though panzer is installed now.

It's having problems finding the global styles.yaml file. I created a .panzer directory under C:\Users\James, and tried putting the styles.yaml in the root of that directory and in the styles directory. Both cases give me the warning "no global style definitions found"

I took a look at your code, and it appears that it's the right folder. If I remove it and run panzer, panzer asks to create the folder for me, creates it, and puts a styles.yaml file in the root, but does not find it. So I think there's some problem with load_styledef()

Using a local styles.yaml file doesn't seem to work either.

Finally, in addition to (or perhaps because of) the previous problem, I'm getting this error:

ERROR !pandoc: Could not parse YAML header: did not find expected hexdecimal number "source" (line 10, column 32).

The error persists regardless of what line 10 is in the markdown source.

msprev commented 9 years ago

I've updated the install instructions to add the non-pip method back. Thanks for reporting that this works.

I've also pushed a fix to the way in which panzer creates a blank .panzer directory. It was creating an old structure that is deprecated in the current version of panzer. You may want to pull the latest version and rerun the .panzer creation process.

Could you please add to this issue a minimal working example of (a) the source markdown document being passed and (b) the contents of the global styles.yaml file? I'll then be able to work out what is going on with the second error you report!

mickley commented 9 years ago

Ok, some updates:

First, I figured out why styles.yaml was not being found: there was a parse error. In short, there's a problem with metadata fields containing Windows paths.

"C:\Users\James\Desktop\Sample" <--- This prevents styles.yaml from loading
"C:\\Users\\James\\Desktop\\Sample" <--- so does this
"`C:\Users\James\Desktop\Sample`" <--- And this
'C:\Users\James\Desktop\Sample' <--- And this
"C:/Users/James/Desktop/Sample" <--- But this works, even though it's not the normal way for Windows

Secondly, I'm still getting the same error about parsing the YAML header as detailed previously.

A minimal setup:

Markdown

---
style: Base
...

# Heading 1

## Heading 2

styles.yaml

Base:
    all:
        commandline:
            smart: true
            standalone: true
        filter:
            - run: pandoc-crossref
            - run: pandoc-citeproc

The debug output

2015-10-02 21:54:10,688 - DEBUG -          loading global style definitions file
2015-10-02 21:54:10,688 - DEBUG -          run "pandoc - --write json --output -"
2015-10-02 21:54:10,719 - DEBUG -          loading local style definitions file
2015-10-02 21:54:10,719 - DEBUG -          run "pandoc - --write json --output -"
2015-10-02 21:54:10,757 - ERROR - ERROR:     !pandoc: Could not parse YAML header: did not find expected hexdecimal number "source" (line 10, column 32)

2015-10-02 21:54:10,757 - INFO -          ----- pandoc read -----
2015-10-02 21:54:10,757 - DEBUG -          loading source document(s)
2015-10-02 21:54:10,757 - DEBUG -          run "pandoc example.md --read markdown --write json --output -"
2015-10-02 21:54:10,757 - INFO -          running
2015-10-02 21:54:10,794 - INFO -          ----- style definitions -----
2015-10-02 21:54:10,794 - INFO -          global:
2015-10-02 21:54:10,794 - INFO -            Base        
2015-10-02 21:54:10,794 - DEBUG -          field "styledef" not found
2015-10-02 21:54:10,794 - INFO -          ----- document style -----
2015-10-02 21:54:10,794 - INFO -          style:
2015-10-02 21:54:10,794 - INFO -            Base
2015-10-02 21:54:10,794 - INFO -          full hierarchy:
2015-10-02 21:54:10,794 - INFO -            Base
2015-10-02 21:54:10,794 - INFO -          writer:
2015-10-02 21:54:10,794 - INFO -            docx
2015-10-02 21:54:10,794 - DEBUG -          field "template" not found
2015-10-02 21:54:10,794 - INFO -          ----- pandoc read with metadata options -----
2015-10-02 21:54:10,794 - INFO -          pandoc reading with options:
2015-10-02 21:54:10,794 - INFO -            --smart
2015-10-02 21:54:10,826 - ERROR - ERROR:     !pandoc: Could not parse YAML header: did not find expected hexdecimal number "source" (line 10, column 32)

2015-10-02 21:54:10,841 - INFO -          ----- run list -----
2015-10-02 21:54:10,841 - INFO -          filter:
2015-10-02 21:54:10,841 - INFO -           1  pandoc-crossref "pandoc-crossref"
2015-10-02 21:54:10,841 - INFO -           2  pandoc-citeproc "pandoc-citeproc"
2015-10-02 21:54:10,841 - INFO -          ----- filter -----
2015-10-02 21:54:10,841 - INFO -          [1/2] pandoc-crossref docx
2015-10-02 21:54:10,875 - INFO -          [2/2] pandoc-citeproc docx
2015-10-02 21:54:10,957 - INFO -          ----- pandoc write -----
2015-10-02 21:54:10,957 - INFO -          pandoc writing with options:
2015-10-02 21:54:10,957 - INFO -            --standalone

mickley commented 9 years ago

Maybe I spoke too soon. Adding the following to metadata in Base results in styles.yaml being recognized, but the citation style isn't used and throws an error

csl: "C:/Users/James/.panzer/references/harvard.csl"

2015-10-02 22:06:30,825 - ERROR - ERROR:     !pandoc-citeproc: InvalidUrlException "C:/Users/James/.panzer/references/harvard.csl" "Invalid scheme"
2015-10-02 22:06:30,825 - ERROR - ERROR:   failed to receive json object from filter---skipping filter

msprev commented 9 years ago

Thanks, this is arising because pandoc parses all metadata field values as markdown. The slashes in the Windows path are being wrongly parsed as italics. There is also an issue with the colon, as the yaml parser wants to read this as a field delimiter. Try this instead:

csl: 'C:\/Users\/James\/.panzer\/references\/harvard.csl'

Here, the slashes are all escaped. The value is quoted with single quotes so the colon is correctly handled by the yaml parser.

Note that this error is being raised by pandoc rather than panzer. (Try running just vanilla pandoc on a document with that csl field). This is something that I've had to work around using backtick quotes for passing arguments to scripts and filters. The ideal solution would be to allow a way for the markdown parser to be selectively disabled for certain metadata fields. You can see the relevant pandoc issue proposing this here: https://github.com/jgm/pandoc/issues/2139

Can you let me know if this resolves the issue?

mickley commented 9 years ago

I tried this, and while it ensures that styles.yaml is parsed, pandoc won't handle it and throws the same InvalidUrlException.

It seems that both the InvalidUrlException and the "Could not parse YAML header" errors are due to pandoc having problems with the format of a path in a YAML metadata field.

That said, this is a problem for panzer as well: if an improper path format is specified, the whole styles.yaml file won't be parsed. Given that any small syntax error will kill the ability to parse styles.yaml, it might be useful to give a more informative error to the user, ideally with the line that triggered the problem. It took me quite a while to figure out that this was even the problem and that the .panzer directory was in the right spot.
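
A rough sketch of the kind of line-level diagnostic suggested above, written in Python. It is not part of panzer; the names `bad_escapes` and `check_yaml_text` are hypothetical, and the escape list is taken from the YAML 1.2 specification for double-quoted scalars. In that syntax `\U` starts an 8-digit hexadecimal escape, which may explain the "expected hexdecimal number" message when a value such as "C:\Users\..." is double quoted. The check is deliberately simplified: it looks only at double-quoted strings that sit on a single line, ignores comments, and says nothing about how pandoc later treats the value as markdown.

```python
import re

# Characters allowed directly after a backslash in a YAML 1.2
# double-quoted scalar (assumption: list taken from the spec).
SIMPLE_ESCAPES = set('0abt\tnvfre "/\\N_LP')
# Escapes that must be followed by a fixed number of hex digits.
HEX_ESCAPES = {"x": 2, "u": 4, "U": 8}
HEX_DIGITS = set("0123456789abcdefABCDEF")

# A double-quoted string on one line, allowing backslash pairs inside.
DOUBLE_QUOTED = re.compile(r'"((?:[^"\\]|\\.)*)"')


def bad_escapes(body):
    """Return the invalid backslash escapes in a double-quoted scalar body."""
    found = []
    i = 0
    while i < len(body):
        if body[i] != "\\":
            i += 1
            continue
        nxt = body[i + 1:i + 2]
        if nxt in HEX_ESCAPES:
            width = HEX_ESCAPES[nxt]
            digits = body[i + 2:i + 2 + width]
            if len(digits) == width and set(digits) <= HEX_DIGITS:
                i += 2 + width
                continue
            found.append("\\" + nxt)
        elif nxt not in SIMPLE_ESCAPES:
            found.append("\\" + nxt)
        i += 2
    return found


def check_yaml_text(text):
    """Return (line number, escape) pairs for suspicious escapes."""
    report = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in DOUBLE_QUOTED.finditer(line):
            for escape in bad_escapes(match.group(1)):
                report.append((number, escape))
    return report


sample = r'''Base:
    all:
        metadata:
            csl: "C:\Users\James\harvard.csl"
'''

for number, escape in check_yaml_text(sample):
    print("line %d: suspicious escape %s" % (number, escape))
```

Note that some backslash pairs in a Windows path are valid YAML escapes (for example `\r` in `\references`), so they pass this check while still silently changing the value.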

While pandoc itself seems to have problems with csl and bibliography paths in metadata, I can run vanilla pandoc using --bibliography and --csl flags and it works fine. But as soon as I use panzer, it won't work. If csl is in a metadata field, then pandoc throws an error even if panzer can parse styles.yaml.

The example below (with or without backticks) keeps panzer happy, but pandoc still complains about the YAML header. It appears to run with the csl commandline flag, but uses the default Chicago style instead.

Base:
    all:
        commandline:
            csl: '`C:\/Users\/James\/.panzer\/references\/harvard.csl`'
        filter:
            - run: pandoc-citeproc

2015-10-04 12:44:27,860 - DEBUG -          loading global style definitions file
2015-10-04 12:44:27,860 - DEBUG -          run "pandoc - --write json --output -"
2015-10-04 12:44:27,878 - DEBUG -          loading local style definitions file
2015-10-04 12:44:27,878 - DEBUG -          run "pandoc - --write json --output -"
2015-10-04 12:44:27,905 - ERROR - ERROR:     !pandoc: Could not parse YAML header: did not find expected hexdecimal number "source" (line 10, column 32)
2015-10-04 12:44:27,905 - INFO -          ----- pandoc read -----
2015-10-04 12:44:27,905 - DEBUG -          loading source document(s)
2015-10-04 12:44:27,905 - DEBUG -          run "pandoc sample.md --read markdown --write json --output -"
2015-10-04 12:44:27,905 - INFO -          running
2015-10-04 12:44:27,961 - INFO -          ----- style definitions -----
2015-10-04 12:44:27,961 - INFO -          global:
2015-10-04 12:44:27,961 - INFO -            Base        
2015-10-04 12:44:27,961 - DEBUG -          field "styledef" not found
2015-10-04 12:44:27,961 - INFO -          ----- document style -----
2015-10-04 12:44:27,961 - INFO -          style:
2015-10-04 12:44:27,961 - INFO -            Base
2015-10-04 12:44:27,961 - INFO -          full hierarchy:
2015-10-04 12:44:27,961 - INFO -            Base
2015-10-04 12:44:27,961 - INFO -          writer:
2015-10-04 12:44:27,961 - INFO -            docx
2015-10-04 12:44:27,961 - DEBUG -          field "template" not found
2015-10-04 12:44:27,977 - INFO -          ----- run list -----
2015-10-04 12:44:27,977 - INFO -          filter:
2015-10-04 12:44:27,977 - INFO -           1  pandoc-citeproc "pandoc-citeproc"
2015-10-04 12:44:27,977 - INFO -          ----- filter -----
2015-10-04 12:44:27,977 - INFO -          [1/1] pandoc-citeproc docx
2015-10-04 12:44:27,977 - DEBUG -          run "pandoc-citeproc docx"
2015-10-04 12:44:28,135 - INFO -          ----- pandoc write -----
2015-10-04 12:44:28,135 - INFO -          pandoc writing with options:
2015-10-04 12:44:28,135 - INFO -            --csl=C:\/Users\/James\/.panzer\/references\/harvard.csl
2015-10-04 12:44:28,135 - DEBUG -          run "pandoc - --read json --write docx --output Sample.docx --csl=C:\/Users\/James\/.panzer\/references\/harvard.csl"
2015-10-04 12:44:28,300 - DEBUG -          output to binary file by pandoc
2015-10-04 12:44:28,316 - DEBUG -          >>>>>>>>>> panzer quits <<<<<<<<<<

msprev commented 9 years ago

Ah. I didn't realise that you wanted to use csl as a command line option. I thought it was just a metadata field. In that case, you need:

Base:
    all:
        commandline:
            csl: "`C:\Users\James\.panzer\references\harvard.csl`"
        filter:
            - run: pandoc-citeproc

This should work. panzer uses backticks to pass literal values for all command line arguments as a special case and to get around the problem of pandoc's parser treating all yaml values as markdown. This is a dirty hack to disable markdown parsing, but better than asking the user to always escape command line arguments (which one pretty much never wants treated as markdown).
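
To make the backtick convention concrete, here is a small Python illustration of how values under `commandline` could map onto the pandoc options that appear in the debug logs in this thread (`--smart`, `--standalone`, `--csl=...`). This is not panzer's actual code: `build_options` is a hypothetical helper, and the treatment of `false` values is an assumption.

```python
def build_options(commandline):
    """Turn a parsed `commandline` mapping into a list of pandoc options.

    Hypothetical helper for illustration only; not panzer's implementation.
    """
    options = []
    for name, value in commandline.items():
        if value is True:
            # Boolean switches become bare flags, e.g. smart: true -> --smart
            options.append("--" + name)
        elif value is False or value is None:
            # Assumption: disabled or empty options are simply left out
            continue
        else:
            text = str(value)
            # Strip one pair of surrounding backticks used to mark a literal
            if len(text) >= 2 and text.startswith("`") and text.endswith("`"):
                text = text[1:-1]
            options.append("--%s=%s" % (name, text))
    return options


# Values as they would look after the YAML layer has been parsed,
# i.e. with single backslashes in the path.
example = {
    "smart": True,
    "standalone": True,
    "csl": r"`C:\Users\James\.panzer\references\harvard.csl`",
}

for option in build_options(example):
    print(option)
```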

I completely agree that this is a problem and confusing for users of both pandoc and panzer. I don't see that there is much I can do about it from this side though. My only options are (1) do the yaml parsing inside panzer and permit a literal syntax -- which would introduce likely confusing incompatibilities with yaml handling in pandoc; (2) wait for a literal syntax such as that proposed in the pandoc issue to be implemented in pandoc -- panzer will then automatically take advantage of this.

Please let me know if the suggested fields above work!

mickley commented 9 years ago

Nope, I tried that and it doesn't work.

Double quoting a Windows path even with backticks results in panzer not parsing the entire styles.yaml.

Panzer only works with the single quotes and escaped slashes. In that case, the backticks seem unnecessary. But it doesn't seem to work on pandoc's side.

I'd prefer this to work in metadata, rather than commandline, but I suppose either gets the job done.

msprev commented 9 years ago

My bad. I missed that even inside backticks you need to escape backslashes. The following works fine for me:

commandline:
    csl: "`C:\\Users\\James\\.panzer\\references\\harvard.csl`"

Could you let me know if this does it?

mickley commented 9 years ago

No dice I'm afraid. Everything seems to parse correctly for panzer and pandoc. But the style is the default Chicago again, so pandoc hasn't used it.

2015-10-04 23:59:16,606 - INFO -          pandoc writing with options:
2015-10-04 23:59:16,606 - INFO -            --csl=C:\Users\James\.panzer\references\harvard.csl

mickley commented 9 years ago

Also, still getting the weird error from pandoc, even with a minimal example:

2015-10-04 23:59:16,185 - ERROR - ERROR:     !pandoc: Could not parse YAML header: did not find expected hexdecimal number "source" (line 10, column 32)

Markdown document

---
style: Base
...

# Heading 1

## Heading 2

styles.yaml

Base:
    all:
        commandline:
            smart: true

msprev commented 9 years ago

Ok, let's try to work out what is going on with your minimal example first. It works fine on my computer, so there is something funny going on. One thought is that perhaps you have a mix of tab characters and spaces in your yaml blocks? That really screws up the pandoc yaml parser, so you want to make sure that there are no tabs, only spaces.
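
One quick way to check for this is a few lines of Python. The sketch below is only an illustration (the helper name `find_tab_indentation` is made up here); it reports the line numbers whose leading indentation contains a tab character.

```python
def find_tab_indentation(text):
    """Return 1-based line numbers whose indentation contains a tab."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Leading run of spaces and tabs, i.e. the indentation
        indent = line[:len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            hits.append(number)
    return hits


# The third line is indented with a tab instead of spaces.
sample = "Base:\n    all:\n\tcommandline:\n            smart: true\n"

for number in find_tab_indentation(sample):
    print("line %d is indented with a tab" % number)
```

To check a real file, read it first, for example with `open("styles.yaml", encoding="utf-8").read()`, and pass the result to the function.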

Could you save the following file as test.md:

---
style: Base
...

# Heading 1

## Heading 2

And the following file in the same directory as styles.yaml:

Base:
    all:
        commandline:
            smart: true

Now run the commands:

panzer test.md
pandoc test.md

Both should produce the following output to stdout:

<h1 id="heading-1">Heading 1</h1>
<h2 id="heading-2">Heading 2</h2>

msprev commented 8 years ago

I haven't heard anything further on this issue. Please let me know if the issue is unresolved and I'll reopen it.

mickley commented 8 years ago

Sorry for disappearing.

Yes, everything seems to be working now, and I'm unclear as to what fixed it. For anyone else having the same trouble, here's what I did that worked:

commandline:
    csl: "`C:\\Users\\James\\.panzer\\references\\harvard.csl`"

You can also specify CSL and citation abbreviation files without the full path. Pandoc-citeproc will look for them in the current directory first, but then will look in %USERPROFILE%\AppData\Roaming\csl (or $HOME/.csl on unix).
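
The lookup order described above can be written out as a short Python sketch. It only lists the candidate locations in the order given in this comment; it is not pandoc-citeproc's own code, and the function name `csl_candidates` and its parameters are hypothetical.

```python
from pathlib import PurePosixPath, PureWindowsPath


def csl_candidates(name, cwd, home, windows):
    """List where a bare CSL file name would be looked for, in order.

    Assumption: the order is the one described in this thread, i.e. the
    current directory first, then the per-user csl directory.
    """
    if windows:
        cwd_path, home_path = PureWindowsPath(cwd), PureWindowsPath(home)
        # %USERPROFILE%\AppData\Roaming\csl
        user_dir = home_path / "AppData" / "Roaming" / "csl"
    else:
        cwd_path, home_path = PurePosixPath(cwd), PurePosixPath(home)
        # $HOME/.csl
        user_dir = home_path / ".csl"
    return [str(cwd_path / name), str(user_dir / name)]


for path in csl_candidates("harvard.csl", r"C:\work", r"C:\Users\James", True):
    print(path)
```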

I don't think this will work with other files such as bibliography or reference-docx though.

msprev commented 8 years ago

Great! I'm glad to hear that it is working ok! The tabs vs spaces one is a common gotcha with yaml files. It happens to me all the time.