JessicaTegner / pypandoc

Thin wrapper for "pandoc" (MIT)
http://pypi.python.org/pypi/pypandoc/

pypandoc ignores lua filters #367

Open knauthe opened 3 months ago

knauthe commented 3 months ago

I'm using pypandoc to convert HTML to Markdown from Python, and I'm trying to make it apply Lua filters. However, none of the filters are applied. I put checkpoints at the top of every function and none of them are reached, so the filters aren't even called, even though pypandoc registers them. I know that pypandoc registers the filters because when I put a syntax error into one of my Lua scripts, my pypandoc call throws an error that specifically references that Lua script.

Here is my pypandoc call:


import os
import pypandoc
filters = ["lua/" + f for f in os.listdir("lua")]
md = pypandoc.convert_file(source_file=sourcepath, format='html', to='gfm-raw_html', filters=filters)

Here are some of my filters:

lua/Nav.lua:


function Nav(el)
    return {}
end

lua/Img.lua:


function Img(el)
    if el.attributes.alt == "question_mark" then
        return ' "?" '
    else
        return {}
    end
end

What I tried: Using lua filters in pypandoc.

Expected behavior: The filters get applied.

Actual behavior: The filters are ignored. Pypandoc converts files without applying any filters. No errors are thrown.
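
One way to narrow this down is to make the filter list explicit before handing it to pypandoc. The sketch below is only an illustration, not pypandoc's own behavior: `collect_lua_filters` and `convert_with_filters` are hypothetical helper names, and the `lua` directory and the `gfm-raw_html` target are taken from the call above. It keeps only regular files ending in `.lua`, sorts them so the order is predictable, and uses absolute paths so the result does not depend on the working directory.

```python
import os


def collect_lua_filters(filter_dir):
    """Return sorted absolute paths of the .lua files directly inside filter_dir."""
    paths = []
    for name in sorted(os.listdir(filter_dir)):
        full = os.path.abspath(os.path.join(filter_dir, name))
        # Skip subdirectories and non-Lua files (backups, READMEs, ...).
        if os.path.isfile(full) and name.endswith(".lua"):
            paths.append(full)
    return paths


def convert_with_filters(source_path, filter_dir="lua"):
    """Convert an HTML file to GFM, passing every Lua filter found in filter_dir."""
    import pypandoc  # imported lazily so collect_lua_filters works without it

    return pypandoc.convert_file(
        source_path,
        to="gfm-raw_html",
        format="html",
        filters=collect_lua_filters(filter_dir),
    )
```

Printing the result of `collect_lua_filters("lua")` shows exactly which files would be passed on. Separately, pandoc calls Lua filter functions by matching their names against AST element names such as `Image` or `Div`, so functions named `Img` or `Nav` would likely never be called for any element, even when the filter file itself is loaded.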

JessicaTegner commented 3 months ago

Hi @knauthe that seems very strange. Can you try the following

If that still doesn't work, could you confirm your version of pandoc (by running pandoc -v) and your version of pypandoc?
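
For reference, both versions can also be collected from Python using only the standard library. This is a rough sketch: `installed_version` and `pandoc_version_line` are hypothetical helper names, and the second one assumes the `pandoc` executable is on `PATH`, which may not be the pandoc that pypandoc actually uses if a bundled binary is installed.

```python
import shutil
import subprocess
from importlib import metadata


def installed_version(package):
    """Return the installed version of a distribution, or None if it is not found."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def pandoc_version_line(executable="pandoc"):
    """Return the first line of `pandoc -v`, or None if the executable is not on PATH."""
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run([path, "-v"], capture_output=True, text=True)
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    print("pypandoc:", installed_version("pypandoc"))
    print("pandoc:  ", pandoc_version_line())
```

A result of `None` for pypandoc means the package metadata is missing from the current environment, which is worth checking before debugging the filters themselves.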

Thanks for raising this issue.

knauthe commented 3 months ago

So apparently pypandoc was not correctly installed. I ran pip show pypandoc and got an error, so I ran pip install pypandoc again. The strange thing is that it still worked before, just without the filters. Anyway, now the filters seem to run, at least in principle. Thank you.

Also, I noticed an issue with text wrapping. According to the Pandoc manual the default is supposed to be --wrap=none, but in pypandoc I had to set extra_args=['--wrap=none'] manually. In addition, a lot of empty lines appear out of nowhere when the output of convert_file is stored in a string and then written to a file. This doesn't happen when outputfile is set.
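
On the wrapping point, the Pandoc manual documents `auto` as the default for `--wrap`, so having to pass `--wrap=none` explicitly is expected. For the extra empty lines, one possible cause (an assumption, not confirmed in this thread) is that the returned string contains `\r\n` line endings and Python's text mode translates newlines again on write. The sketch below, with hypothetical helper names `normalize_newlines` and `write_markdown`, normalizes the line endings and disables that translation.

```python
def normalize_newlines(text):
    """Convert CRLF and lone CR line endings to LF."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


def write_markdown(text, path):
    """Write text with exactly one LF per line ending, regardless of platform."""
    # newline="" turns off Python's newline translation on write, so "\n"
    # is not expanded to "\r\n" a second time on Windows.
    with open(path, "w", encoding="utf-8", newline="") as handle:
        handle.write(normalize_newlines(text))
```

With this in place, the string returned by `convert_file(..., extra_args=['--wrap=none'])` could be passed to `write_markdown` instead of being written with a plain `open(path, "w")`.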