jgm / pandoc

Universal markup converter
https://pandoc.org

UTF8 Decoding issue, Please fix This bug! #9379

Closed AkarinLiu closed 9 months ago

AkarinLiu commented 9 months ago

Explain the problem. When I run pandoc a.md --template template.docx -o b.docx, pandoc prints this error message:

UTF-8 decoding error in .\template.docx at byte offset 10 (f0).
The input must be a UTF-8 encoded text.

I get the same error when I use a default template generated by pandoc.

Pandoc version?

pandoc.exe 3.1.11.1
Features: +server +lua
Scripting engine: Lua 5.4
User data directory: C:\Users\AkarinLiu\AppData\Roaming\pandoc
Copyright (C) 2006-2023 John MacFarlane. Web: https://pandoc.org
This is free software; see the source for copying conditions. There is no
warranty, not even for merchantability or fitness for a particular purpose.

jgm commented 9 months ago

Well, is the input UTF-8 encoded? If it isn't, then this isn't a bug. Pandoc (as documented) expects its templates to be UTF-8 encoded plain text.

It looks like you're trying to use a docx file as a template. Please see the documentation for --reference-doc.
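
For readers landing here with the same problem, the two pandoc invocations that the manual describes for styling docx output can be sketched as follows. Since pandoc 2.0 the option is spelled --reference-doc. The helper names `reference_doc_commands` and `run_all` and the file name custom-reference.docx are assumptions made for illustration; only the pandoc flags themselves come from the manual.

```python
import subprocess


def reference_doc_commands(source: str, reference: str, output: str) -> list[list[str]]:
    """Build two pandoc invocations: export the default reference.docx, then convert with it.

    Hypothetical helper; the argument lists mirror the commands shown in the pandoc manual.
    """
    return [
        # Step 1: write pandoc's built-in reference.docx to `reference` so its styles can be edited.
        ["pandoc", "-o", reference, "--print-default-data-file", "reference.docx"],
        # Step 2: convert `source`, taking styles from the (edited) reference document.
        ["pandoc", source, "--reference-doc", reference, "-o", output],
    ]


def run_all(commands: list[list[str]]) -> None:
    """Run each command in order; raises CalledProcessError if pandoc exits non-zero."""
    for argv in commands:
        subprocess.run(argv, check=True)
```

Calling `run_all(reference_doc_commands("a.md", "custom-reference.docx", "b.docx"))` needs pandoc on the PATH and an existing a.md. In practice you would run the first command once, adjust the styles in a word processor, and then reuse the second command for each conversion.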

AkarinLiu commented 9 months ago

But pandoc's output is not UTF-8 encoded, even with global UTF-8 support enabled in Windows. Maybe it is a bug.

silverhook commented 2 months ago

I can confirm this bug on Pandoc 3.1.9 (on Linux)

I tried --template with an .otp file and got the following error:

UTF-8 decoding error in ../predloge/SPOZNAJ 2024 template.otp at byte offset 14 (ef).
The input must be a UTF-8 encoded text.

I opened the ODF file as a Zip archive and checked the XML inside it. Every file I checked declared UTF-8 as its encoding in the XML header, and my text editor (Kate) also identified the files as UTF-8-encoded.

Oddly enough, I tried doing the same with a .pptx version of the same file and --reference-doc and that one worked.

jgm commented 2 months ago

See above: the argument of --template needs to be a text file in pandoc's template syntax, and it needs to be UTF-8. You can't use any kind of binary file, including a zip.
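
To illustrate the distinction, here is a minimal sketch of a pre-flight check that tells a ZIP container (which is what .docx, .otp and .pptx files are) apart from UTF-8 text. The function names `classify_template` and `make_fake_odf` are hypothetical, and this is not how pandoc itself performs the check. It only shows that the XML members inside an archive can be valid UTF-8 while the archive as a whole is binary. The ZIP local file header stores the modification time, date and CRC-32 in bytes 10 to 17, so, assuming offsets count from zero, the offsets 10 and 14 in the two error reports above fall where arbitrary binary values would be expected.

```python
import io
import zipfile

ZIP_MAGIC = b"PK\x03\x04"  # local file header signature that starts a non-empty ZIP archive


def classify_template(data: bytes) -> str:
    """Return 'zip', 'utf8-text', or 'not-utf8' for the raw bytes of a candidate template."""
    if data.startswith(ZIP_MAGIC):
        return "zip"
    try:
        data.decode("utf-8")  # strict decoding: any invalid byte sequence raises
    except UnicodeDecodeError:
        return "not-utf8"
    return "utf8-text"


def make_fake_odf() -> bytes:
    """Build a tiny ZIP in memory with one UTF-8 XML member (a stand-in, not a real .otp)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("content.xml", '<?xml version="1.0" encoding="UTF-8"?><doc>čšž</doc>')
    return buf.getvalue()


archive = make_fake_odf()
member = zipfile.ZipFile(io.BytesIO(archive)).read("content.xml")

print(classify_template(member))                      # utf8-text: the XML inside the archive
print(classify_template(archive))                     # zip: the archive itself
print(classify_template("$body$\n".encode("utf-8")))  # utf8-text: a plain-text pandoc template
```

A file that classifies as zip belongs with --reference-doc, while --template expects a file in the utf8-text category that is written in pandoc's template syntax.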

silverhook commented 2 months ago

@jgm, ah, OK, I misunderstood then. I apologise.