Closed zeryx closed 8 years ago
Pandoc cannot read `doc` files, only `docx` and `odt`. Can you open the files in question in Word or OpenOffice/LibreOffice? Can you post a link to them?
Also, pandoc only gained the ability to read `docx` and `odt` files in the last couple of years. What version (`pandoc -v`) are you running?
Never mind, `docx` and `odt` are viable formats to parse from and to; I needed to pass the `-f` and `-t` flags for it to function properly.
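For anyone landing here later, a minimal sketch of what passing the reader and writer explicitly can look like. The function names `reader_for` and `convert_to_markdown` and the file names in the usage line are hypothetical, made up for illustration; only `-f`, `-t` and `-o` are pandoc's own options.

```shell
#!/bin/sh
# Hypothetical helper: map a file extension to a pandoc reader name.
# Only docx and odt are handled, since those are the formats discussed here.
reader_for() {
    case "$1" in
        *.docx) echo docx ;;
        *.odt)  echo odt ;;
        *)      echo "unsupported input: $1" >&2; return 1 ;;
    esac
}

# Hypothetical wrapper: convert a file to Markdown, naming the reader (-f)
# and the writer (-t) explicitly instead of relying on format guessing.
convert_to_markdown() {
    in=$1
    out=$2
    fmt=$(reader_for "$in") || return 1
    pandoc -f "$fmt" -t markdown "$in" -o "$out"
}
```

Usage would be something like `convert_to_markdown report.docx report.md`. A legacy `report.doc` is rejected by the helper before pandoc is ever called.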
Just FYI, my output from `pandoc -v` is:
$ pandoc -v
pandoc 1.16.0.2
Compiled with texmath 0.8.4.1, highlighting-kate 0.6.1.
Syntax highlighting is supported for the following languages:
abc, actionscript, ada, agda, apache, asn1, asp, awk, bash, bibtex, boo, c,
changelog, clojure, cmake, coffee, coldfusion, commonlisp, cpp, cs, css,
curry, d, diff, djangotemplate, dockerfile, dot, doxygen, doxygenlua, dtd,
eiffel, email, erlang, fasm, fortran, fsharp, gcc, glsl, gnuassembler, go,
haskell, haxe, html, idris, ini, isocpp, java, javadoc, javascript, json,
jsp, julia, kotlin, latex, lex, lilypond, literatecurry, literatehaskell,
llvm, lua, m4, makefile, mandoc, markdown, mathematica, matlab, maxima,
mediawiki, metafont, mips, modelines, modula2, modula3, monobasic, nasm,
noweb, objectivec, objectivecpp, ocaml, octave, opencl, pascal, perl, php,
pike, postscript, prolog, pure, python, r, relaxng, relaxngcompact, rest,
rhtml, roff, ruby, rust, scala, scheme, sci, sed, sgml, sql, sqlmysql,
sqlpostgresql, tcl, tcsh, texinfo, verilog, vhdl, xml, xorg, xslt, xul,
yacc, yaml, zsh
Default user data directory: /home/james/.pandoc
Copyright (C) 2006-2015 John MacFarlane
Web: http://pandoc.org
This is free software; see the source for copying conditions.
There is no warranty, not even for merchantability or fitness
for a particular purpose.
My version was taken directly from the Ubuntu 16.04 package cache.
I'm trying to convert a locally made `doc` file (created with OpenOffice) to either HTML or Markdown. Both conversions fail with:
pandoc: Cannot decode byte '\xff': Data.Text.Encoding.Fusion.streamUtf8: Invalid UTF-8 stream
file -bi charset output:
iconv -f binary -t utf-8 output:
Therefore it seems there is no way to use a `doc` file with pandoc, unless I'm missing something critical. Any help would be greatly appreciated!

PS: this also fails when parsing `odt` and `docx` files produced by OpenOffice via "Save As"; all other formats work fine.
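One way to check what kind of file is actually being handed to pandoc is to look at its first bytes: `docx` and `odt` are ZIP containers, while a legacy `doc` is an OLE compound file. The sketch below is only an illustration. `container_kind` and `doc_to_docx` are hypothetical names, and the second function assumes LibreOffice is installed and reachable as `soffice`.

```shell
#!/bin/sh
# Hypothetical helper: classify a document by its first four bytes.
container_kind() {
    magic=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
    case "$magic" in
        504b0304) echo zip ;;      # ZIP container, e.g. docx or odt
        d0cf11e0) echo ole ;;      # OLE compound file, e.g. legacy doc
        *)        echo unknown ;;
    esac
}

# Hypothetical helper: convert a legacy doc to docx with LibreOffice in
# headless mode, writing the result next to the input file.
# Assumes the soffice binary is on PATH.
doc_to_docx() {
    soffice --headless --convert-to docx --outdir "$(dirname "$1")" "$1"
}
```

If `container_kind` reports `ole`, converting with `doc_to_docx` first and then running pandoc on the resulting `docx` may be a workable route.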