Closed pete-mckinney closed 12 years ago
This error means that your input is not UTF-8 encoded. It has nothing to do with the specific content.
Use your text editor, or iconv or another tool to convert the text to UTF-8.
The next version of pandoc will have a more informative error message for this!
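For anyone who would rather script the conversion than use an editor or iconv, here is a minimal Python sketch. The function name convert_to_utf8 and the default source encoding cp1252 (Windows-1252) are assumptions made for illustration. The thread never says which encoding test2.txt actually used, so substitute the real one.

```python
from pathlib import Path


def convert_to_utf8(src, dst, source_encoding="cp1252"):
    """Re-encode a text file as UTF-8 and return the decoded text.

    source_encoding is a guess (hypothetical default); set it to the
    encoding the file really uses.
    """
    raw = Path(src).read_bytes()
    # May raise UnicodeDecodeError if a byte is undefined in source_encoding.
    text = raw.decode(source_encoding)
    Path(dst).write_bytes(text.encode("utf-8"))
    return text
```

Calling convert_to_utf8("test2.txt", "test2-utf8.txt") and then running pandoc on test2-utf8.txt should get past the hGetContents error, provided the guessed source encoding is right. A wrong guess can still decode without error and silently produce the wrong characters, so check the output.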
+++ pete-mckinney [Sep 27 12 06:57 ]:
I have some source material that I'm trying to convert into an epub. It uses punctuation that is giving pandoc fits.
Here is a sample:
Chapter 1
This is some text. Doesn't like an em dash -- does not. "Also does not like fancy quotes."
This gives the following error:
pandoc: test2.txt: hGetContents: invalid argument (invalid byte sequence)
It would be nice if pandoc would consume this. It would also be nice if pandoc would give the line number and column of the data that it considers invalid.
Thanks!
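Until pandoc reports a position itself, the line and column asked for above can be found outside pandoc. The following is a rough sketch, not pandoc's own logic; find_invalid_utf8 is a made-up helper name, and the column is counted in bytes rather than characters.

```python
def find_invalid_utf8(path):
    """Return (line, column, byte_value) of the first invalid UTF-8 byte, or None.

    line and column are 1-based; column counts bytes, not characters.
    """
    with open(path, "rb") as f:
        data = f.read()
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        offset = err.start  # index of the first byte that failed to decode
        line = data.count(b"\n", 0, offset) + 1
        # rfind returns -1 when no newline precedes the offset, giving 0 here.
        line_start = data.rfind(b"\n", 0, offset) + 1
        column = offset - line_start + 1
        return line, column, data[offset]
    return None
```

A result of None means the whole file decoded as UTF-8, in which case the encoding is not what is upsetting pandoc.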
That did the trick, thanks!
I know this bug is a couple of years old, but I'm having the same problem and iconv doesn't fix it. Even more interesting, the document I'm getting this error on is one that pandoc itself created! I took a LaTeX document, converted it to ODT, MD, DOCX, and a few other formats for testing, and then tried to convert them back into LaTeX. No go! Every format, all produced by pandoc itself, exits with the same error:
pandoc:
(Note, the "
Ideas folks?
(Note: Running Ubuntu 12.04lts)
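One thing that may be worth ruling out here (a guess, not something confirmed in this thread): DOCX and ODT files are ZIP archives rather than text. If the installed pandoc has no reader for those formats, it would be handed binary data, and iconv cannot turn that into UTF-8. The sketch below is illustrative only and describe_input is a made-up helper name. It separates ZIP containers from text files that are merely in the wrong encoding.

```python
import zipfile


def describe_input(path):
    """Roughly classify a file as a ZIP container, UTF-8 text, or neither."""
    if zipfile.is_zipfile(path):
        # DOCX, ODT and EPUB files are all ZIP archives.
        return "zip container"
    with open(path, "rb") as f:
        data = f.read()
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "not UTF-8"
    return "UTF-8 text"
```

If a file comes back as "zip container", re-encoding it will not help. The question then becomes whether the pandoc version in use can read that format at all.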
Can you please post your input if possible? If not, then a minimal example that highlights the problem?
Making a new issue wouldn't be a bad idea either.
Also, what do 'pandoc --version' and 'locale' report?
Thanks for getting back to me so promptly John.
First, let me correct the record. I mistakenly included markdown in my list, and should not have. Pandoc reverse processed the md formatted file just fine. Moving on...
Okay... 'pandoc --version' returns:
pandoc 1.9.1.1
Compiled with citeproc-hs 0.3.4, texmath 0.6.0.3, highlighting-kate 0.5.0.5.
Syntax highlighting is supported for the following languages:
Actionscript, Ada, Alert, Alert_indent, Apache, Asn1, Asp, Awk, Bash, Bibtex, Boo, C, Changelog, Clojure, Cmake, Coffeescript, Coldfusion, Commonlisp, Cpp, Cs, Css, D, Diff, Djangotemplate, Doxygen, Dtd, Eiffel,
Email, Erlang, Fortran, Fsharp, Gnuassembler, Go, Haskell, Haxe, Html, Ini,
Java, Javadoc, Javascript, Json, Jsp, Latex, Lex, LiterateHaskell, Lua, Makefile, Mandoc, Matlab, Maxima, Metafont, Mips, Modula2, Modula3, Monobasic, Nasm, Noweb, Objectivec, Objectivecpp, Ocaml, Octave, Pascal,
Perl, Php, Pike, Postscript, Prolog, Python, R, Relaxngcompact, Rhtml, Ruby,
Scala, Scheme, Sci, Sed, Sgml, Sql, SqlMysql, SqlPostgresql, Tcl, Texinfo,
Verilog, Vhdl, Xml, Xorg, Xslt, Xul, Yacc, Yaml
Copyright (C) 2006-2012 John MacFarlane
Web: http://johnmacfarlane.net/pandoc
This is free software; see the source for copying conditions. There is no
warranty, not even for merchantability or fitness for a particular purpose.
This is, I hope, the latest version, as it's the one Canonical has up. :)
locale reports:
LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
As it should (unless someone has been mucking with my system settings!).
As for posting my input, the LaTeX file is a book length manuscript, so let me create a smaller file and put that up here in a bit.
The latest version is 1.12.4! and we're going to release 1.13 in the very very near future. Maybe try with an updated version first?
Yeah I'm serious! WTF is up with Ubuntu and Canonical?!?!?
Would you happen to have a URL for a private repo I can add for apt to get your updates automagically?
-=Michael=- Metaphor Publications
Add me to your address book: http://ourteam.com/mjmatson
Been doing some Google cruising. I haven't found an apt repo yet, but I have found evidence that Ubuntu isn't doing a very good job at all of keeping pandoc up to date: http://stackoverflow.com/questions/24863160/trouble-with-pandoc-installation-on-ubuntu-14-04lts-for-using-with-r-markdown
(Note: I'm running 12.04lts, not 14.04lts (we want to finish this book series before upgrading), but IMO that only makes the above problem worse, as 14.04 was just released! It should have the latest version of your program!)