ftilmann / latexdiff

Compares two latex files and marks up significant differences between them. Releases on www.ctan.org and mirrors
GNU General Public License v3.0

Error 'Invalid UTF-8 byte sequence' when using non-ascii characters in verbatim block #304

Status: Closed (anka-213 closed 1 month ago)

anka-213 commented 1 month ago

MWE, consisting of two files.

verbatim-nonascii-1.tex:

\documentclass{article}
\usepackage[utf8]{inputenc}
\begin{document}
\begin{verbatim}
Test åäö.
\end{verbatim}
\end{document}

verbatim-nonascii-2.tex:

\documentclass{article}
\usepackage[utf8]{inputenc}
\begin{document}
\begin{verbatim}
Test åäö.
Abc
\end{verbatim}
\end{document}

Running latexdiff on these two files generates:

\documentclass{article}
%DIF LATEXDIFF DIFFERENCE FILE
%DIF DEL verbatim-nonascii-1.tex   Mon Jul 15 21:28:35 2024
%DIF ADD verbatim-nonascii-2.tex   Mon Jul 15 21:30:02 2024
\usepackage[utf8]{inputenc}
%DIF PREAMBLE EXTENSION ADDED BY LATEXDIFF
%DIF UNDERLINE PREAMBLE %DIF PREAMBLE
\RequirePackage[normalem]{ulem} %DIF PREAMBLE
\RequirePackage{color}\definecolor{RED}{rgb}{1,0,0}\definecolor{BLUE}{rgb}{0,0,1} %DIF PREAMBLE
\providecommand{\DIFadd}[1]{{\protect\color{blue}\uwave{#1}}} %DIF PREAMBLE
\providecommand{\DIFdel}[1]{{\protect\color{red}\sout{#1}}}                      %DIF PREAMBLE
%DIF SAFE PREAMBLE %DIF PREAMBLE
\providecommand{\DIFaddbegin}{} %DIF PREAMBLE
\providecommand{\DIFaddend}{} %DIF PREAMBLE
\providecommand{\DIFdelbegin}{} %DIF PREAMBLE
\providecommand{\DIFdelend}{} %DIF PREAMBLE
\providecommand{\DIFmodbegin}{} %DIF PREAMBLE
\providecommand{\DIFmodend}{} %DIF PREAMBLE
%DIF FLOATSAFE PREAMBLE %DIF PREAMBLE
\providecommand{\DIFaddFL}[1]{\DIFadd{#1}} %DIF PREAMBLE
\providecommand{\DIFdelFL}[1]{\DIFdel{#1}} %DIF PREAMBLE
\providecommand{\DIFaddbeginFL}{} %DIF PREAMBLE
\providecommand{\DIFaddendFL}{} %DIF PREAMBLE
\providecommand{\DIFdelbeginFL}{} %DIF PREAMBLE
\providecommand{\DIFdelendFL}{} %DIF PREAMBLE
%DIF COLORLISTINGS PREAMBLE %DIF PREAMBLE
\RequirePackage{listings} %DIF PREAMBLE
\RequirePackage{color} %DIF PREAMBLE
\lstdefinelanguage{DIFcode}{ %DIF PREAMBLE
%DIF DIFCODE_UNDERLINE %DIF PREAMBLE
  moredelim=[il][\color{red}\sout]{\%DIF\ <\ }, %DIF PREAMBLE
  moredelim=[il][\color{blue}\uwave]{\%DIF\ >\ } %DIF PREAMBLE
} %DIF PREAMBLE
\lstdefinestyle{DIFverbatimstyle}{ %DIF PREAMBLE
    language=DIFcode, %DIF PREAMBLE
    basicstyle=\ttfamily, %DIF PREAMBLE
    columns=fullflexible, %DIF PREAMBLE
    keepspaces=true %DIF PREAMBLE
} %DIF PREAMBLE
\lstnewenvironment{DIFverbatim}{\lstset{style=DIFverbatimstyle}}{} %DIF PREAMBLE
\lstnewenvironment{DIFverbatim*}{\lstset{style=DIFverbatimstyle,showspaces=true}}{} %DIF PREAMBLE
%DIF END PREAMBLE EXTENSION ADDED BY LATEXDIFF

\begin{document}
\DIFmodbegin
\begin{DIFverbatim}[alsolanguage=DIFcode]
Test åäö.
%DIF > Abc
\end{DIFverbatim}
\DIFmodend
\end{document}

The first two tex-files compile just fine, but when compiling the generated code I get the error:

! LaTeX Error: Invalid UTF-8 byte sequence (�\lst@EC�).

anka-213 commented 1 month ago

Oh, I see, the issue is apparently with the listings package not being compatible with utf8 inputenc: https://tex.stackexchange.com/questions/24528/having-problems-with-listings-and-utf-8-can-it-be-fixed :(
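
For readers hitting the same error, one workaround often suggested for this incompatibility is the `literate` key of listings, which replaces each non-ASCII character with an ordinary LaTeX command before listings processes it. The following is only a minimal sketch, assuming pdflatex and that only the three characters from the MWE need mapping. It uses a plain `lstlisting` environment rather than latexdiff's generated `DIFverbatim`, and it is not the fix that latexdiff itself adopted.

```latex
\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage{listings}
% Assumption: only the characters from the MWE appear in listings.
% Each entry has the form {character}{{replacement}}width.
\lstset{
  extendedchars=true,
  literate={å}{{\aa}}1 {ä}{{\"a}}1 {ö}{{\"o}}1
}
\begin{document}
\begin{lstlisting}
Test åäö.
Abc
\end{lstlisting}
\end{document}
```

Every additional non-ASCII character needs its own `literate` entry, so this approach is best suited to documents that use a small, known set of such characters.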

anka-213 commented 1 month ago

I managed to get it working with this version:

\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage{listings}
\lstset{
extendedchars=\true,
inputencoding=utf8
}
\begin{document}
\begin{verbatim}
Test åäö.
Abc
\end{verbatim}
\end{document}

ftilmann commented 1 month ago

Thanks for reporting this and essentially solving it as well. The fix is included in the commit.