ftilmann / latexdiff

Compares two latex files and marks up significant differences between them. Releases on www.ctan.org and mirrors
GNU General Public License v3.0

Not handling siunitx on tables, the diff macros are not detected as numbers #238

Open adinriv opened 2 years ago

adinriv commented 2 years ago

There is a problem when diff'ing tables that use the siunitx S column type.

The siunitx package does not recognize the added diff macros as part of the number, and it reports the following error:

Package siunitx Error: Invalid number '.55('. ...DIFaddbeginFL \DIFaddFL{4}\DIFaddendFL ) &

Is there a way to have the diff macros treated as markup, or to wrap them around the whole cell? However, wrapping the whole cell would still not fix the error, because the contents are not detected as a number.

Here is a MWE to show the issue.

v1.tex:

\documentclass{article}

\usepackage{siunitx}
\sisetup{
  table-format=1.2(1),
  separate-uncertainty,
}

\begin{document}
\begin{table}
  \begin{tabular}{SS}
    .55(5) & .13(1)
  \end{tabular}
\end{table}
\end{document}

v2.tex:

\documentclass{article}

\usepackage{siunitx}
\sisetup{
  table-format=1.2(1),
  separate-uncertainty,
}

\begin{document}
\begin{table}
  \begin{tabular}{SS}
    .55(4) & .45(2)
  \end{tabular}
\end{table}
\end{document}

ftilmann commented 2 years ago

I know this issue is old, but I am just working through the backlog of issues. Thank you for supplying a very clear MWE. The problem here is that latexdiff treats ( and ) as normal text, while they have a special meaning in the particular context of an siunitx table column. As parentheses also appear in most normal text, preprocessing would need to turn this syntax into something the core algorithm is able to understand:

  1. Detect when we are in an S-type column.
  2. Transform something like .55(4) into a pseudo-command \SITABLEUNCERTAINTY{.55(4)} and declare \SITABLEUNCERTAINTY as a safe command.
  3. In post-processing, turn \SITABLEUNCERTAINTY{.55(4)} back into the standard form.

This would mean that these table entries would be treated atomically, i.e. if either the value or the uncertainty changed, the whole number plus uncertainty would be marked as changed. I guess that would be acceptable.

I experimented with placing \DIFadd/del commands around just the number or just the uncertainty, but this did not work. Step 1 would be quite tricky with the current setup. Instead, I propose to detect any instance of the regular pattern [0-9]*\.[0-9]*\([0-9]+\) and transform it as in steps 2 and 3 above. The pattern should occur rarely, if ever, in normal text, and even if it does, the consequences are not dramatic.
Before I implement this, please let me know whether this pattern is the only relevant one, or whether, e.g., spaces are allowed between the number and the parentheses, or whether the decimal dot can also appear inside the parentheses.
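
To make steps 2 and 3 concrete, here is a minimal sketch of the wrap/unwrap round trip. It is written in Python purely for illustration (latexdiff itself is a Perl script, and an eventual implementation may well look different). The function names wrap_uncertainties and unwrap_uncertainties are hypothetical, and the pattern is the one proposed above with the dot escaped, which is an assumption about what siunitx input needs to be covered.

```python
import re

# Hypothetical helper names; this is not latexdiff's actual code.
# Assumed pattern: optional digits, a decimal dot, optional digits,
# then a parenthesised integer uncertainty, e.g. .55(4) or 1.20(13).
UNCERTAINTY = re.compile(r"([0-9]*\.[0-9]*\([0-9]+\))")

# Matches the pseudo-command with a brace-free argument.
WRAPPED = re.compile(r"\\SITABLEUNCERTAINTY\{([^{}]*)\}")


def wrap_uncertainties(tex: str) -> str:
    """Pre-processing (step 2): hide value(uncertainty) inside a pseudo-command."""
    # A function is used as the replacement so that the backslash is
    # inserted literally rather than interpreted as a regex escape.
    return UNCERTAINTY.sub(
        lambda m: "\\SITABLEUNCERTAINTY{" + m.group(1) + "}", tex
    )


def unwrap_uncertainties(tex: str) -> str:
    """Post-processing (step 3): restore the standard siunitx form."""
    return WRAPPED.sub(lambda m: m.group(1), tex)


if __name__ == "__main__":
    row = "    .55(4) & .45(2)"
    hidden = wrap_uncertainties(row)
    print(hidden)
    print(unwrap_uncertainties(hidden) == row)
```

Because the pattern requires a dot directly followed by optional digits and an opening parenthesis, ordinary prose such as "see Fig. (a)" is left alone. Variants with spaces before the parenthesis or a decimal dot inside it would not be matched by this sketch.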