mquinson / po4a

Maintain the translations of your documentation with ease (PO for anything)
http://po4a.org/
GNU General Public License v2.0
123 stars 61 forks source link

Incorrect handling of asciidoc tables with empty elements #343

Closed mquinson closed 2 years ago

mquinson commented 2 years ago

Hello,

I've got a bug report from Debian which reads as follows. Please see https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1002303 for the original bug report, which I inline below.

[Petter Reinholdtsen] I ran into something while setting up po4a in the LinuxCNC project, which I suspect is a bug in po4a handling of AsciiDoc tables.

This set of test files demonstrate the problem, where the translated table format is messed up (lacking newlines). Note, adding '.' at the end of the lines with directions avoid the problem.

% cat table.adoc
= table test

.HAL Option Names
[width="100%", cols="<3s,4*<"]
|===========================================================
|Pin and Parameter Types: |HAL_BIT |HAL_FLOAT |HAL_S32 |HAL_U32
|Pin Directions:          |HAL_IN  |HAL_OUT   |HAL_IO  |
|Parameter Directions:    |HAL_RO  |HAL_RW    |        |
|===========================================================
% cat po4a.cfg
[po4a_langs] fr
[po4a_paths] table.pot $lang:$lang.po

[po4a_alias:AsciiDoc_def] AsciiDoc opt:"--keep 0 --option 'entry=lang' --option 'tablecells'"

[type: AsciiDoc_def] table.adoc $lang:$lang/table.adoc
% po4a po4a.cfg; cat fr/table.adoc 
= table test

.HAL Option Names
[width="100%", cols="<3s,4*<"]
|===========================================================
|Pin and Parameter Types: |HAL_BIT |HAL_FLOAT |HAL_S32 |HAL_U32
|Pin Directions:          |HAL_IN  |HAL_OUT   |HAL_IO  ||Parameter Directions:    |HAL_RO  |HAL_RW    |        ||===========================================================
%

Note, a workaround for the issue is to place a space character after the last '|' character on the table rows with empty fields in the right column.

jnavila commented 2 years ago

Oh, yes... Empty cells are quite unexpected.

mquinson commented 2 years ago

Follow up from Petter on the Debian bug:

I came across another problematic table. Seem to me the AsciiDoc parser interpret zero as empty in the table cells. This table:

.Look Up Table
[width="50%",cols="6*^",options="header"]
|====================================
|Bit 4|Bit 3|Bit 2|Bit 1|Bit 0|Output
|0|0|0|0|0|0
|0|0|0|0|1|1
|0|0|0|1|0|0
|0|0|0|1|1|1
|====================================

it is 'translated' into this table:

.Look Up Table
[width="50%", cols="6*^", options="header"]
|====================================
|Bit 4|Bit 3|Bit 2|Bit 1|Bit 0|Output
|||||||||||1|1
||||1||||||1|1|1
|====================================

As you can see, not quite what was intended. I tried to add space in front of the 0 elements, but it did not make a difference. Is it the perl evaluation of 'false' that messes things up?

mquinson commented 2 years ago

Yet another follow-up:

I found a third mishandled table style in the LinuxCNC documentation. When po4a is given this table from remap.adoc:

# remap.adoc
[format="csv",width="60%",cols="2"]
[frame="topbot",grid="none"]
[options="header"]
|===
iocontrol.0 pin  ,motion pin    
tool-prepare,digital-out-00
tool-prepared,digital-in-00  
tool-change,digital-out-01
tool-changed,digital-in-01  
tool-prep-number,analog-out-00  
tool-prep-pocket,analog-out-01  
tool-number,analog-out-02  
|===

we end up with this translated table:

[format="csv", width="60%", cols="2"]
[frame="topbot", grid="none"]
[options="header"]
|===

iocontrol.0 pin  ,motion pin    
tool-prepare,digital-out-00
tool-prepared,digital-in-00  
tool-change,digital-out-01
tool-changed,digital-in-01  
tool-prep-number,analog-out-00  
tool-prep-pocket,analog-out-01  
tool-number,analog-out-02  
|===

Sadly the extra newline after |=== cause the asciidoc tool to reject the translated table.

jnavila commented 2 years ago

Note, a workaround for the issue is to place a space character after the last '|' character on the table rows with empty fields in the right column.

This is due to these lines: https://github.com/mquinson/po4a/blob/7fe0be4fbeac933c4d3e7167338384ed4b35589f/lib/Locale/Po4a/AsciiDoc.pm#L449-L452

Empty string is equal to no strings, which is the logic taken in general in this part of code. Is your work around doable in the sources?

TBH and not minding my own business, using a table for formatting purpose is usually considered bad practice and harmful. These are truly enumeration lists.

jnavila commented 2 years ago

[format="csv",width="60%",cols="2"]

CSV style tables are not supported. Table cells support is very crude and will likely not support other styles than the common PSV style.

jnavila commented 2 years ago

344 also fixes cells containing "0"... Another idiosyncrasy of Perl... :cry:

mquinson commented 2 years ago

Can we maybe raise an error message when presented with a CSV table? I think that it's better than silently failing.

jnavila commented 2 years ago

True, but there's nothing in the present code to discriminate a CSV table. This would require parsing the table header, and until now we managed to let asciidoc styling stuff out of the way, which kept things simpler.

What would be reachable though is the ability to flip tablecells mode on and off with //po4a: comments.

jnavila commented 2 years ago

Luckily, the option name for deciding the format is format and it is used for images and tables with values that cannot be mixed. So, a simple regex match may work...