mquinson / po4a

Maintain the translations of your documentation with ease (PO for anything)
http://po4a.org/
GNU General Public License v2.0
127 stars 62 forks source link

Add quotes around spaced values when parsing config options #455

Closed gemmaro closed 7 months ago

gemmaro commented 9 months ago

Hello,

This currently adds a test case about an option value with spaces, and I haven't found a fix yet. The case fails with 0.70-alpha but not with 0.69.

Suppose the following config file is given:

[options] --package-name "Hello world" \
  --copyright-holder "Goodbye Mars"

[po4a_langs] ja
[po4a_paths] man.pot ja:man.ja.po
[type:man] man.1 ja:man.ja.1

Then running po4a with --debug flag says that it parses the options as follows:

Options for man.1 in ja: --package-name Hello world --copyright-holder Goodbye Mars

... while the expected option parsing is like this, as in v0.69:

Options for man.1 in ja: --package-name 'Hello world' --copyright-holder 'Goodbye Mars'

The output of the test looks like this:

./Build test --verbose --test_files t/cfg-option.t ```text t/cfg-option.t .. # Subtest: space in option value (dstdir) ok 1 - Change directory to cfg/space-in-option-value ok 2 - Executing po4a ok 3 - Command: perl BUILDPATH/po4a -f BUILDPATH/t/cfg/space-in-option-value/po4a.conf --destdir BUILDPATH/t/tmp/cfg/space-in-option-value --verbose > BUILDPATH/t/tmp/cfg/space-in-option-value/output 2>&1 ok 4 - Change directory back to /home/gemmaro/src/po4a/t ok 5 - Comparing output of po4a ok 6 - Command: sed -e 's|BUILDPATH/t/||' -e 's|tmp/cfg/space-in-option-value/||' -e 's|cfg/space-in-option-value/||' tmp/cfg/space-in-option-value/output | diff -u cfg/space-in-option-value/_output - 2>&1 > tmp/cfg/space-in-option-value/diff_output # Expecting 2 output files: man.pot:man.ja.po # Seen file tmp/cfg/space-in-option-value/diff_output # Seen file tmp/cfg/space-in-option-value/man.ja.1 not ok 7 - Unexpected file 'man.ja.1' # Failed test 'Unexpected file 'man.ja.1'' # at t/Testhelper.pm line 311. # Seen file tmp/cfg/space-in-option-value/man.ja.po # Seen file tmp/cfg/space-in-option-value/man.pot # Seen file tmp/cfg/space-in-option-value/output ok 8 - diff -u --strip-trailing-cr -I'^"Project-Id-Version:' -I'^"POT-Creation-Date:' -I'^"PO-Revision-Date:' -I'^\# [^[:blank:]]* translations for ' -I'^\# Language [^[:blank:]]* translations for ' -I'Copyright (C) 20.. Free Software Foundation, Inc.' -I'^\# This file is distributed under the same license as the' -I'^\# Automatically generated, 20...' -I\#: "cfg/space-in-option-value/man.ja.po" "tmp/cfg/space-in-option-value/man.ja.po" 2>&1 > tmp/cfg/space-in-option-value/_cmd_output not ok 9 - Provided command (retcode: 256) # Failed test 'Provided command (retcode: 256)' # at t/Testhelper.pm line 141. # Expected retcode: 0 # FAILED command: diff -u --strip-trailing-cr -I'^"Project-Id-Version:' -I'^"POT-Creation-Date:' -I'^"PO-Revision-Date:' -I'^# [^[:blank:]]* translations for ' -I'^# Language [^[:blank:]]* translations for ' -I'Copyright (C) 20.. Free Software Foundation, Inc.' -I'^# This file is distributed under the same license as the' -I'^# Automatically generated, 20...' -I#: "cfg/space-in-option-value/man.pot" "tmp/cfg/space-in-option-value/man.pot" 2>&1 > tmp/cfg/space-in-option-value/_cmd_output # Command output: # | --- cfg/space-in-option-value/man.pot 2024-01-05 19:03:28.461584428 +0900 # | +++ tmp/cfg/space-in-option-value/man.pot 2024-01-05 19:04:14.331987315 +0900 # | @@ -1,6 +1,6 @@ # | # SOME DESCRIPTIVE TITLE # | -# Copyright (C) YEAR Goodbye Mars # | -# This file is distributed under the same license as the Hello world package. # | +# Copyright (C) YEAR Goodbye # | +# This file is distributed under the same license as the Hello package. # | # FIRST AUTHOR , YEAR. # | # # | #, fuzzy # (end of output) # Produced file tmp/cfg/space-in-option-value/_cmd_output: # --- cfg/space-in-option-value/man.pot 2024-01-05 19:03:28.461584428 +0900 # +++ tmp/cfg/space-in-option-value/man.pot 2024-01-05 19:04:14.331987315 +0900 # @@ -1,6 +1,6 @@ # # SOME DESCRIPTIVE TITLE # -# Copyright (C) YEAR Goodbye Mars # -# This file is distributed under the same license as the Hello world package. # +# Copyright (C) YEAR Goodbye # +# This file is distributed under the same license as the Hello package. # # FIRST AUTHOR , YEAR. # # # #, fuzzy # (end of tmp/cfg/space-in-option-value/_cmd_output) # ------------------------------------------------------------- # Produced file tmp/cfg/space-in-option-value/man.ja.1: # .\"******************************************************************* # .\" # .\" This file was generated with po4a. Translate the source file. # .\" # .\"******************************************************************* # .TH 検査 1 # (end of tmp/cfg/space-in-option-value/man.ja.1) # ------------------------------------------------------------- # Produced file tmp/cfg/space-in-option-value/man.ja.po: # # Japanese translations for Hello world package # # Copyright (C) 2024 Goodbye Mars # # This file is distributed under the same license as the Hello world package. # # Automatically generated, 2024. # # # msgid "" # msgstr "" # "Project-Id-Version: Hello VERSION\n" # "POT-Creation-Date: 2024-01-05 19:04+0900\n" # "PO-Revision-Date: 2024-01-05 08:54+0900\n" # "Last-Translator: Automatically generated\n" # "Language-Team: none\n" # "Language: ja\n" # "MIME-Version: 1.0\n" # "Content-Type: text/plain; charset=UTF-8\n" # "Content-Transfer-Encoding: 8bit\n" # "Plural-Forms: nplurals=1; plural=0;\n" # # #. type: TH # #: man.1:1 # #, no-wrap # msgid "test" # msgstr "検査" # (end of tmp/cfg/space-in-option-value/man.ja.po) # ------------------------------------------------------------- # Produced file tmp/cfg/space-in-option-value/man.pot: # # SOME DESCRIPTIVE TITLE # # Copyright (C) YEAR Goodbye # # This file is distributed under the same license as the Hello package. # # FIRST AUTHOR , YEAR. # # # #, fuzzy # msgid "" # msgstr "" # "Project-Id-Version: Hello VERSION\n" # "POT-Creation-Date: 2024-01-05 19:04+0900\n" # "PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n" # "Last-Translator: FULL NAME \n" # "Language-Team: LANGUAGE \n" # "Language: \n" # "MIME-Version: 1.0\n" # "Content-Type: text/plain; charset=UTF-8\n" # "Content-Transfer-Encoding: 8bit\n" # # #. type: TH # #: man.1:1 # #, no-wrap # msgid "test" # msgstr "" # (end of tmp/cfg/space-in-option-value/man.pot) # ------------------------------------------------------------- # Produced file tmp/cfg/space-in-option-value/output: # Updating man.pot: (1 entries) # Updating man.ja.po: 1 translated message. # man.ja.1 is 100% translated (1 strings). # (end of tmp/cfg/space-in-option-value/output) # ------------------------------------------------------------- 1..9 # Looks like you failed 2 tests of 9. not ok 1 - space in option value (dstdir) # Failed test 'space in option value (dstdir)' # at t/Testhelper.pm line 563. 1..1 # Looks like you failed 1 test of 1. Dubious, test returned 1 (wstat 256, 0x100) Failed 1/1 subtests Test Summary Report ------------------- t/cfg-option.t (Wstat: 256 (exited 1) Tests: 1 Failed: 1) Failed test: 1 Non-zero exit status: 1 Files=1, Tests=1, 0 wallclock secs ( 0.02 usr 0.01 sys + 0.31 cusr 0.07 csys = 0.41 CPU) Result: FAIL Failed 1/1 test programs. 1/1 subtests failed. ```

This may be related to https://github.com/mquinson/po4a-website/issues/15.

Thank you,

mquinson commented 9 months ago

Hello @gemmaro, you want me to deal with this PR as if it were a bug report, or you are currently working on it?

gemmaro commented 9 months ago

Hello @mquinson, I'm working on this, but I would appreciate it if you would take over when I'm away too much.

gemmaro commented 9 months ago

Here are some notes on what I have learned so far.

Each line of the po4a configuration file is split by split_line. This splitting correctly interprets quotes correctly. For example, "--package 'Hello world'" is split into ("--package-name", "Hello world").

The parse_config_options function will combine option strings. For example, ("--package-name", "Hello world") becomes "--package-name Hello world".

Then, at the line with get_options( split_opts( $document{''}{"options"}{"global"} ), @ORIGINAL_ARGV ), the split_opts function splits into ARGV and the get_options function interpret them with the GetOptions function from the Getopt::Long module. Unfortunately, for example, "--package-name Hello world" would result in the "--package-name"'s argument being "Hello" only.

This is why the problem occurs. I'm trying to figure out how to fix it...

There are two things I can think of right now. One is to prevent the split_line function from removing the quotes. The other is to change the definition of the parse_config_options function so that it is stored as an array and the appropriate ARGV are passed to the get_options function, instead of being concatenated into a string.

gemmaro commented 8 months ago

I added an update and is now ready for review.

Regarding the previous comment, I chose a similar approach to the former one (but not split_line). In the parse_config_options function, the values that isn't prefixed with "opt:" and contain spaces are enclosed in quotes.

There is still a limitation with values starting with "opt:", but I think it is unavoidable because single and double quotes cannot be mixed at the moment.

mquinson commented 7 months ago

Hello @gemmaro, sorry, I forgot about this PR before taking a bite at the other bug. It is now possible to mix simple and double quotes. Does it change something to your code?

mquinson commented 7 months ago

Actually, I merge it because it's good already. If you want to do something for the prefixed chunks now that we can mix simple and double quotes, just do another PR. I cannot do that because I'm not sure I fully understand the issue you are refering to. But it's late already, maybe I didn't try hard enough.

gemmaro commented 7 months ago

Hello @mquinson. Thank you for merging this and dealing with the mixed quotes. I'll try the latest version and report back if anything happens.