hadrielk / string-interpolation

String interpolation proposal paper

Numbered expansion-fields. #6

Open BengtGustafsson opened 1 year ago

BengtGustafsson commented 1 year ago

Even when inserts can contain expressions, there are advantages to allowing numbered fields, since they avoid repeating the same expression. However, this can of course be worked around by storing the result in a temporary variable before doing the actual formatting, or by using the numbered-field facility we already have in std::format.

Note that this has nothing to do with translation. A translated string often needs to duplicate a field value that was not duplicated in the original text, or the other way around. This can all be done without allowing numbered arguments in the untranslated f-string, as long as you abide by the regular rules of std::format: the expressions can be accessed by their ordinals in the original string, provided every field in the translated string is given a number. When the time comes to do the formatting (which surely would use std::runtime_format), std::format will not know that the original string was unnumbered.

hadrielk commented 1 year ago

I think the use-case for index-numbered fields is two-fold:

  1. The "traditional" std::format-style index-number usage, such as this:
     // 'val' displayed in both native and hex formats
     std::format("value={0} ({0:#06x})", val);
  2. International-translation usage such as this:
     std::format(gettext("The {0} is {1:d}"), name, value);

Where the gettext() might return a format-string with a different order due to the language, such as "{1:d} ist der {0}".

Note: In C, the above would use printf-style conversion specifications together with the POSIX "%n$" extension to indicate index-numbered arguments.


So I think, though I am not sure, that if we just supported index-numbering using @ like this, we could make it work:

std::format(X"value={val} ({@0:#06x})");

std::format(T"The {name@0} is {value@1:d}");

BengtGustafsson commented 1 year ago

Quote from cppreference: The arg-ids in a format string must all be present or all be omitted. Mixing manual and automatic indexing is an error.

But maybe this works in fmt too. The committee may have thought mixing was error-prone, or didn't see your first use case. But honestly, I don't think the first use case is that important.

This is of course a limitation that can be removed, as it only makes more format strings legal.

hadrielk commented 1 year ago

Oops, I screwed up that first example - I didn't mean to put val in the braces, I meant to put 0. Fixed now.

BengtGustafsson commented 1 year ago

As for your last example, I don't even think we need to do anything to make this work (except removing the expressions in xgettext or its helpers). The translator can count the inserts in the original string and add numbers to the translation iff a reordering is to be done.

I suspect that this could be the reason that extra arguments are allowed: for instance, a US program may provide a weight in both pounds (arg 0) and kilograms (arg 1), and all the translators except the one in Myanmar will have to put {1} in the translated string to get the weight in SI units. The task of informing translators of what extra arguments are available in each and every string xgettext extracts is left as an exercise by the committee, I'd assume. (That is, I don't think it is a realistic way to solve this type of problem; Python definitely has an edge there.)