ruby-gettext / gettext

Gettext gem is a pure Ruby Localization(L10n) library and tool which is modeled after the GNU gettext package.
https://ruby-gettext.github.io/
69 stars 28 forks source link

Isn't `Np_` missing? #108

Open akimd opened 8 months ago

akimd commented 8 months ago

Hi,

I'm using p_ translations. But I can't find the equivalent for N_ to tag-but-not-translate contextual messages.

Am I missing something?

Sure enough I can define one that just returns the msgid. But the parsers don't look at it.

Thanks in advance!

kou commented 8 months ago

OK. Let's add it!

This may work:

diff --git a/lib/gettext.rb b/lib/gettext.rb
index f8b8563..9d8214c 100644
--- a/lib/gettext.rb
+++ b/lib/gettext.rb
@@ -243,6 +243,11 @@ module GetText
     [msgid, msgid_plural]
   end

+  # TODO
+  def Np_(msgctxt, msgid)
+    [msgctxt, msgid]
+  end
+
   # Sets charset(String) such as "euc-jp", "sjis", "CP932", "utf-8", ...
   # You shouldn't use this in your own Libraries.
   # * charset: an output_charset
diff --git a/lib/gettext/tools/parser/ruby.rb b/lib/gettext/tools/parser/ruby.rb
index 8d563a8..0b3e472 100644
--- a/lib/gettext/tools/parser/ruby.rb
+++ b/lib/gettext/tools/parser/ruby.rb
@@ -21,7 +21,7 @@ module GetText
     class POExtractor < Ripper::Filter
       ID = ["gettext", "_", "N_", "sgettext", "s_"]
       PLURAL_ID = ["ngettext", "n_", "Nn_", "ns_", "nsgettext"]
-      MSGCTXT_ID = ["pgettext", "p_"]
+      MSGCTXT_ID = ["pgettext", "p_", "Np_"]
       MSGCTXT_PLURAL_ID = ["npgettext", "np_"]

       attr_accessor :use_comment
@@ -95,7 +95,7 @@ module GetText

       def process_on_const(token, po)
         case token
-        when "N_", "Nn_"
+        when "N_", "Nn_", "Np_"
           # TODO: Check the next token is :on_lparen
           process_on_ident(token, po)
         else

And we need to add tests for it.

akimd commented 8 months ago

Hi!

Thanks for the fast reaction!

Wrt Np_, I agree it's actually a bit unclear what it should do. I wrote "just returns the msgid", which is wrong of course. You chose to return the couple, which is exact but does not leave a string. I was thinking about "#{msgctxt}\004#{msgid}" to stay consistent wrt types.

But then the question is how to resolve a Np_. Your proposal translates it via p_, mine would require _.

Cheers!

kou commented 8 months ago

Ah, you're right. We should "just return the msgid" because p_ returns the msgid when the msgid isn't localized:

def Np_(msgctxt, msgid)
   msgid
end
akimd commented 8 months ago

Hi,

I took some more time to think about that.

When using s_ (not p_), I definitely use N_ upstream. So it looks like

str = N_("foo|bar")
# str == "foo|bar", *not* "bar"
...
s_(str)

And it is absolutely critical that the context is preserved. I think it should work just the same for p_. So I would expect this to work properly:

str = Np_("foo", "bar")
...
p_(str)

Yes, at the end I'm using p_ with a single argument (explained below), because:

  1. I'm convinced Np_ should return a string
  2. just like we do with s_ it must contain the context
  3. so we should simply pass the result of Np_ to the translating function (just as we pass the result of N_ to s_ or _) I disagree with what I initially stated: the key property of N_ is not that it "returns the msgid", but that it returns the key in the MO file. The key for simple messages is simply msgid, but for contextualized message, the key is composite: msgctxt and msgid.
  4. as a consequence Np_ should return a string with both msgctxt and msgid. Given that contextualized messages are internally converted into "#{msgctxt}\004#{msgid}", I strongly believe that Np_ should just do that.
  5. we need a function to handle that contextualized string There are then two cases that must work:
    • the message is translated and part of the catalogue In that case, simply using _(s) would work
    • it is not In that case, _(s) would show the context, which is wrong.

So I conclude that we need another function, that behaves like this:

def pp_(s) = p_(*s.split("\004", 2))

Instead of splitting, if we can query the catalogue, it could be something like

def pp_(s) = get_msgstr(s) || s.split("\004", 2)[1]

where get_msgstr is a hypothetical function that (i) returns the translation if it exists, (ii) returns nil otherwise. Or use translate_singular_message.

Finally, instead of naming this function pp_, we could use p_, but add a new behavior when it's called with a single argument.

Do I make sense?

Cheers!

kou commented 7 months ago

Thanks for considering API!

How about using [msgctxt, msgid] instead of "#{msgctxt}\004#{msgid}"?

def Np_(msgctxt, msgid)
   [msgctxt, msgid]
end

def p_(msgctxt, msgid=nil)
  msgctxt, msgid = msgctxt if msgctxt.is_a?(Array) and msgid.nil?
  TextDomainManager.translate_singular_message(self, msgid, seperator)
end

We can write the following with this API:

message = Np_("foo", "bar")
...
p_(message)