edgewall / genshi

Python toolkit for generation of output for the web
http://genshi.edgewall.org
Other
86 stars 35 forks source link

Skip empty attributes to translate. #38

Closed jun66j5 closed 3 years ago

jun66j5 commented 3 years ago

I encountered that Genshi generates placeholder with *.po header for empty placeholder like this:

<input type="text" placeholder="" />

to:

<input type="text" placeholder="Project-Id-Version: ..." />

Another example of empty alt attribute in https://trac.edgewall.org/ticket/10108#comment:2.

I think we should skip empty attributes to translate.

FelixSchwarz commented 3 years ago

I think we should skip empty attributes to translate.

Sounds sensible to me.

FelixSchwarz commented 3 years ago

I briefly looked at the code (though I don't pretend to fully understand the translation mechanics) and it looks to me as if your patch just prevents the translated value to be inserted in the final HTML, right? So the empty string will be extracted but the translation won't be inserted?

My idea was that empty strings might be skipped during extraction (not sure if that is feasible in Genshi currently).

jun66j5 commented 3 years ago

I briefly looked at the code (though I don't pretend to fully understand the translation mechanics) and it looks to me as if your patch just prevents the translated value to be inserted in the final HTML, right?

Yes.

So the empty string will be extracted but the translation won't be inserted?

Yes. In Genshi 0.7.5, the empty string is not extracted however it is translated. After the patch, the empty string is not extracted nor translated.

My idea was that empty strings might be skipped during extraction (not sure if that is feasible in Genshi currently).

I temporarily added the following test, but it passes without my patch:

+    def test_extract_included_empty_attribute_text(self):
+        tmpl = MarkupTemplate(u"""<html>
+          <span title="">...</span>
+        </html>""")
+        translator = Translator()
+        messages = list(translator.extract(tmpl.stream))
+        self.assertEqual([], messages)
+

Investigating the source, Genshi (0.7.5) skips empty attributes at https://github.com/edgewall/genshi/blob/0.7.5/genshi/filters/i18n.py#L918-L921.

Also, I noticed spaces in string are stripped during the extraction. I think we should strip spaces in string during the translation, too.

jun66j5 commented 3 years ago

Root cause is that getext() function translates empty string to *.po header information. This behavior is by gettext's design. See https://stackoverflow.com/a/7659746.

For example, alt attribute of img element is often leave with empty string. Genshi with i18n feature translates such an empty attribute to with .po header information. The behavior confuses a user and it assume the user doesn't want alt attribute with .po header information (at least to me).

Workaround is preventing the translation by using expression in attribute, e.g. alt="${''}".

hodgestar commented 3 years ago

@jun66j5 Thank you for the patch!