collective / i18ndude

i18ndude performs various tasks related to ZPT's, Python Scripts and i18n.
https://pypi.org/project/i18ndude
4 stars 9 forks source link

Use lxml in find-untranslated #30

Closed mauritsvanrees closed 8 years ago

mauritsvanrees commented 8 years ago

The core still uses xml.sax, but lxml is used to cleanup the content so xml.sax has a better chance of not crashing on perceived syntax errors.

mauritsvanrees commented 8 years ago

@ale-rt @vincentfretin @gforcada Comments and testing welcome for this pull request and number #31. Seems fine when using this in Plone core, solving several problems.

ale-rt commented 8 years ago

I did this on my core dev buildout

[ale@kenobi src]$ i18ndude find-untranslated  `find . -name "*.pt"` > /dev/null
Traceback (most recent call last):
  File "/opt/utils/i18ndude/bin/i18ndude", line 9, in <module>
    load_entry_point('i18ndude==3.4.6.dev0', 'console_scripts', 'i18ndude')()
  File "/opt/utils/i18ndude/local/lib/python2.7/site-packages/i18ndude/script.py", line 691, in main
    errors = arguments.func(arguments)
  File "/opt/utils/i18ndude/local/lib/python2.7/site-packages/i18ndude/script.py", line 164, in find_untranslated
    content = StringIO(content.strip())
AttributeError: 'NoneType' object has no attribute 'strip'

It turns out that i18ndude fails parsing this file:

[ale@kenobi src]$ i18ndude find-untranslated ./Products.CMFEditions/Products/CMFEditions/www/talesModifierAddForm.pt
Traceback (most recent call last):
  File "/opt/utils/i18ndude/bin/i18ndude", line 9, in <module>
    load_entry_point('i18ndude==3.4.6.dev0', 'console_scripts', 'i18ndude')()
  File "/opt/utils/i18ndude/local/lib/python2.7/site-packages/i18ndude/script.py", line 691, in main
    errors = arguments.func(arguments)
  File "/opt/utils/i18ndude/local/lib/python2.7/site-packages/i18ndude/script.py", line 164, in find_untranslated
    content = StringIO(content.strip())
AttributeError: 'NoneType' object has no attribute 'strip'

The variable content in this line:

For sure we can protect i18ndude from this error, but it turns out that the real problem is that the file is empty...:

[ale@kenobi src]$ du ./Products.CMFEditions/Products/CMFEditions/www/talesModifierAddForm.pt
0       ./Products.CMFEditions/Products/CMFEditions/www/talesModifierAddForm.pt

See:

mauritsvanrees commented 8 years ago

I liberally reworked, rebased and squashed the pull request. We try parsing the file in various ways now. The common.prepare_xml function that I initially removed, is back again as a last resort.

When I try it in coredev I now get only three fatal errors, one ironically in i18ndude itself:

$ ./src/i18ndude/bin/i18ndude find-untranslated -s src | grep -i Fatal
Fatal error in document src/i18ndude/src/i18ndude/tests/input/test1.pt
Fatal error in document src/plone.app.contenttypes/plone/app/contenttypes/browser/templates/image_view_fullscreen.pt
Fatal error in document src/Products.CMFPlone/Products/CMFPlone/browser/templates/plone-addsite.pt

Zooming in for good measure:

$ ./src/i18ndude/bin/i18ndude find-untranslated src/i18ndude/src/i18ndude/tests/input/test1.pt
src/i18ndude/src/i18ndude/tests/input/test1.pt:59:15:
-FATAL- - ERRORs found trying to parse document in various ways:
<unknown>:2:43: duplicate attribute
<unknown>:59:15: not well-formed (invalid token)
Namespace prefix metal for use-macro on html is not defined, line 5, column 28

===============================================================================
$ ./src/i18ndude/bin/i18ndude find-untranslated src/plone.app.contenttypes/plone/app/contenttypes/browser/templates/image_view_fullscreen.pt
src/plone.app.contenttypes/plone/app/contenttypes/browser/templates/image_view_fullscreen.pt:44:2:
-FATAL- - ERRORs found trying to parse document in various ways:
<unknown>:2:43: duplicate attribute
<unknown>:44:2: mismatched tag
Opening and ending tag mismatch: meta line 10 and head, line 44, column 8

===============================================================================
$ ./src/i18ndude/bin/i18ndude find-untranslated src/Products.CMFPlone/Products/CMFPlone/browser/templates/plone-addsite.pt
src/Products.CMFPlone/Products/CMFPlone/browser/templates/plone-addsite.pt:24:16:
-ERROR- - alt attribute of <img> lacks i18n:attributes
-------------------------------------------------------------------------------
src/Products.CMFPlone/Products/CMFPlone/browser/templates/plone-addsite.pt:212:4:
-FATAL- - ERRORs found trying to parse document in various ways:
<unknown>:2:43: duplicate attribute
<unknown>:212:4: mismatched tag
Opening and ending tag mismatch: div line 23 and body, line 212, column 10

===============================================================================

Note that the reported errors are gathered from all the tries

Fixed the first one in our own template by using html (5) as doctype in 060e3a0ba5c4458559d5b4423076e9ba9093acb0 Fixed the second one in the same way in https://github.com/plone/plone.app.contenttypes/commit/ceb3ba992413797f4826f22312add1fccf4ecd36 Fixed the third one by fixing the missing div tag in https://github.com/plone/Products.CMFPlone/commit/00c68b21c5ecd5dd55c4f1f52fb4b12e8bcea98f

So: no fatal errors anymore! :-)

vincentfretin commented 8 years ago

Thanks @mauritsvanrees !

gforcada commented 8 years ago

:+1: