python-babel / babel

The official repository for Babel, the Python Internationalization Library
http://babel.pocoo.org/
BSD 3-Clause "New" or "Revised" License

Babel can't parse PO files created by Django (blank Language in header) #1087

Open bemoody opened 1 month ago

bemoody commented 1 month ago

Django's makemessages command (Django==5.0.6) can be used to extract messages and generate PO files.

A PO file generated by this command looks like this:

# SOME DESCRIPTIVE TITLE.
# Copyright (C) YEAR THE PACKAGE'S COPYRIGHT HOLDER
# This file is distributed under the same license as the PACKAGE package.
# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2024-05-24 12:02-0400\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"Language: \n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=2; plural=(n != 1);\n"
#: example/views.py:5
msgid "Hello world"
msgstr ""

However, trying to parse this file using Babel (Babel==2.15.0), with babel.messages.pofile.read_po, gives the error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/benjamin/ve1/lib/python3.11/site-packages/babel/messages/pofile.py", line 387, in read_po
    parser.parse(fileobj)
  File "/home/benjamin/ve1/lib/python3.11/site-packages/babel/messages/pofile.py", line 310, in parse
    self._process_comment(line)
  File "/home/benjamin/ve1/lib/python3.11/site-packages/babel/messages/pofile.py", line 269, in _process_comment
    self._finish_current_message()
  File "/home/benjamin/ve1/lib/python3.11/site-packages/babel/messages/pofile.py", line 206, in _finish_current_message
    self._add_message()
  File "/home/benjamin/ve1/lib/python3.11/site-packages/babel/messages/pofile.py", line 200, in _add_message
    self.catalog[msgid] = message
    ~~~~~~~~~~~~^^^^^^^
  File "/home/benjamin/ve1/lib/python3.11/site-packages/babel/messages/catalog.py", line 682, in __setitem__
    self.mime_headers = message_from_string(message.string).items()
    ^^^^^^^^^^^^^^^^^
  File "/home/benjamin/ve1/lib/python3.11/site-packages/babel/messages/catalog.py", line 482, in _set_mime_headers
    self._set_locale(value)
  File "/home/benjamin/ve1/lib/python3.11/site-packages/babel/messages/catalog.py", line 365, in _set_locale
    self._locale = Locale.parse(locale)
                   ^^^^^^^^^^^^^^^^^^^^
  File "/home/benjamin/ve1/lib/python3.11/site-packages/babel/core.py", line 330, in parse
    parts = parse_locale(identifier, sep=sep)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/benjamin/ve1/lib/python3.11/site-packages/babel/core.py", line 1242, in parse_locale
    raise ValueError(f"expected only letters, got {lang!r}")
ValueError: expected only letters, got ''

Adding abort_invalid=False to read_po doesn't help.

The cause is the header line "Language: \n". As the traceback shows, its empty value is passed to Locale.parse, which rejects it.
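The traceback shows the header block going through message_from_string before Locale.parse is called. Here is a minimal standard-library sketch of what that blank line turns into. It assumes the shortened header text below is a fair stand-in for the unescaped msgstr, and it does not reproduce any of Babel's own handling:

```python
from email import message_from_string

# A shortened, unescaped copy of the header block from the PO file above
# (assumption: this matches what Babel hands to message_from_string).
header_text = (
    "Project-Id-Version: PACKAGE VERSION\n"
    "Language: \n"
    "MIME-Version: 1.0\n"
)

# Same stdlib call that appears in the traceback (catalog.py, __setitem__).
headers = dict(message_from_string(header_text).items())

language = headers["Language"]
print(repr(language))  # the header is present, but its value is an empty string
```

So the header is not missing. It is present with an empty value, and that empty string is what ends up being treated as a locale identifier.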

The same error occurs when calling babel.messages.mofile.read_mo on the compiled version of this PO file.

The error doesn't occur if you remove the Language header completely.
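Until this is handled in Babel itself, one possible stopgap for PO files is to drop the blank header line before parsing. This is only a sketch: drop_blank_language is a hypothetical helper, not part of Babel or Django, and it assumes the header appears on a single line exactly as Django writes it:

```python
import re

# Matches a PO header line whose Language value is empty, e.g. "Language: \n"
# (the \n here is the literal backslash-n escape inside the PO string).
_BLANK_LANGUAGE = re.compile(r'^"Language:\s*\\n"\s*$')


def drop_blank_language(po_text: str) -> str:
    """Return po_text without a blank Language header line (hypothetical helper)."""
    kept = [
        line
        for line in po_text.splitlines(keepends=True)
        if not _BLANK_LANGUAGE.match(line)
    ]
    return "".join(kept)
```

The cleaned text can then be wrapped in io.StringIO and passed to read_po, on the assumption that read_po accepts a text file object. This does nothing for already-compiled MO files, which would have to be recompiled from the cleaned PO.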

Django should perhaps avoid generating junk boilerplate like this. Either way, Babel should be able to cope with PO or MO files that contain invalid MIME header values.