oomsveta closed this 1 month ago
I also wrote a quick-and-dirty fix:
diff --git a/babel/messages/extract.py b/babel/messages/extract.py
index 48573ee..182a74b 100644
--- a/babel/messages/extract.py
+++ b/babel/messages/extract.py
@@ -502,6 +502,7 @@ def extract_python(
     :param options: a dictionary of additional options (optional)
     :rtype: ``iterator``
     """
+    last_name_token = ""
     funcname = lineno = message_lineno = None
     stack_depth = -1
     buf = []
@@ -521,6 +522,8 @@ def extract_python(
     current_fstring_start = None
     for tok, value, (lineno, _), _, _ in tokens:
+        if tok == NAME:
+            last_name_token = value
         if stack_depth == -1 and tok == NAME and value in ('def', 'class'):
             in_def = True
         elif tok == OP and value == '(':
@@ -530,7 +533,8 @@ def extract_python(
                 in_def = False
                 continue
             if funcname:
-                message_lineno = lineno
+                if last_name_token in keywords:
+                    message_lineno = lineno
                 stack_depth += 1
         elif in_def and tok == OP and value == ':':
             # End of a class definition without parens
diff --git a/tests/messages/test_extract.py b/tests/messages/test_extract.py
index 7d3a05a..852cdea 100644
--- a/tests/messages/test_extract.py
+++ b/tests/messages/test_extract.py
@@ -97,10 +97,10 @@ add_notice(req, ngettext("Bar deleted.",
         messages = list(extract.extract_python(buf, ('ngettext', '_'), ['NOTE:'],
                                                {'strip_comment_tags': False}))
-        assert messages[0] == (3, 'ngettext', ('Catalog deleted.', 'Catalogs deleted.', None), ['NOTE: This Comment SHOULD Be Extracted'])
+        assert messages[0] == (2, 'ngettext', ('Catalog deleted.', 'Catalogs deleted.', None), ['NOTE: This Comment SHOULD Be Extracted'])
         assert messages[1] == (6, '_', 'Locale deleted.', ['NOTE: This Comment SHOULD Be Extracted'])
         assert messages[2] == (10, 'ngettext', ('Foo deleted.', 'Foos deleted.', None), ['NOTE: This Comment SHOULD Be Extracted'])
-        assert messages[3] == (15, 'ngettext', ('Bar deleted.', 'Bars deleted.', None), ['NOTE: This Comment SHOULD Be Extracted', 'NOTE: And This One Too'])
+        assert messages[3] == (14, 'ngettext', ('Bar deleted.', 'Bars deleted.', None), ['NOTE: This Comment SHOULD Be Extracted', 'NOTE: And This One Too'])
     def test_declarations(self):
         buf = BytesIO(b"""\
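To make the idea behind the patch easier to see in isolation, here is a small standalone sketch using only the standard `tokenize` module. It is not Babel's code: `paren_owners`, `SOURCE`, and `KEYWORDS` are hypothetical names, and the sample source is only a guess at the kind of call that triggers the problem. It records, for every opening parenthesis, the most recent NAME token, which is the value the patch compares against `keywords`.

```python
import io
import tokenize


def paren_owners(source):
    """Return (lineno, last_name_token) for every "(" in ``source``.

    Hypothetical helper mirroring the patch's bookkeeping: remember the
    most recent NAME token and look at it whenever "(" is seen.
    """
    last_name_token = ""
    owners = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME:
            last_name_token = tok.string
        elif tok.type == tokenize.OP and tok.string == "(":
            owners.append((tok.start[0], last_name_token))
    return owners


# Assumed example: a gettext call whose arguments span several lines
# and include a nested function call.
SOURCE = '_(\n    "hello",\n    dummy_fn_call(),\n)\n'
KEYWORDS = ("_", "gettext")

owners = paren_owners(SOURCE)
# owners == [(1, "_"), (3, "dummy_fn_call")]

# Only parentheses whose last NAME is a keyword would update message_lineno.
updates = [lineno for lineno, name in owners if name in KEYWORDS]
# updates == [1]
```

One thing this sketch suggests, if I read the patch correctly: the check looks at the last NAME token, not necessarily the token right before the parenthesis. A parenthesised argument such as a tuple on a later line still has the keyword as its last NAME, so it would pass the check too. That may be acceptable for a quick fix, but it is worth keeping in mind.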
Thanks for the investigation!
Just for reference, GNU's gettext sets the line number to 2:
> xgettext hello.py
> cat messages.po
...
#: line.py:2
msgid "hello"
msgstr ""
I think it makes sense to try to be consistent with xgettext here, i.e. use lineno=2. Would you like to open a PR?
> I think it makes sense to try to be consistent with xgettext here, i.e. use lineno=2.
@tomasr8 Agreed. Besides being consistent with xgettext, it makes more sense to use the number of the line with the actual message. It might be a bit tricky to implement, though: I imagine it'll require pairing each message with its lineno, which would change the current API and break the custom extract methods. What do you think?
> Would you like to open a PR?
I don't mind giving it a try, but I'm not sure I'll be able to free up enough time to look into it at the moment.
> I imagine it'll require pairing each message with its lineno, which would change the current API and break the custom extract methods.
How would this break custom extractors?
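For reference, the xgettext-style behaviour discussed above (reporting the line of the message itself) could look roughly like the sketch below. This is only an illustration built on the standard `tokenize` module. It is neither Babel's extractor nor xgettext's algorithm, `first_string_linenos` is a hypothetical name, and it assumes the message is the first string literal inside the call.

```python
import io
import tokenize


def first_string_linenos(source, keywords=("_", "gettext")):
    """Return (lineno, funcname) using the line of the first string
    literal inside each top-level keyword call (hypothetical helper)."""
    results = []
    prev = None        # previous token, ignoring NL and comments
    funcname = None    # keyword call we are currently inside, if any
    depth = 0          # parenthesis depth inside that call
    recorded = False   # whether this call's first string was seen
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in (tokenize.NL, tokenize.COMMENT):
            continue
        is_op = tok.type == tokenize.OP
        if funcname is None:
            # A keyword call starts at "(" directly preceded by a keyword.
            if (is_op and tok.string == "(" and prev is not None
                    and prev.type == tokenize.NAME
                    and prev.string in keywords):
                funcname, depth, recorded = prev.string, 1, False
        elif is_op and tok.string == "(":
            depth += 1
        elif is_op and tok.string == ")":
            depth -= 1
            if depth == 0:
                funcname = None
        elif tok.type == tokenize.STRING and not recorded:
            # The line of the message itself, not of any parenthesis.
            results.append((tok.start[0], funcname))
            recorded = True
        prev = tok
    return results


# Assumed example, same shape as the call discussed in this issue.
SOURCE = '_(\n    "hello",\n    dummy_fn_call(),\n)\n'
print(first_string_linenos(SOURCE))  # [(2, '_')]
```

Here the line number comes from the string token, so nested calls in the arguments no longer affect it. Whether the real extractor can do this without changing what it yields is the open question above.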
Overview
Consider this piece of code:
Extracting it yields the following result:
As you may have noticed, the lineno (the first component of the tuple) is wrong. It should be either 1 or 2 (depending on whether you expect the lineno of gettext or the lineno of the value), but not 3.
This is because the code that sets message_lineno runs every time an opening parenthesis is encountered while funcname is set: https://github.com/python-babel/babel/blob/f91754b01cb9f32b83aeaa80b74ed10b5dfccb6a/babel/messages/extract.py#L526-L533 whereas funcname only changes when the name is in keywords (i.e., when the name is one of the gettext names): https://github.com/python-babel/babel/blob/f91754b01cb9f32b83aeaa80b74ed10b5dfccb6a/babel/messages/extract.py#L615-L616
So when a gettext call (which sets funcname) has another function call among its arguments (which produces an opening parenthesis), both conditions are met and message_lineno is overwritten with a value that may be wrong, for example when the arguments to gettext are spread over multiple lines.
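The mechanism can be reproduced outside Babel with a stripped-down model of that loop. The sketch below uses only the standard `tokenize` module. `buggy_linenos` is a hypothetical function that keeps just the line-number bookkeeping described above, not Babel's actual implementation, and both sample sources are assumptions about the shape of the problematic call.

```python
import io
import tokenize
from tokenize import NAME, OP


def buggy_linenos(source, keywords=("_", "gettext")):
    """Simplified model of the line tracking described above."""
    funcname = message_lineno = None
    depth = -1  # -1 means "not inside a keyword call"
    results = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        kind, value, lineno = tok.type, tok.string, tok.start[0]
        if kind == OP and value == "(":
            if funcname:
                # Runs for every "(" while funcname is set, including
                # the "(" of a nested call on a later line.
                message_lineno = lineno
                depth += 1
        elif kind == OP and value == ")" and funcname and depth >= 0:
            if depth == 0:
                # Closing parenthesis of the keyword call itself.
                results.append((message_lineno, funcname))
                funcname = message_lineno = None
            depth -= 1
        elif funcname and depth == -1:
            # The keyword was not followed by "(": not a call.
            funcname = None
        elif kind == NAME and depth == -1 and value in keywords:
            funcname = value
    return results


WITH_CALL = '_(\n    "hello",\n    dummy_fn_call(),\n)\n'
WITHOUT_CALL = '_(\n    "hello",\n    some_name,\n)\n'

print(buggy_linenos(WITH_CALL))     # [(3, '_')]
print(buggy_linenos(WITHOUT_CALL))  # [(1, '_')]
```

The only difference between the two inputs is the nested call on line 3, and that alone moves the reported line from 1 to 3.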
Steps to Reproduce
Copy and paste this into your Python REPL:
Actual Results
Expected Results
or maybe
depending on whether you expect the lineno of gettext or the lineno of the value
Additional Information
The first and last test cases of test_comments_with_calls_that_spawn_multiple_lines should fail, assuming you decide to consider the lineno to be the lineno of the gettext call. The test currently fails if you replace the call to len in the test cases with something that isn't a function call.