jaraco / inflect

Correctly generate plurals, ordinals, indefinite articles; convert numbers to words
https://pypi.org/project/inflect
MIT License

`engine.plural` raises unexpected `IndexError` #172

Open Harmon758 opened 1 year ago

Harmon758 commented 1 year ago

This might be considered invalid input, but engine.plural accepts multiple words. When doing so, it can raise an unexpected IndexError:

>>> import inflect
>>> engine = inflect.engine()
>>> engine.plural("I'm only here for a minute, John.")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pydantic\decorator.py", line 40, in pydantic.decorator.validate_arguments.validate.wrapper_function
  File "pydantic\decorator.py", line 134, in pydantic.decorator.ValidatedFunction.call
  File "pydantic\decorator.py", line 206, in pydantic.decorator.ValidatedFunction.execute
  File "C:\Program Files\Python39\lib\site-packages\inflect\__init__.py", line 2403, in plural
    plural = self.postprocess(
  File "C:\Program Files\Python39\lib\site-packages\inflect\__init__.py", line 2375, in postprocess
    result[index] = result[index].capitalize()
IndexError: list index out of range

This seems to be because engine.postprocess expects the inflected argument to contain the same number of words as the orig argument, but that's not necessarily the case.
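The mismatch described above can be reproduced in isolation. The following is a minimal sketch, not the library's actual code: `restore_capitalization` is a hypothetical, simplified stand-in for a postprocess-style step, assuming it copies capitalization word by word from the original text onto the inflected text, as the `result[index] = result[index].capitalize()` line in the traceback suggests.

```python
def restore_capitalization(orig: str, inflected: str) -> str:
    """Hypothetical, simplified stand-in for a postprocess-style step."""
    result = inflected.split(" ")
    for index, word in enumerate(orig.split(" ")):
        # Copy capitalization from the original word onto the word at the
        # same position in the inflected text.
        if word.capitalize() == word:
            # This lookup fails when inflected has fewer words than orig.
            result[index] = result[index].capitalize()
    return " ".join(result)


# Same word count on both sides: works as intended.
print(restore_capitalization("Apple pie", "apple pies"))

# orig has four words, inflected only three: the capitalized fourth word
# ("John.") has no counterpart, so indexing result raises IndexError.
try:
    restore_capitalization("for a minute, John.", "for a minutes")
except IndexError as exc:
    print("IndexError:", exc)
```

The inputs here are made up for illustration; what the real engine produces for a given phrase may differ.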

VeenaPulicharla commented 1 year ago

I'm facing the same issue, and its behavior is strange.

[Screenshot 2023-01-18 at 4 46 54 PM]

jaraco commented 1 year ago

That seems like invalid input to me. What would you expect when pluralizing a phrase?

VeenaPulicharla commented 1 year ago

@jaraco, I wouldn't expect it to break. Could you please explain why it does not raise the error in the first run but does in the second? There is not much difference in the input except spaces.

jaraco commented 1 year ago

Could you please explain why it does not raise the error in the first run but does in the second? There is not much difference in the input except spaces.

I'm not familiar enough with the implementation to know what's expected here. I would need to trace the inputs through the code and try to infer what the code is trying to do. Harmon has determined that the number of words sometimes differs between the original and the inflected text passed to postprocess. There may be a bug there, but I'm not confident that this use case should be supported at all.

I wouldn't expect it to break

I think I would expect it to break, if it can't provide a reasonable response. Perhaps there should be a check and if the input is something unexpected (like pipes or long phrases), it should just raise a ValueError early.

Regardless, this project is community-supported, so I'll leave this one for someone to investigate and propose a solution.
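
The early check suggested above could be sketched roughly as follows. This is only an illustration of the idea, not a proposed patch: `check_plural_input` and its `max_words` threshold are hypothetical names and values, and what counts as a "long phrase" is an assumption that would need to be decided by the project.

```python
def check_plural_input(text: str, max_words: int = 3) -> str:
    """Hypothetical guard: reject input unlikely to pluralize sensibly."""
    # Pipes are used as an example of unexpected input in the discussion.
    if "|" in text:
        raise ValueError("input must not contain '|'")
    # The word limit is an assumed threshold, chosen only for illustration.
    word_count = len(text.split())
    if word_count > max_words:
        raise ValueError(
            f"expected at most {max_words} words, got {word_count}"
        )
    return text


# A short noun phrase passes through unchanged.
print(check_plural_input("apple pie"))

# The sentence from the report is rejected early with a clear error.
try:
    check_plural_input("I'm only here for a minute, John.")
except ValueError as exc:
    print("ValueError:", exc)
```

A caller would run this guard before handing the text to engine.plural, so that unsupported input fails with a ValueError rather than an IndexError from deep inside the call.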