jaraco / inflect

Correctly generate plurals, ordinals, indefinite articles; convert numbers to words
https://pypi.org/project/inflect
MIT License
971 stars 106 forks source link

compare() fails on empty strings #157

Closed Julian-O closed 2 years ago

Julian-O commented 2 years ago

The code:

result = inflect.engine().compare("dice", "")

raises an IndexError exception. It should return False.

The code:

result = inflect.engine().compare("dice", None)

raises a TypeError exception. Arguably, it should return False, but maybe the current result is acceptable.

Note that:

result = inflect.engine().compare("", "")

correctly returns 'eq'

jaraco commented 2 years ago

Thanks for the report. I think I agree that the first case should return False. I'd welcome a test and patch to fix.

jaraco commented 2 years ago

On second thought, I'm not certain this report describes a bug. If compare expects to be passed words or phrases, the empty string is arguably not a valid word or phrase and it could be the responsibility of the caller to ensure that inputs are valid. Can you make an argument why an empty string is a meaningful input for compare()?

Julian-O commented 2 years ago

Can you make an argument why an empty string is a meaningful input for compare()?

One argument is a theoretical argument about good Software Engineering practices:

The doc string for compare() explains its return values are "eq", "p:s", "s:p", "p:p" or otherwise, False.

A TypeError is none of those; this is a bug where the software is not meeting the contract in its interface.

If you want to raise a TypeError, then you should specify when it is going to be raised. If you want to say "only valid words or phrases" are acceptable, then you should specify what that term means.

Another argument is to provide a Real-Life Use Case where this bit me.

[This isn't the exact use case, but illustrative.] I am comparing quiz answers given by a student to the official answers given by a teacher. If the question asks "What is the main ingredient in an omelette?", and the official answer is "Eggs", I want the student to be rewarded for typing "egg" or "eggs". If the student leaves the answer empty, it is a false match, not a reason to crash the software (which is what happened during Alpha testing)

Through absolutely no fault of yours, I have recently had too many dealings with OSS maintainers who have taken a nerdview approach to their interfaces, and have pushed back on improvements for making their tools better for others. I am trying to both be helpful, and at the same time, politely as possible, avoid being dragged in.

I left this bug report as an observation that might be helpful for making the tool better for others. On the other hand, I have added a check for None at my end; things are working for me now. I don't need this bug to be fixed for me.

jaraco commented 2 years ago

Thanks for the explanation. That really helps understand (and document) the motivation for any change. I meant it not as a challenge, but as a way to align our expectations.

The doc string for compare() explains its return values...

True, but it's not common to document all exceptions that a function might raise. Still, a TypeError isn't a great experience for the empty string. If an empty string is disallowed, it should raise a ValueError or maybe another exception with more context about the disallowed input. For the None value, a TypeError seems like the right error. I'll probably add some type annotations to communicate that strings are required for the parameters.

dealings with OSS maintainers who have taken a nerdview approach to their interfaces

This project too takes the nerdview approach for much of its interface, mainly because it's not meant to be a user interface, but to be a library for developers and historically puts a lot of onus on the developer to provide clean inputs.

jaraco commented 2 years ago

In #165 and subsequently in #166, I've implemented argument validation using Pydantic. I'm leaning toward simply disallowing empty strings as arguments and making that a documented and tested expectation. I realize that puts (keeps) the onus on the caller to clean their inputs, but I believe that's the right responsibility for this library. It doesn't strike me as meaningful to compare or inflect or pluralize the empty string.