Closed davidhewitt closed 2 weeks ago
Thanks for the approval @LilyFoote!
I just realised: there's a bit of an edge case here: what about when the Bound<'py, PyString>
contains a subclass of str
? That subclass can override inequality, and might not compare equal with strings (although that would be very weird).
What should we do then? One option is to make these implementations return false for subclasses (e.g. do an is_exact_instance_of::<PyString>()
check).
In PyUnicode_EqualToUTF8
there is no subclass checking - https://github.com/python/cpython/blob/42351c3b9a357ec67135b30ed41f59e6f306ac52/Objects/unicodeobject.c#L10789
I'm inclined to proceed with this as-is, but I would also be glad to hear folks' opinions as there is a slight possibility this is a footgun which leads to surprises in the subclass edge case.
I think subclassing built-in types like str
is fairly well known to have edge-cases and oddities like this. Maybe all we need here is a note in the docs.
I think subclassing built-in types like
str
is fairly well known to have edge-cases and oddities like this. Maybe all we need here is a note in the docs.
True! I think to that point, it's very likely that users would write bound_str.to_str()? == "rust string"
which is just as wrong for subclasses as what we implement here. So documenting gotcha on the trait implementations as you suggest and moving on might be fine.
Any other opinions before we proceed?
Thanks both! I've added documentation and will proceed to merge 👍
Idea spurred by #4020 and also what I've just been doing in #4196.
This PR implements comparison between Python
str
and Rust&str
by using:PyUnicode_EqualToUTF8AndSize
on Python 3.13,&str
and doing equality that way on older versions.PyUnicode_EqualToUTF8AndSize
can never fail, so similarly the fallback implementations just returnfalse
on error getting UTF8 data out.