pandas-dev / pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
https://pandas.pydata.org
BSD 3-Clause "New" or "Revised" License
42.62k stars 17.57k forks

Add "N / A" to `STR_NA_VALUES` #59150

Open NickCrews opened 4 days ago

NickCrews commented 4 days ago

This is a value that I just came across in some data, and it is in the spirit of the other NA values. I don't think it will be a false positive any more than some of the other values in this list.

NickCrews commented 4 days ago

PS: I think it is usually bad taste for a test to copy the contents of the implementation exactly like this. Can I change the test to reference `STR_NA_VALUES` directly and remove `_NA_VALUES` from it?

WillAyd commented 3 days ago

Thanks for the PR, but it's not really a feasible goal for us to track down every NA value a user could come across, and dealing with whitespace like this is not something I see as adding value to pandas. I think you may just need to handle this in your own process.
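
For reference, handling this on the caller's side might look like the minimal sketch below. It assumes the data is read with `pd.read_csv` and relies on the `na_values` parameter, which is added to the built-in NA strings while `keep_default_na` stays at its default of `True`. The sample data, `EXTRA_NA_VALUES`, and `read_with_extra_na` are hypothetical names used only for illustration.

```python
import io

import pandas as pd

# Hypothetical sample data: "N/A" is already a default NA string,
# while "N / A" (with spaces) is the value discussed in this PR.
CSV_TEXT = "name,score\nalice,1.5\nbob,N / A\ncarol,N/A\n"

# Project-specific additions (an assumption, not part of pandas).
EXTRA_NA_VALUES = ["N / A"]


def read_with_extra_na(text):
    """Parse CSV text, treating EXTRA_NA_VALUES as missing in addition to the defaults."""
    # keep_default_na defaults to True, so na_values extends the
    # built-in list instead of replacing it.
    return pd.read_csv(io.StringIO(text), na_values=EXTRA_NA_VALUES)


df = read_with_extra_na(CSV_TEXT)
print(df["score"].isna().sum())
```

With the extra value supplied, both the `bob` and `carol` rows parse as missing; with a plain `pd.read_csv` call only the `carol` row does.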

NickCrews commented 3 days ago

That is definitely true; I already have a selection of additional values I check for. This is just the most egregious one, which I thought would be applicable for everyone.

Even if we don't get to perfection, I think this PR still improves the situation and is a good contribution. Are you worried that this list will slowly balloon to 100 different examples? I'm trying to figure out if there is some mitigation I can do to address your concern. Thank you, I appreciate your time and effort!