Closed munyanjacob closed 2 years ago
A pull request sounds good to me!
FYI it sounds like you can avoid this kind of issue on Windows, if you're running Python 3.7+, by setting the -X utf8 command line option or the PYTHONUTF8 environment variable, and that's probably the best default for most users since it makes Windows match the behavior of Mac and Linux. But no harm in patching reporters-db as well.
Something is funky here, no? I thought Python was always utf-8 these days unless overridden? It'd surprise me if Windows is overriding this by default. (I'm not an expert here.)
Something is funky here, no? I thought Python was always utf-8 these days unless overridden?
You're right - the document jcushman sent says Python uses utf-8, and my default encoder is utf-8, but it turns out there is an exception where Python's open() uses the locale encoder. Although all of my locale variables are set to use utf-8, my locale encoder is cp1252:
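A quick way to compare the two encodings is sketched below. It only uses the standard library; the variable names are just for illustration, and the printed values depend on the machine (cp1252 is what a typical Windows setup without UTF-8 mode would likely report).

```python
import locale
import sys

# Encoding used by str.encode() / bytes.decode() when none is given.
# This is "utf-8" on Python 3.
default_encoding = sys.getdefaultencoding()

# Encoding that open() falls back to in text mode when encoding= is omitted.
# It comes from the locale, so it can differ from the default above.
locale_encoding = locale.getpreferredencoding(False)

print("default encoding:", default_encoding)
print("locale encoding: ", locale_encoding)
```

If the two lines differ, any open() call without an explicit encoding= reads files with the second one.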
Adding -X utf8 or using export PYTHONUTF8=1 seem to work because they override the locale setting:
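One way to see the override, as a rough sketch: start a child interpreter with each option and ask it whether UTF-8 mode is on and what locale encoding it reports. The helper name run_child and the SNIPPET constant are made up for this example; it assumes Python 3.7+.

```python
import os
import subprocess
import sys

# Code run in the child: print the UTF-8 mode flag and the locale encoding.
SNIPPET = (
    "import sys, locale; "
    "print(sys.flags.utf8_mode, locale.getpreferredencoding(False).lower())"
)

def run_child(extra_args=(), extra_env=None):
    """Run SNIPPET in a fresh interpreter and return its two output fields."""
    env = dict(os.environ)
    env.update(extra_env or {})
    result = subprocess.run(
        [sys.executable, *extra_args, "-c", SNIPPET],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.split()

# Same effect via the command line option and via the environment variable.
with_flag = run_child(extra_args=["-X", "utf8"])
with_env = run_child(extra_env={"PYTHONUTF8": "1"})

print("with -X utf8:     ", with_flag)
print("with PYTHONUTF8=1:", with_env)
```

In both cases the child should report UTF-8 mode as enabled and utf-8 as the locale encoding, regardless of what the parent process reports.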
I will make a pull request adding encoding="utf-8" when loading laws.json in case other users are cursed by weird Windows locale encoders. Thanks both of you!
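For anyone following along, the kind of change meant here looks roughly like the sketch below. This is not the actual reporters-db code: load_json_utf8 and the sample file name are hypothetical, and the temporary file only stands in for laws.json.

```python
import json
import os
import tempfile

def load_json_utf8(path):
    """Load a JSON file as UTF-8, ignoring the platform's locale encoding."""
    # Passing encoding= explicitly keeps open() from using the locale encoder.
    with open(path, encoding="utf-8") as f:
        return json.load(f)

# Stand-in for laws.json: a small UTF-8 file containing a section symbol.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "laws_sample.json")
    with open(sample_path, "wb") as f:
        f.write('{"symbol": "§"}'.encode("utf-8"))
    data = load_json_utf8(sample_path)

print(data)
```

With the explicit encoding, the result should be the same on Windows, Mac, and Linux, with or without UTF-8 mode.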
Thank you again. I think I just muddied the waters, so sorry about that. Happy to have any other fixes like this that you see.
My fork fails tests.py due to all § symbols in laws.json being converted to Â§ when imported. Updating __init__.py to open laws.json using encoding="utf-8" fixes it and allows it to pass the tests. I am on Windows, so maybe that is why it only seems to be an issue for me. I have not tried reproducing this on other machines, but it is reproducible on mine. I first discovered it while using eyecite.
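The garbling can be reproduced without the repository at all. The sketch below assumes the file is stored as UTF-8 and is being read with cp1252, which matches the symptom; the variable names are just for illustration.

```python
# "§" is stored as two bytes in UTF-8.
raw = "§".encode("utf-8")

# Reading those two bytes as cp1252 turns each byte into its own character.
garbled = raw.decode("cp1252")

# Reversing the wrong decode recovers the original symbol.
repaired = garbled.encode("cp1252").decode("utf-8")

print(raw, garbled, repaired)
```

This is the same thing open() does when it falls back to a cp1252 locale encoding on a UTF-8 file.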
I'm new to open source - would this have been a good thing to create a pull request for? Please give any advice or feedback! Thanks