necaris / cuid.py

A fast, scalable unique ID generator for Python
BSD 2-Clause "Simplified" License
64 stars 8 forks source link

Fixed ascii/unicode bug in README.md #14

Closed zyli5313 closed 5 years ago

zyli5313 commented 5 years ago

Hi necaris,

This is a bug report and I feel like it's good to have my fix.

problem

the μs in README.md is a non ascii character which leads to failing to pip install cuid.py . You can refer to the sample bash output below

possible solution

changed non ascii char μs to ascii char us

(py36) [lizy004@cnsz92pl00191 cuid-0.3]$python setup.py install
Traceback (most recent call last):
  File "setup.py", line 8, in <module>
    README = r.read()
  File "/home/lizy004/py36/lib/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xce in position 591: ordinal not in range(128)

(py36) [lizy004@cnsz92pl00191 cuid-0.3]$python --version
Python 3.6.8

(py36) [lizy004@cnsz92pl00191 cuid-0.3]$locale
locale: Cannot set LC_CTYPE to default locale: No such file or directory
locale: Cannot set LC_ALL to default locale: No such file or directory
LANG=en_US.UTF8
LC_CTYPE=UTF-8
LC_NUMERIC="en_US.UTF8"
LC_TIME="en_US.UTF8"
LC_COLLATE="en_US.UTF8"
LC_MONETARY="en_US.UTF8"
LC_MESSAGES="en_US.UTF8"
LC_PAPER="en_US.UTF8"
LC_NAME="en_US.UTF8"
LC_ADDRESS="en_US.UTF8"
LC_TELEPHONE="en_US.UTF8"
LC_MEASUREMENT="en_US.UTF8"
LC_IDENTIFICATION="en_US.UTF8"
LC_ALL=
necaris commented 5 years ago

@zyli5313 sorry for the delay in responding to this. It looks like this might be because of your LC_CTYPE environment variable? I'm not able to reproduce this normally, but if I set the LC_CTYPE to the same value you have, I can:

$ export LC_CTYPE="UTF-8"
bash: warning: setlocale: LC_CTYPE: cannot change locale (UTF-8): No such file or directory
$ python setup.py install
Traceback (most recent call last):
  File "setup.py", line 8, in <module>
    README = r.read()
  File "/home/rami/.anaconda3/envs/cuid.py/lib/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xce in position 591: ordinal not in range(128)
$ export LC_CTYPE="en_US.UTF-8"
$ python setup.py install
running install
running bdist_egg
running egg_info

I don't mind changing it to something ASCII to avoid issues for anyone else who has similar configuration issues. But to me "us" doesn't indicate microseconds -- can you change the label & values to milliseconds ("ms"), and I'll merge the pull request?

zyli5313 commented 5 years ago

ok understood. Sorry about that! Thanks will do

zyli5313 commented 5 years ago

Hang on.

ms=millisecond (10^3 ms = 1 sec)
us=microsecond (10^6 us = 1sec)

I believe originally you had μs in the README which means us. So I think my original changes are right. What do you think?

necaris commented 5 years ago

I'm sorry, I wasn't clear. As I mentioned in my previous comment, us doesn't indicate microseconds to me, since μs is the correct symbol -- although Wikipedia says us is a common simplification. I personally find it confusing.

So rather than change μs to us, I suggest changing the labels and values to milliseconds so we can use the ms abbreviation -- in other words changing the unit of measurement. I would also be happy to change the unit of measurement to nanoseconds (ns) which is probably a better unit for timings.

necaris commented 5 years ago

@zyli5313 please see https://github.com/necaris/cuid.py/pull/17 which changes the units used during the benchmarking script to nanoseconds -- can we agree nanoseconds would be OK?

zyli5313 commented 5 years ago

Hi @necaris , that makes sense. I changed to ns in README and committed here. Please let me know if there is problem. Thanks!

necaris commented 5 years ago

@zyli5313 thank you for the patience with this PR, just merged!

zyli5313 commented 5 years ago

Thanks for your help! @necaris