stsewd closed 6 years ago
I forgot to make it py3 compatible
You can also look at some of the tests that probably already mock out the find_one calls. You shouldn't need to depend on a local repo file for the test, though this isn't a huge problem.
I tried to use apply_fs to reproduce the bug, but I wasn't able to, even when copying the file name from this test, and even when manually creating a file with the same name (copied from the one that fails).
I think this is probably due to a bad encoding or a corrupted file from the user? Even on my OS the file isn't shown correctly.
So you probably aren't able to reproduce this easily as your encoding is not ascii. Here is what the servers give us back:
>>> import sys
>>> sys.getdefaultencoding()
'ascii'
That is bad, and in fact, I'm not sure why that's happening, as our locale is set:
$ locale
LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=en_US.UTF-8
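One thing worth checking here: as far as I know, `sys.getdefaultencoding()` on Python 2 reports `'ascii'` no matter what the locale says, and the locale shows up instead in `sys.getfilesystemencoding()` and `locale.getpreferredencoding()`. A minimal sketch that prints all three side by side (the helper name `encoding_report` is made up for illustration):

```python
import locale
import sys


def encoding_report():
    """Collect the encodings Python uses in different places."""
    return {
        # Default for str.encode()/bytes.decode(); 'ascii' on Python 2,
        # 'utf-8' on Python 3, independent of the locale.
        "default": sys.getdefaultencoding(),
        # Used to convert file names to and from bytes.
        "filesystem": sys.getfilesystemencoding(),
        # Derived from the locale settings (LANG, LC_ALL, ...).
        "preferred": locale.getpreferredencoding(False),
    }


for name, value in sorted(encoding_report().items()):
    print("{}: {}".format(name, value))
```

If `filesystem` and `preferred` report UTF-8 while `default` reports ascii, the locale is probably being picked up correctly and the ascii value is just the interpreter default.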
What you're testing might not be the right thing to test. Also, you are having trouble creating a file or path with this codepoint because your encoding is likely 'utf-8' already. With the encoding ascii, I have no problem doing this with py.path and apply_fs. If you try the following on a box with default utf8 encoding, you'll get a file with a codepoint that isn't 0xf0:
In [1]: import os
In [2]: os.mkdir('/tmp/foo/')
In [3]: os.mkdir('/tmp/foo/f\xf0\xf0')
In [4]: ls /tmp/foo
f??/
In [5]: os.listdir('/tmp/foo')
Out[5]: ['f\xf0\xf0']
To make this more confusing, however, even with this I'm not able to reproduce the problem with find_all:
In [18]: path
Out[18]: local('/tmp/foo')
In [19]: path.listdir()
Out[19]: [local('/tmp/foo/f\xf0\xf0'), local('/tmp/foo/bo\xf0')]
In [20]: list(find_all('/tmp/foo', ['readthedocs.yml']))
Out[20]: []
So, I'm pretty confused at this point.
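For comparison, a minimal Python 3 sketch of the directory-creation experiment above. It assumes a POSIX filesystem that accepts arbitrary bytes in names (most Linux filesystems do, macOS may refuse) and uses a temporary directory instead of /tmp/foo. Passing bytes paths puts the raw 0xf0 bytes on disk, while encoding the text 'f\xf0\xf0' as UTF-8 produces different bytes.

```python
import os
import tempfile

# Raw name with two 0xf0 bytes; this is not valid UTF-8.
RAW_NAME = b"f\xf0\xf0"

# Encoding the *text* 'f\xf0\xf0' as UTF-8 yields other bytes entirely.
utf8_name = "f\xf0\xf0".encode("utf-8")
print(utf8_name)  # b'f\xc3\xb0\xc3\xb0'

# bytes in, bytes out: no decoding happens along the way.
base = os.fsencode(tempfile.mkdtemp())
os.mkdir(os.path.join(base, RAW_NAME))
listing = os.listdir(base)
print(listing)

# With str paths, Python 3 escapes undecodable bytes as lone surrogates;
# os.fsencode() turns such a name back into the original bytes.
for text_name in os.listdir(os.fsdecode(base)):
    print(ascii(text_name), os.fsencode(text_name))
```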
So you probably aren't able to reproduce this easily as your encoding is not ascii. Here is what the servers give us back:
I got the same on my local instance:
>>> import sys
>>> sys.getdefaultencoding()
'ascii'
I asked the user for more information https://github.com/rtfd/readthedocs.org/issues/3732#issuecomment-371110187, and I think that file was probably generated with some weird encoding that only Windows understands (I have seen some Windows files showing up with an invalid encoding on my machine a couple of times). Also, now that I remember, a couple of weeks ago I was helping a German friend set up his rtd instance; he was using Windows and faced some similar problems with encoding (but in another part of the build).
The fix for python encoding being crazy is "use python3". I've tried to fix the server encodings a million times, and it doesn't work. We just need to move to an all UTF-8 world.
Looks like a good test to have, so going to merge this.
Test to expose fix for #27
This test was a little hard to figure out, since I wasn't able to use the name in the py file, so I had to add an actual file (I took the file from https://github.com/rtfd/readthedocs.org/issues/3732#issuecomment-370650285) and I had to make it py2 and py3 compatible. Please let me know if there is a better way.
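One possible alternative, sketched here under assumptions: create the badly named file at test time from raw bytes rather than committing it to the repo. This assumes Python 3 and a filesystem that accepts arbitrary bytes in names. `find_by_name` below is a hypothetical stand-in written for this example, not the project's find_all, whose implementation is not shown in this thread.

```python
import os
import tempfile


def find_by_name(root, names):
    """Hypothetical finder: yield paths under root whose basename is in names.

    Everything is handled as bytes so undecodable names never get decoded.
    """
    wanted = {os.fsencode(name) for name in names}
    for dirpath, _dirnames, filenames in os.walk(os.fsencode(root)):
        for filename in filenames:
            if filename in wanted:
                yield os.path.join(dirpath, filename)


def test_find_with_undecodable_filename():
    root = tempfile.mkdtemp()
    # Create a file whose name contains a byte that is not valid UTF-8.
    bad_path = os.path.join(os.fsencode(root), b"bo\xf0.py")
    with open(bad_path, "wb"):
        pass
    config = os.path.join(root, "readthedocs.yml")
    with open(config, "w"):
        pass
    # The walk must not crash on the bad name and must still find the config.
    found = list(find_by_name(root, ["readthedocs.yml"]))
    assert found == [os.fsencode(config)]
```

The same setup could be pointed at the real finder in place of `find_by_name`; whether it reproduces the original failure depends on how that function handles paths.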