flying-circus / pyfilesystem

Automatically exported from code.google.com/p/pyfilesystem
BSD 3-Clause "New" or "Revised" License

MemoryFS doesn't support non-ascii encodings #155

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Instantiate a fs.memoryfs.MemoryFS
2. Open a file from the MemoryFS instance with a write mode
3. Write a Unicode string containing characters that cannot be represented in ASCII.

What is the expected output? What do you see instead?

MemoryFS should accept arbitrary Unicode strings as file contents. 
Instead, it raises a UnicodeEncodeError while trying to encode the written 
string as ASCII.
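
The expected behaviour can be sketched independently of pyfilesystem. The class below is a hypothetical stand-in, not MemoryFS itself: `TinyMemoryFile` and its methods are names invented for illustration, and it is written for Python 3. It assumes the store should encode text explicitly as UTF-8 rather than rely on a default ASCII codec.

```python
import io


class TinyMemoryFile:
    """Hypothetical in-memory file that stores text as UTF-8 bytes."""

    def __init__(self, encoding="utf-8"):
        self._buffer = io.BytesIO()
        self._encoding = encoding

    def write(self, text):
        # Encode explicitly so non-ASCII characters never hit an ascii default.
        self._buffer.write(text.encode(self._encoding))
        # Report the number of characters written, not the number of bytes.
        return len(text)

    def getcontents(self, text_mode=False):
        raw = self._buffer.getvalue()
        # Return decoded text on request, raw bytes otherwise.
        return raw.decode(self._encoding) if text_mode else raw


f = TinyMemoryFile()
written = f.write("私は学生です")
print(written, len(f.getcontents()))
```

Choosing the encoding at construction time keeps the byte representation an explicit decision of the store instead of a side effect of the interpreter's defaults.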

What version of the product are you using? On what operating system?

On Mac OS X Lion, CPython 2.7.3, with a copy of pyfilesystem from July 2nd 
(forked to a private repository but not meaningfully changed).

Please provide any additional information below.

I was able to resolve my issue by simply placing

from __future__ import unicode_literals

at the top of memoryfs.py. I'm not sure whether that has implications for 
other clients.

Original issue reported on code.google.com by abishopr...@box.com on 10 Jul 2013 at 4:17

GoogleCodeExporter commented 9 years ago
Seems to work for me. Maybe it was fixed since you forked.

>>> from fs.memoryfs import *
>>> m=MemoryFS()
>>> f=m.open('jp.txt', 'w')
>>> f.write(u'私は学生です')
6L
>>> f.close()
>>> m.tree()
╰── jp.txt
>>> m.getcontents('jp.txt')
'\xe7\xa7\x81\xe3\x81\xaf\xe5\xad\xa6\xe7\x94\x9f\xe3\x81\xa7\xe3\x81\x99'
>>> m.getcontents('jp.txt', 'rt')
u'\u79c1\u306f\u5b66\u751f\u3067\u3059'
>>> print _
私は学生です
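
As a cross-check on the session above, the byte string returned by getcontents is the UTF-8 encoding of the six characters that were written. The snippet below is a small sketch in Python 3 (the session itself is Python 2) that compares the two values copied from the output; the variable names are illustrative only.

```python
# Text and bytes copied from the interactive session above.
text = "\u79c1\u306f\u5b66\u751f\u3067\u3059"
expected = (
    b"\xe7\xa7\x81\xe3\x81\xaf\xe5\xad\xa6"
    b"\xe7\x94\x9f\xe3\x81\xa7\xe3\x81\x99"
)

encoded = text.encode("utf-8")     # text -> bytes
decoded = encoded.decode("utf-8")  # bytes -> text

# Six characters, each taking three bytes in UTF-8.
print(len(text), len(encoded), encoded == expected)
```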

Can you still reproduce the error?

Original comment by willmcgugan on 3 Sep 2013 at 1:17

GoogleCodeExporter commented 9 years ago
I copied in the most recent version of pyfilesystem and, using your script, got 
the same error (I moved the example script directly into a copy of the 
read-only svn checkout of pyfilesystem):

# -*- coding: utf-8 -*-

from memoryfs import *
m=MemoryFS()
f=m.open('jp.txt', 'w')
x = u'私は学生です'
f.write(x)

f.close()
m.tree()

--> UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-5: ordinal not in range(128)

Are you using a different encoding at the top of the file? I'm also on Python 
2.7.3, so I'm not getting unicode literals everywhere for free.
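
For reference, the same message can be produced without pyfilesystem by encoding the string as ASCII directly. This is a minimal Python 3 sketch, not the code path inside memoryfs; it only shows how the "position 0-5" range in the error maps onto the six characters of the string.

```python
text = "私は学生です"
error = None

try:
    # Every character is outside the ASCII range, so this always fails.
    text.encode("ascii")
except UnicodeEncodeError as exc:
    error = exc

# start is inclusive and end is exclusive, so the message reports 0-5.
print(error.encoding, error.start, error.end, error.reason)
print(error)
```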

Original comment by abishopr...@box.com on 6 Sep 2013 at 3:30

GoogleCodeExporter commented 9 years ago
That works fine for me. Only difference is I'm on Linux.

That isn't a reliable way to test the svn version, though: memoryfs has a number 
of "from fs." imports, which will load the installed code rather than your 
checkout. It's best to run "python setup.py develop".
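
One way to see which copy of a package Python would load is to look up where it resolves on the import path. The helper below is a hedged sketch: `module_origin` is a hypothetical name, and it uses Python 3's importlib. On Python 2.7, printing `fs.__file__` after `import fs` gives the same information.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# A standard-library package, used here only to show the shape of the result.
print(module_origin("json"))

# For pyfilesystem this shows whether "fs" resolves to the installed copy
# or to a development checkout; it prints None if fs is not installed.
print(module_origin("fs"))
```

If the printed path points into site-packages rather than the checkout, the "from fs." imports are picking up the installed code.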

Original comment by willmcgugan on 6 Sep 2013 at 4:00

GoogleCodeExporter commented 9 years ago
OK, I'll keep digging on this. Thanks for helping, Will.

Original comment by abishopr...@box.com on 6 Sep 2013 at 4:05