piranna / pyfilesystem

Automatically exported from code.google.com/p/pyfilesystem
BSD 3-Clause "New" or "Revised" License

allowing OSFS.open to specify 'U' mode #160

Closed by GoogleCodeExporter 9 years ago

GoogleCodeExporter commented 9 years ago
Hi All,

Awesome projects. It is going to be soooooo useful.

I recently had a slight issue with opening a CSV file with OSFS. The CSV 
reader had a hissy fit about line endings. The solution was to open it with the 
mode set to 'rU'. Currently OSFS.open does not allow the U mode.

Attached is a super simple patch to allow the 'U' mode, which fixed my issue.

Original issue reported on code.google.com by Xir...@gmail.com on 3 Sep 2013 at 1:47

GoogleCodeExporter commented 9 years ago
Is this something specific to OSFS, or could/should a similar patch be applied 
to other FSes too?

PyFileSystem does its best to try and treat all FSes identically as much as 
possible :)

Original comment by gc...@loowis.durge.org on 3 Sep 2013 at 2:21

GoogleCodeExporter commented 9 years ago
The U mode is not recommended for new code, according to the docs. It's also 
not implemented in a variety of other file-like objects, which is why we don't 
support it.

The 'newline' parameter replaces it. Setting it to None should be the same as U 
mode (for text files). This is actually the default, so I'm puzzled why this 
patch would fix your issue.
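
As a rough illustration of the 'newline' parameter described above, here is a minimal sketch using only the standard library. It wraps some made-up CR-only bytes in io.TextIOWrapper (the text layer that io.open returns) rather than going through OSFS, so the sample data and the helper name `text_lines` are assumptions for the example, not part of PyFilesystem.

```python
import io

# Hypothetical sample: two records terminated by bare carriage returns.
CR_ONLY = b"name,age\rann,30\r"


def text_lines(data, newline):
    """Decode bytes through the io text layer and return the lines it yields."""
    wrapper = io.TextIOWrapper(io.BytesIO(data), encoding="utf-8", newline=newline)
    return list(wrapper)


# newline=None: universal newlines, every ending is translated to "\n".
translated = text_lines(CR_ONLY, None)

# newline="": lines are still split on "\r", but the endings are left as-is.
preserved = text_lines(CR_ONLY, "")

# newline="\n": only "\n" ends a line, so the CR-only data is a single line.
lf_only = text_lines(CR_ONLY, "\n")
```

With `None` or `""` the bare carriage returns are treated as line breaks; only an explicit `"\n"` leaves the data as one long line.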

What error message was the CSV code giving you?

Original comment by willmcgugan on 3 Sep 2013 at 12:43

GoogleCodeExporter commented 9 years ago
This stack overflow question by someone else basically sums up the issue I was 
having:

http://stackoverflow.com/questions/6726953/open-the-file-in-universal-newline-mode-using-csv-module-django

If there is an alternative solution I'd be more than glad to hear it.

Cheers,
Jon

Original comment by Xir...@gmail.com on 4 Sep 2013 at 5:08

GoogleCodeExporter commented 9 years ago
Can you post the original error you were getting and, if possible, enough of the 
CSV for me to reproduce it? Please also let us know which versions of Python and 
PyFilesystem you are using. I can't seem to reproduce the issue here.

Hacking in the U mode may appear to have worked, but I suspect you will get a 
different class of errors with other CSV files.

Original comment by willmcgugan on 4 Sep 2013 at 8:54

GoogleCodeExporter commented 9 years ago
The original error I am getting is:

Traceback (most recent call last):
  File "/home/mossj/.virtualenvs/test-passbook/local/lib/python2.7/site-packages/django/core/management/base.py", line 222, in run_from_argv
    self.execute(*args, **options.__dict__)
  File "/home/mossj/.virtualenvs/test-passbook/local/lib/python2.7/site-packages/django/core/management/base.py", line 255, in execute
    output = self.handle(*args, **options)
  File "/home/mossj/Workspaces/test-passbook/www/test/management/commands/import_passbook_users.py", line 36, in handle
    report_path = self.process_file(user_file)
  File "/home/mossj/Workspaces/test-passbook/www/test/management/commands/import_passbook_users.py", line 72, in process_file
    for row in reader:
  File "/usr/lib/python2.7/csv.py", line 103, in next
    self.fieldnames
  File "/usr/lib/python2.7/csv.py", line 90, in fieldnames
    self._fieldnames = self.reader.next()
Error: new-line character seen in unquoted field - do you need to open the file in universal-newline mode?

I cannot send the original file as it contains private info. I have, however, 
created a mock file that suffers from the same issue. It should be attached. The 
line endings appear to be 0x0D characters, i.e. just carriage returns (I blame 
my Mac-wielding colleague ;-).
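
One way to confirm which line endings a file uses is to count the terminators in its raw bytes. The sketch below is only an illustration: the helper name `count_line_endings` and the sample bytes are made up for the example and stand in for the private file.

```python
def count_line_endings(data):
    """Count CRLF, bare CR (0x0D), and bare LF (0x0A) terminators in bytes."""
    crlf = data.count(b"\r\n")
    cr = data.count(b"\r") - crlf  # CRs that are not part of a CRLF pair
    lf = data.count(b"\n") - crlf  # LFs that are not part of a CRLF pair
    return {"crlf": crlf, "cr": cr, "lf": lf}


# Hypothetical CR-only sample, similar in shape to the mock file described above.
sample = b"name,age\rann,30\rbob,41\r"
counts = count_line_endings(sample)
```

For a real file, the bytes would come from reading it in binary mode, for example `open(path, 'rb').read()`.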

I am using Python 2.7.4 and PyFilesystem 0.4.0.

Cheers,
Jon

Original comment by Xir...@gmail.com on 5 Sep 2013 at 3:27

GoogleCodeExporter commented 9 years ago
Hi Jon,

I think I've figured this out, and it's a bit messy. According to the docs, csv 
on Py2.7 requires bytes and not unicode. But the io module (which is what osfs 
uses) doesn't do any kind of universal newline processing in binary mode.
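
The difference between the two modes can be sketched with the standard library alone. This is an approximation written for Python 3 (where csv takes str, so the binary "line" is decoded before parsing) and it uses in-memory streams instead of OSFS; the sample data and the `parse` helper are assumptions made for the example.

```python
import csv
import io

# Hypothetical CR-only sample data.
CR_ONLY = b"name,age\rann,30\r"

# Binary mode: iteration splits on b"\n" only, so the whole file is one "line".
binary_lines = list(io.BytesIO(CR_ONLY))

# Text mode with newline="": bare carriage returns are recognised as line breaks.
text_lines = list(
    io.TextIOWrapper(io.BytesIO(CR_ONLY), encoding="utf-8", newline="")
)


def parse(lines):
    """Return the parsed rows, or the csv.Error raised while parsing."""
    try:
        return list(csv.reader(lines))
    except csv.Error as exc:
        return exc


# One chunk with embedded carriage returns: csv rejects it.
bad = parse(line.decode("utf-8") for line in binary_lines)

# One chunk per record: csv parses it normally.
good = parse(text_lines)
```

Here `bad` ends up holding a `csv.Error`, while `good` holds the two parsed rows.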

Given that we can't support U mode, the solution would be to open in text mode 
(which enables universal line endings) and encode each line as utf-8. Something 
like this:

import csv

with osfs.open('members.csv', 'r', newline='') as f:
    # Python 2's csv needs byte strings, so encode each unicode line
    for row in csv.reader(line.encode('utf-8') for line in f):
        print row

Bear in mind that the output of that is going to be a list of byte strings, so 
you will need to decode utf-8 if you want to support unicode in csvs.

On Python 3, you wouldn't need the line.encode because the csv module actually 
works with unicode there.
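
For comparison, a Python 3 version might look like the sketch below. It uses the built-in open on a temporary file instead of OSFS so that it runs with the standard library alone; it assumes OSFS.open accepts the same mode and newline arguments, as in the snippet above. The helper name `read_csv_rows` and the sample data are hypothetical.

```python
import csv
import os
import tempfile


def read_csv_rows(path):
    """Parse a CSV file on Python 3, whatever its line-ending style."""
    # newline="" leaves the endings untouched so csv can handle them itself.
    with open(path, "r", encoding="utf-8", newline="") as f:
        return list(csv.reader(f))


# Hypothetical CR-only sample standing in for the private file.
fd, path = tempfile.mkstemp(suffix=".csv")
with os.fdopen(fd, "wb") as raw:
    raw.write(b"name,age\rann,30\r")

try:
    rows = read_csv_rows(path)
finally:
    os.remove(path)
```

No encode or decode step is needed here because csv reads and returns str on Python 3.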

Will

Original comment by willmcgugan on 5 Sep 2013 at 9:19

GoogleCodeExporter commented 9 years ago
Hi Will,

Understood, and thanks for your efforts. You're right, it is a bit messy, but 
leaving out the deprecated U option does make sense. Thanks for your 
solution.

Original comment by Xir...@gmail.com on 24 Sep 2013 at 11:58

GoogleCodeExporter commented 9 years ago
Happy to help.

Original comment by willmcgugan on 25 Sep 2013 at 9:36