thouis / numpy-trac-migration

numpy Trac to github issues migration

genfromtxt issue with EOL and/or unicode (migrated from Trac #1473) #3025

Closed thouis closed 12 years ago

thouis commented 12 years ago

Original ticket http://projects.scipy.org/numpy/ticket/1473 Reported 2010-05-04 by atmention:vincentdavis, assigned to unknown.

Basic problem: a file saved as a CSV from Excel on a Mac cannot be opened with genfromtxt. The following works, but should not be expected to be necessary:

f = file('x.csv', 'U')
genfromtxt(f, ...)
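For readers on Python 3, where the `file` builtin no longer exists and the 'U' mode flag was removed in 3.11, the same workaround can be sketched with a text-mode `open`, which translates '\r', '\n' and '\r\n' to '\n' when `newline=None` (the default). The sample bytes and the temporary file below are made up for illustration and are not the attachment from this ticket.

```python
import os
import tempfile

import numpy as np

# Hypothetical sample: a small CSV that uses '\r' as its only line ending.
raw = b"date,gpd,temp,precip\r1/1/00,8021472,52,0.02\r1/2/00,9496016,46,0.06\r"

with tempfile.NamedTemporaryFile(suffix=".csv", delete=False) as tmp:
    tmp.write(raw)
    path = tmp.name

try:
    # newline=None turns on universal newlines, so every '\r' is read as '\n'.
    with open(path, "r", newline=None, encoding="utf-8") as f:
        data = np.genfromtxt(f, delimiter=",", skip_header=1,
                             dtype=None, encoding="utf-8")
finally:
    os.remove(path)

print(data)
```

Passing an already-open handle keeps control of newline handling with the caller, which is the same idea as the `file('x.csv', 'U')` workaround above.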

Stéfan van der Walt has fixed the same issue in loadtxt: http://projects.scipy.org/numpy/changeset/8375

Long story: I ran into this issue and it was discussed on the pystatsmodels mailing list. Here is the setup: running Mac OS X 10.6, using Office 2008, and saving a spreadsheet as a CSV file with Excel's "Save As".

Trying to import the file using genfromtxt fails and reports an EOL error. I thought this was because the EOL was wrong: it seems the file has '\r' as the line ending (this may be wrong). Anyway, I changed it to '\n' and it works fine. I am told (on the pystatsmodels mailing list) that this is actually because the file is in unicode and that genfromtxt does not read the EOL correctly.
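To check which line endings a file really uses, instead of guessing, one option is to count the terminators in the raw bytes and, if needed, rewrite them to '\n'. This is only a sketch: `count_line_endings` and `normalize_line_endings` are hypothetical helper names, not part of numpy, and the sample bytes are invented.

```python
def count_line_endings(raw: bytes) -> dict:
    """Count CRLF, lone CR, and lone LF terminators in raw file bytes."""
    crlf = raw.count(b"\r\n")
    return {
        "CRLF": crlf,
        "CR": raw.count(b"\r") - crlf,  # CRs that are not part of a CRLF
        "LF": raw.count(b"\n") - crlf,  # LFs that are not part of a CRLF
    }


def normalize_line_endings(raw: bytes) -> bytes:
    """Rewrite every CRLF or lone CR as a single LF."""
    # Replace CRLF first so its CR is not converted a second time.
    return raw.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


# Hypothetical sample with '\r' line endings only.
sample = b"date,gpd\r1/1/00,8021472\r1/2/00,9496016\r"
counts = count_line_endings(sample)
fixed = normalize_line_endings(sample)
print(counts)
```

On a real file, the bytes would come from `open(path, 'rb').read()`; a nonzero "CR" count would support the '\r' explanation given above.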

To me it is a bug, because one might expect a user to want to save a file from Excel and read it using genfromtxt. And for users with little experience the problem is not obvious.

I guess this is not a problem with py3?

ORIGINAL ATTEMPT

datatype = [('date','|S9'),('gpd','i8'),('temp','i8'),('precip','f16')]
data = np.genfromtxt('waterdata.csv', delimiter=',', skip_header=1, dtype=datatype)

Traceback (most recent call last):
  File "/Applications/WingIDE.app/Contents/MacOS/src/debug/tserver/_sandbox.py", line 1, in
    Used internally for debug sandbox under external interpreter
  File "/Library/Frameworks/EPD64.framework/Versions/6.1/lib/python2.6/site-packages/numpy/lib/io.py", line 1048, in genfromtxt
    raise IOError('End-of-file reached before encountering data.')
IOError: End-of-file reached before encountering data.

THIS DOES NOT WORK

s = file('data_with_CR.csv','r')
data = np.genfromtxt(s, delimiter=",", skip_header=1, dtype=None)
Traceback (most recent call last):
  File "", line 1, in
  File "/Library/Frameworks/EPD64.framework/Versions/6.1/lib/python2.6/site-packages/numpy/lib/io.py", line 1048, in genfromtxt
    raise IOError('End-of-file reached before encountering data.')
IOError: End-of-file reached before encountering data.
data = np.genfromtxt(s, delimiter=",", , dtype=None)
  File "", line 1
    data = np.genfromtxt(s, delimiter=",", , dtype=None)

THIS DOES WORK

s = file('data_with_CR.csv','U')
data = np.genfromtxt(s, delimiter=",", skip_header=1, dtype=None)
data
array([('1/1/00', 8021472, 52, 0.02), ('1/2/00', 9496016, 46, 0.059999999999999998),
       ('1/3/00', 8478792, 29, 0.0), ...,
       ('12/29/02', 10790000, 61, 0.0), ('12/30/02', 9501000, 44, 0.0),
       ('12/31/02', 9288000, 53, 0.0)],
      dtype=[('f0', '|S8'), ('f1', '<i8'), ('f2', '<i8'), ('f3', '<f8')])

thouis commented 12 years ago

Attachment in Trac by atmention:vincentdavis, 2010-05-04: data_with_CR.csv

thouis commented 12 years ago

Comment in Trac by atmention:pierregm, 2010-05-16

OK, we can force a file to be opened in 'U' mode, which should solve the problem. I implemented that in r8416, let me know how it goes before I close the ticket.

thouis commented 12 years ago

Comment in Trac by atmention:vincentdavis, 2010-07-04

This works for me now. Thanks for the fix, Pierre.

Vincent

thouis commented 12 years ago

Comment in Trac by atmention:rgommers, 2011-03-29

works for me too

thouis commented 12 years ago

Comment in Trac by atmention:mwiebe, 2011-05-31