davedelong / CHCSVParser

A proper CSV parser for Objective-C

Possible problem with sniffencoding? #45

Open hankinsoft opened 11 years ago

hankinsoft commented 11 years ago

Consider the following CSV file: https://mega.co.nz/#!BklVVCSR!JSN_8SIjPfz4eyHdK87H-1U2wyNmC-JkvsDthE-peII

It has 415 lines and no Unicode (non-ASCII) characters until line 150. The following code "fails": [NSArray arrayWithContentsOfCSVFile:@"file path"];

I put "fails" in quotes because the method returns successfully, but the array contains only 149 entries. When the parser reaches the first non-ASCII character, it silently stops. This gives the false impression that everything was successful (and it takes a long time to figure out what is going on :)

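Because the truncation is silent, a quick independent check on the file can help narrow down where the rows go missing. The sketch below uses only Foundation and is not part of CHCSVParser; the helper name `NonEmptyLineCountOfUTF8File` is made up for illustration. It assumes the file is meant to be UTF-8 and that no quoted field contains a line break (otherwise the line count and the row count legitimately differ).

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (not a CHCSVParser API).
// Returns the number of non-empty lines in the file at `path`,
// or -1 if the file cannot be read or its bytes are not valid UTF-8.
static NSInteger NonEmptyLineCountOfUTF8File(NSString *path) {
    NSString *text = [NSString stringWithContentsOfFile:path
                                               encoding:NSUTF8StringEncoding
                                                  error:NULL];
    if (text == nil) {
        return -1;
    }

    __block NSInteger count = 0;
    // enumerateLinesUsingBlock: treats \n, \r and \r\n as line terminators.
    [text enumerateLinesUsingBlock:^(NSString *line, BOOL *stop) {
        if (line.length > 0) {
            count++;
        }
    }];
    return count;
}
```

If this returns a number well above the `count` of the parsed array, the file decodes fine as UTF-8 and the rows are being lost later, during parsing.
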
lupiter commented 11 years ago

I'm experiencing a similar problem, which I 'fixed' by replacing line 78:

 NSStringEncoding encoding = [csv fastestEncoding];

with

 NSStringEncoding encoding = NSUTF8StringEncoding;

This basically reverts dd43455 and may reintroduce issue #21, but it does mean the whole file loads.
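The replacement pins the encoding instead of asking the string for one. A minimal Foundation-only sketch of why a fixed UTF-8 choice is predictable follows. It assumes, which the thread does not state, that the chosen encoding is used to turn the string into bytes before parsing, and it assumes ARC. Both helper names are hypothetical and not CHCSVParser API.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: always encode the CSV text as UTF-8,
// regardless of what -fastestEncoding would report for this string.
static NSData *UTF8DataForCSVString(NSString *csv) {
    return [csv dataUsingEncoding:NSUTF8StringEncoding];
}

// Hypothetical helper: decode the bytes with the same fixed encoding.
// Returns nil if the data is not valid UTF-8.
static NSString *CSVStringFromUTF8Data(NSData *data) {
    return [[NSString alloc] initWithData:data
                                 encoding:NSUTF8StringEncoding];
}
```

Since the encoder and decoder agree on UTF-8 up front, the result does not depend on which characters happen to appear in a particular file.
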

aeberbach commented 10 years ago

I'm parsing a CSV file generated with Numbers 3.1, using the default "Unicode (UTF8)" option when exporting to CSV.

NSStringEncoding encoding = [csv fastestEncoding];

when applied to this file, results in NSUnicodeStringEncoding being selected in initWithCSVString. For many files this works, but for some the output starts showing Chinese characters. Using

NSStringEncoding encoding = [csv smallestEncoding];

results in NSMacOSRomanStringEncoding, and everything parses OK. Apple bug?
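Both `fastestEncoding` and `smallestEncoding` are documented as returning an encoding the string can be converted to without loss, so the answer depends on the contents of the string and can differ from file to file. To see what Foundation picks for a given input, a small diagnostic along these lines may help. The function name `DescribeEncodings` is made up for illustration; only standard NSString methods are used.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical diagnostic helper (not a CHCSVParser API).
// Reports which encodings Foundation suggests for this particular string.
static NSString *DescribeEncodings(NSString *text) {
    NSStringEncoding fastest = [text fastestEncoding];
    NSStringEncoding smallest = [text smallestEncoding];

    return [NSString stringWithFormat:@"fastest=%@ (%lu), smallest=%@ (%lu)",
            [NSString localizedNameOfStringEncoding:fastest],
            (unsigned long)fastest,
            [NSString localizedNameOfStringEncoding:smallest],
            (unsigned long)smallest];
}
```

Logging `DescribeEncodings(csv)` for a file that parses correctly and for one that does not shows whether the two strings are being assigned different encodings.
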