Closed dfreniche closed 12 years ago
I will get to this tomorrow or Sunday. The framework and library definitely handle non-ASCII characters; I've tested that before. It could be a bug, or perhaps you or the framework are not setting some option properly.
Excel will return non-ASCII text as UTF-16, the same encoding UIKit and Cocoa use natively in NSString. It was great that you provided a test file, too!
Great! Thanks a lot for your time and knowledge!
BTW I made a second class method to specify the encoding of the file, but no luck:
+ (DHxlsReader *)xlsReaderFromFile:(NSString *)filePath withEncoding:(NSString *)encoding
{
    DHxlsReader *reader = nil;  // stays nil if the workbook cannot be opened
    xlsWorkBook *workBook;
    // NSLog(@"sizeof FORMULA=%zd LABELSST=%zd", sizeof(FORMULA), sizeof(LABELSST) );
    const char *file = [filePath cStringUsingEncoding:NSUTF8StringEncoding];
    // xls_open() takes the charset as a C string, so convert the NSString first.
    if((workBook = xls_open(file, [encoding UTF8String]))) {
        reader = [DHxlsReader new];
        [reader setWorkBook:workBook];
    }
    return reader;
}
The original method is now:
+ (DHxlsReader *)xlsReaderFromFile:(NSString *)filePath
{
    return [DHxlsReader xlsReaderFromFile:filePath withEncoding:@"UTF-8"];
}
Well, this is all very interesting. Supposedly, in BIFF8 (newer Excel), any non-ASCII characters get written in a UTF-16 format. However, when I look at your file, the data is there in "clear" text with non-ASCII characters. This is going to take some more investigation. Can you tell me just how you made this .xls file? That is, with what program (Excel?)
The Excel format doc swears that once the file is BIFF8 (i.e. relatively new), non-ASCII strings are stored as UTF-16, but in your file the non-ASCII chars (for those names) are stored as plain 8-bit strings, as if they were ASCII.
That said, of course Excel itself knows how to read it properly!
Sorry! I forgot to mention how I made the XLS file. It's an xlsx file (created with Excel 2010 on Windows) saved in XLS format using "Save as" in LibreOffice. I've also tried "Save as" from Word 2004 for Mac, with no luck.
Maybe the problem is with the conversion?
Options, then?
Thanks a lot!
OK - I found the problem. It's a really obscure issue with Excel's UTF encoding that the library just didn't understand. I'm working on a fix for it. I'm sure others have complained about this in the past and we just assumed it was "Operator Error".
libxls has been updated to fix the UTF problem. Likewise, the framework was slightly updated to comment out logs etc.
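For readers unfamiliar with the format: as far as the public BIFF8 documentation describes it, each string carries an option byte whose lowest bit says whether the characters are stored as one byte each ("compressed", high byte implicitly zero) or as two-byte UTF-16LE units. The sketch below illustrates that idea only. It is not libxls code; the names `biff8_decode_chars` and `BIFF8_FLAG_WIDE` are hypothetical, and the rich-text and phonetic flag bits are ignored.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit 0 of the BIFF8 string option byte (assumption based on the public
 * format docs): 0 = one byte per character, 1 = two bytes (UTF-16LE). */
#define BIFF8_FLAG_WIDE 0x01

/* Decode `nchars` characters from `data` into UTF-16 code units in `out`,
 * which must have room for `nchars` entries.
 * Returns the number of input bytes consumed. */
size_t biff8_decode_chars(uint8_t flags, const uint8_t *data,
                          size_t nchars, uint16_t *out)
{
    if (flags & BIFF8_FLAG_WIDE) {
        /* Uncompressed: each character is a little-endian 16-bit unit. */
        for (size_t i = 0; i < nchars; i++)
            out[i] = (uint16_t)(data[2 * i] | (data[2 * i + 1] << 8));
        return nchars * 2;
    }
    /* Compressed: each byte is a code point U+0000..U+00FF, not just ASCII. */
    for (size_t i = 0; i < nchars; i++)
        out[i] = data[i];
    return nchars;
}
```

With this reading, the byte 0xF1 in a compressed string is the code point U+00F1 (ñ), which is why a reader that treats every 8-bit string as ASCII goes wrong on accented text.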
It should work perfectly now. If not I'm sure I'll hear from you!
Thanks a lot! Can't express my gratitude for working on that on a weekend!
Will check it tomorrow, close the issue, and give you a lot of accented XLS files to play with, so you have a good test bed.
Thanks again!
Well, thanks for the thanks! I cannot work on this at work, so weekends are as good a time as any! Don't worry about more files - you test it. The issue is that your strings just used UNICODE code points that were <
The code just assumed that any 8-bit string was ASCII. It took me a long time to reach this conclusion. Once I figured that out, it was all smooth sailing from then on. I even did the UTF-8 conversion in code and didn't use the iconv library.
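A hand-written conversion of this kind can be quite small. The following is a minimal sketch, assuming each input byte is a Unicode code point in the range U+0000..U+00FF; it is not the code that went into libxls, and `latin1_to_utf8` is a hypothetical name.

```c
#include <stddef.h>
#include <stdlib.h>

/* Convert `len` bytes, each holding a code point U+0000..U+00FF, into a
 * newly allocated NUL-terminated UTF-8 string.
 * Returns NULL on allocation failure; the caller frees the result. */
char *latin1_to_utf8(const unsigned char *src, size_t len)
{
    /* Worst case: every input byte expands to two output bytes. */
    char *out = malloc(len * 2 + 1);
    if (out == NULL)
        return NULL;

    size_t j = 0;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = src[i];
        if (c < 0x80) {
            out[j++] = (char)c;                    /* ASCII: copied unchanged */
        } else {
            out[j++] = (char)(0xC0 | (c >> 6));    /* lead byte: 110000xx */
            out[j++] = (char)(0x80 | (c & 0x3F));  /* continuation: 10xxxxxx */
        }
    }
    out[j] = '\0';
    return out;
}
```

Because code points below U+0100 never need more than two UTF-8 bytes, no lookup table or iconv call is required for this particular case.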
Everything worked OK! I'm stoked; my import from XLS into Core Data (SQLite) is working like a charm!
Thanks again for the good work
Closing issue
I'm trying to parse an xlsx file instead of the test file in the library, but it's not working. It logged "Not an excel file".
This library supports xls only! xlsx is a completely different format.
BTW: Please make a new issue on github unless your problem is directly related to another issue.
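One way to tell the two formats apart before handing a file to the library is to look at its first bytes: a binary .xls file is an OLE2 compound document, while an .xlsx file is a ZIP archive. The sketch below assumes the caller has already read the start of the file into a buffer; `sniff_container` and the enum names are hypothetical and not part of libxls or DHlibxls.

```c
#include <stddef.h>
#include <string.h>

typedef enum {
    CONTAINER_UNKNOWN,
    CONTAINER_OLE2,   /* legacy binary Office files such as .xls */
    CONTAINER_ZIP     /* ZIP-based files such as .xlsx */
} container_kind;

/* Classify a file by the signature at the start of its contents. */
container_kind sniff_container(const unsigned char *buf, size_t len)
{
    static const unsigned char ole2_magic[8] =
        { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 };
    static const unsigned char zip_magic[4] = { 'P', 'K', 0x03, 0x04 };

    if (len >= sizeof ole2_magic && memcmp(buf, ole2_magic, sizeof ole2_magic) == 0)
        return CONTAINER_OLE2;
    if (len >= sizeof zip_magic && memcmp(buf, zip_magic, sizeof zip_magic) == 0)
        return CONTAINER_ZIP;
    return CONTAINER_UNKNOWN;
}
```

A ZIP signature only shows that the file is not an OLE2 document; it does not prove the archive is a spreadsheet, so this check is best used to produce a clearer error message rather than to identify xlsx files with certainty.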
Hi. I'm struggling with this and trying to read an Excel file with accented chars (ISO-8859-1 chars), for an iOS App.
The chars are like these: á, é, ñ, etc.
I've tried almost everything, changing and tweaking the code. But as far as I understand, internally Excel (and libxls) uses UTF-8, while Cocoa's NSStrings are UTF-16. In the conversion, I always get nil. So when you try to read the contents of a cell and the string has one of these chars, boom! nil
I've been using the latest version of your code, pulling libxls correctly from svn, etc. In the log window I can see: CellType: cellString row=2 col=B/2 string: Espana
But if I change the content of the cell to "España", I get nil
The test file I've been using is here: http://dl.dropbox.com/u/1012348/test.xls
Can you please help me?
And, BTW: nice wrapper!
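A plausible reading of the nil result described above, offered as an assumption rather than a confirmed diagnosis: if the cell text arrives as raw 8-bit bytes (ñ as the single byte 0xF1) and is then interpreted as UTF-8, the bytes are not valid UTF-8, and Foundation conversions such as `stringWithUTF8String:` return nil for invalid input. The sketch below is a small standalone UTF-8 validity check that shows the difference; `is_valid_utf8` is a hypothetical helper, not part of libxls or the wrapper.

```c
#include <stdbool.h>
#include <stddef.h>

/* Return true if the `len` bytes at `s` form well-formed UTF-8. */
bool is_valid_utf8(const unsigned char *s, size_t len)
{
    /* Smallest code point allowed for each sequence length (1..4 bytes);
     * anything lower would be an overlong encoding. */
    static const unsigned long min_cp[4] = { 0x0, 0x80, 0x800, 0x10000 };

    size_t i = 0;
    while (i < len) {
        unsigned char c = s[i];
        size_t extra;       /* number of continuation bytes expected */
        unsigned long cp;   /* code point being assembled */

        if (c < 0x80)                { extra = 0; cp = c; }
        else if ((c & 0xE0) == 0xC0) { extra = 1; cp = c & 0x1F; }
        else if ((c & 0xF0) == 0xE0) { extra = 2; cp = c & 0x0F; }
        else if ((c & 0xF8) == 0xF0) { extra = 3; cp = c & 0x07; }
        else return false;  /* stray continuation byte or invalid lead byte */

        if (extra > len - i - 1)
            return false;   /* sequence runs past the end of the buffer */

        for (size_t k = 1; k <= extra; k++) {
            unsigned char cc = s[i + k];
            if ((cc & 0xC0) != 0x80)
                return false;  /* not a 10xxxxxx continuation byte */
            cp = (cp << 6) | (cc & 0x3F);
        }

        if (cp < min_cp[extra]) return false;              /* overlong form */
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;    /* surrogate */
        if (cp > 0x10FFFF) return false;                   /* out of range */

        i += extra + 1;
    }
    return true;
}
```

Under this check, the bytes of "Espana" pass, the 8-bit form of "España" (with ñ as 0xF1) fails, and the UTF-8 form (with ñ as 0xC3 0xB1) passes.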