felipenoris / XLSX.jl

Excel file reader and writer for the Julia language.
https://felipenoris.github.io/XLSX.jl/stable

Excel data type inlineStr is not supported #123

Closed cmcaine closed 4 years ago

cmcaine commented 4 years ago

I can't access the data of most cells because their cell type is "inlineStr".

julia> sheets[1][1,1]
missing

julia> sheets[1][1,2]
ERROR: Excel data type inlineStr is not supported.
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] getdata(::XLSX.Worksheet, ::XLSX.Cell) at /home/colin/.julia/packages/XLSX/LFpq7/src/cell.jl:103
 [3] getdata at /home/colin/.julia/packages/XLSX/LFpq7/src/worksheet.jl:77 [inlined]
 [4] getdata(::XLSX.Worksheet, ::Int64, ::Int64) at /home/colin/.julia/packages/XLSX/LFpq7/src/worksheet.jl:78
 [5] getindex(::XLSX.Worksheet, ::Int64, ::Int64) at /home/colin/.julia/packages/XLSX/LFpq7/src/worksheet.jl:176
 [6] top-level scope at REPL[14]:1

I can't share the sheet because it's confidential, and re-saving a non-confidential example in the software I have available encodes the strings differently.
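
For context: in the SpreadsheetML format, a cell with t="inlineStr" stores its text inside the cell element itself (in an <is> child), rather than as an index into the workbook's shared-strings table (t="s"). The Python sketch below illustrates the difference on a hand-written worksheet fragment. It is not XLSX.jl's implementation and not the fix from this issue; the sample XML and the helper names cell_text and read_cells are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# SpreadsheetML main namespace used by worksheet parts.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}

# Hand-written sample, NOT taken from the confidential file in this issue.
SHEET_XML = """<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <sheetData>
    <row r="1">
      <c r="A1" t="s"><v>0</v></c>
      <c r="B1" t="inlineStr"><is><t>hello inline</t></is></c>
      <c r="C1" t="inlineStr"><is><r><t>rich </t></r><r><t>text</t></r></is></c>
      <c r="D1"><v>42</v></c>
    </row>
  </sheetData>
</worksheet>"""

# Stand-in for the contents of the shared-strings part (assumed for the sample).
SHARED_STRINGS = ["hello shared"]


def cell_text(cell, shared_strings):
    """Return the value of a <c> element as a string, or None if it is empty."""
    cell_type = cell.get("t")
    if cell_type == "inlineStr":
        is_node = cell.find("m:is", NS)
        if is_node is None:
            return None
        # Plain text lives in <is><t>; rich text is split across <is><r><t> runs.
        parts = is_node.findall("m:t", NS) + is_node.findall("m:r/m:t", NS)
        return "".join(t.text or "" for t in parts)
    value = cell.find("m:v", NS)
    if value is None:
        return None
    if cell_type == "s":
        # Shared string: <v> holds an index into the shared-strings table.
        return shared_strings[int(value.text)]
    # Anything else (numbers here) is returned as the raw <v> text.
    return value.text


def read_cells(sheet_xml, shared_strings):
    """Map each cell reference (e.g. "B1") to its text value."""
    root = ET.fromstring(sheet_xml)
    return {c.get("r"): cell_text(c, shared_strings)
            for c in root.iterfind(".//m:c", NS)}
```

Calling read_cells(SHEET_XML, SHARED_STRINGS) returns a dict keyed by cell reference. A reader that only handles t="s" would have nothing to look up for B1 and C1, which matches the kind of failure reported above.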

felipenoris commented 4 years ago

@cmcaine, I added a few of these assertions because I couldn't find a scenario where MS Excel generates this kind of tag.

How was this Excel file created? Which software was used to create it?

felipenoris commented 4 years ago

Maybe the performance issue is related to inlineStr. Can you give details on the size of the file, and how long it takes to load in MS Excel and with ExcelReaders.jl?

cmcaine commented 4 years ago

The file is generated by EXWA https://www.elcomsoft.co.uk/exwa.html

Really they should just generate a csv, but whatever.

ExcelReaders takes about 1 minute per sheet, which makes it about twice as fast.

LibreOffice loads it in about 24s of wall-clock time (26s of user CPU time):

$ time libreoffice confidential.xlsx 
libreoffice   26.10s user 2.65s system 121% cpu 23.716 total

Don't have access to MS Excel.

felipenoris commented 4 years ago

@cmcaine I see. Is it possible to generate a file with random data with EXWA that triggers this issue?

cmcaine commented 4 years ago

Not easily :(

I'd have to make a database with the same schema as WhatsApp uses, then use EXWA to export that as an xlsx.

I might be able to do this, but no guarantee I'll have the time or ability.

On Thu, 14 Nov 2019, 18:57 Felipe Noronha, notifications@github.com wrote:

@cmcaine I see. Is it possible to generate a file with random data with EXWA that triggers this issue?


cmcaine commented 4 years ago

Here's an example Excel file with inline strings, though it wasn't generated by EXWA. Maybe helpful?

TestExcel.xlsx

Found here: https://www.grapecity.com/forums/silverlight-edition/read-generated-excel-with-

felipenoris commented 4 years ago

@cmcaine, yes, I can work with that. Thanks!

felipenoris commented 4 years ago

Closed by 80df4427e7834786a827ca312df0c340124311b9.