jandrew / Spreadsheet-XLSX-Reader-LibXML

Read spreadsheet files with xlsx extensions

Handle SpreadsheetML (Excel 2003) xml data #4

Open jandrew opened 9 years ago

jandrew commented 9 years ago

This is the pre-2007 Microsoft spreadsheet format for saving sheets as XML. The data is saved in a non-zipped XML file with an .xls or .xml extension. Since Excel won't read the data with an .xls extension, I won't try to differentiate either. In the wild I have mostly seen this data as the export from MySQL when the user selects 'export->excel'. Since .xml isn't an extension unique to Microsoft, this package will assume that some tabular data is available in the XML file and attempt to extract as much of it as possible into rows and columns for reading.
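As a rough illustration of what "extracting rows and columns" from a flat SpreadsheetML file involves, here is a minimal sketch in Python using only the standard library. It is not this package's Perl implementation; the function name `read_spreadsheetml` and the `SAMPLE` document are hypothetical and exist only for the example. It assumes the usual SpreadsheetML 2003 layout (`Workbook` > `Worksheet` > `Table` > `Row` > `Cell` > `Data` in the `urn:schemas-microsoft-com:office:spreadsheet` namespace) and handles the `ss:Index` attribute on cells, which lets a file skip empty columns.

```python
import xml.etree.ElementTree as ET

# Namespace used by SpreadsheetML 2003 elements and ss:* attributes.
SS = "urn:schemas-microsoft-com:office:spreadsheet"
NS = {"ss": SS}

# Hypothetical sample file content, for illustration only.
SAMPLE = """<?xml version="1.0"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
          xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
  <Worksheet ss:Name="Sheet1">
    <Table>
      <Row>
        <Cell><Data ss:Type="String">id</Data></Cell>
        <Cell><Data ss:Type="String">name</Data></Cell>
      </Row>
      <Row>
        <Cell><Data ss:Type="Number">1</Data></Cell>
        <Cell ss:Index="3"><Data ss:Type="String">late</Data></Cell>
      </Row>
    </Table>
  </Worksheet>
</Workbook>"""


def read_spreadsheetml(xml_text):
    """Return {worksheet name: rows}, where each row is a list of cell strings."""
    root = ET.fromstring(xml_text)
    sheets = {}
    for worksheet in root.iter(f"{{{SS}}}Worksheet"):
        name = worksheet.get(f"{{{SS}}}Name")
        rows = []
        for row in worksheet.findall("ss:Table/ss:Row", NS):
            values = []
            for cell in row.findall("ss:Cell", NS):
                # ss:Index is 1-based; pad with None for the skipped columns.
                index = cell.get(f"{{{SS}}}Index")
                if index is not None:
                    while len(values) < int(index) - 1:
                        values.append(None)
                data = cell.find("ss:Data", NS)
                values.append(data.text if data is not None else None)
            rows.append(values)
        sheets[name] = rows
    return sheets


if __name__ == "__main__":
    print(read_spreadsheetml(SAMPLE))
```

The sketch ignores `ss:Index` on rows, merged cells, and the `ss:Type` of each value, all of which a real reader would need to handle.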

The long term goal is for the package to (in this order)

jandrew commented 9 years ago

Requires completion of #3

jandrew commented 9 years ago

It appears that one source of this tabular data targeted at the Excel application is MySQL Workbench. I have a feeling there is also a Python generator out there somewhere that spits out flat XML instead of zipped files. If anyone has examples of this type of file, I would like to use them in testing.

jandrew commented 9 years ago

OK, so the second bullet appears to need a bigger rewrite than I expected, and the reading/learning I am doing in Head First Design Patterns isn't making this a smaller task! In any case, I have some time to work on this over the weekend, but I may not finish, and as a consequence I might push the milestone delivery to early May.

jandrew commented 9 years ago

OK, just an update on this issue. The move to Portland and a raft of issues with this package that had to be resolved to integrate it with Spreadsheet::Read have set this back substantially. This is tied to milestone v0.40.2, which was supposed to follow v0.38.8, but I have already released through v0.38.16 with a set of open issues for v0.38.18. I have rolled an update to the base XML handling into the main branch that should make this go smoother in the long run, but I'm still a ways out.