rsheets / rexcel

Extracts spreadsheet data from Excel workbooks and puts it into linen format

References #10

Open jennybc opened 8 years ago

jennybc commented 8 years ago

Some decent overview docs about xlsx that are substantially less than 6000 pages long 🎉

http://officeopenxml.com/anatomyofOOXML-xlsx.php

https://msdn.microsoft.com/EN-US/library/office/gg278316.aspx

(links found here: http://www.digitalpreservation.gov/formats/fdd/fdd000398.shtml)

richfitz commented 8 years ago

Yup, there's lots of this sort of stuff there, and it's useful. I started with these, but I kept on introducing bugs with new sheets because I wasn't sure if an element would always be present, or what the universe of relevant attributes for a tag was. So while these are a good place to start, eventually the spec is needed :cry:
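
To make the "is this element always present?" problem concrete, here is a minimal Python sketch (not rexcel's actual code) of reading cells defensively. The sample XML and the `read_cells` helper are made up for illustration. It assumes only that a cell's `t` attribute may be omitted (in which case the cell is numeric, `"n"`) and that the `<v>` child may be missing entirely, as in a styled but empty cell.

```python
import xml.etree.ElementTree as ET

# Main SpreadsheetML namespace used by worksheet parts.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}

# Hypothetical sheet fragment: one cell with a type, one without,
# and one with no <v> child at all.
SHEET_XML = """<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <sheetData>
    <row r="1">
      <c r="A1" t="s"><v>0</v></c>
      <c r="B1"><v>3.5</v></c>
      <c r="C1" s="2"/>
    </row>
  </sheetData>
</worksheet>"""


def read_cells(xml_text):
    """Return one dict per <c> element, tolerating missing pieces."""
    root = ET.fromstring(xml_text)
    cells = []
    for c in root.iterfind(".//m:sheetData/m:row/m:c", NS):
        v = c.find("m:v", NS)  # None when the cell has no stored value
        cells.append({
            "ref": c.get("r"),
            "type": c.get("t", "n"),  # fall back to numeric when t is absent
            "value": v.text if v is not None else None,
        })
    return cells


cells = read_cells(SHEET_XML)
for cell in cells:
    print(cell)
```

Values come back as raw strings on purpose: turning `"0"` into a shared-string lookup or `"3.5"` into a number depends on the type, and that is exactly where the spec ends up being needed.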

jennybc commented 8 years ago

Interesting to read, in the sense that even reading the docs reveals something about their underlying workbook object. "XlsxWriter is a Python module for creating Excel XLSX files."

http://xlsxwriter.readthedocs.io/index.html

jennybc commented 8 years ago

Curiously informative. "How do I ...?" from the Spreadsheets section of Microsoft's docs on Open XML SDK:

https://msdn.microsoft.com/en-us/library/cc850837.aspx

A good example for confirming which files do what and which are required:

https://msdn.microsoft.com/en-us/library/bb507946.aspx
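
A quick way to check this sort of thing against real files is to list the parts inside the zip package. The sketch below is only an illustration: `EXPECTED_PARTS` is an assumed list based on what a typical small workbook contains, not an authoritative set from the spec, and `missing_parts` and `make_fake_package` are hypothetical helpers. It builds placeholder packages in memory so it runs without a real workbook.

```python
import io
import zipfile

# Assumption: parts commonly seen in a minimal workbook package.
# This is NOT taken from the spec; adjust after checking the docs above.
EXPECTED_PARTS = ["[Content_Types].xml", "_rels/.rels", "xl/workbook.xml"]


def missing_parts(xlsx_bytes, expected=EXPECTED_PARTS):
    """Return the expected part names that are absent from the package."""
    with zipfile.ZipFile(io.BytesIO(xlsx_bytes)) as zf:
        names = set(zf.namelist())
    return [p for p in expected if p not in names]


def make_fake_package(part_names):
    """Build an in-memory zip with placeholder content for each part."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in part_names:
            zf.writestr(name, "<placeholder/>")
    return buf.getvalue()


complete = make_fake_package(EXPECTED_PARTS + ["xl/worksheets/sheet1.xml"])
incomplete = make_fake_package(["[Content_Types].xml", "xl/workbook.xml"])

print(missing_parts(complete))    # []
print(missing_parts(incomplete))  # ['_rels/.rels']
```

For a real file, pass the result of `open(path, "rb").read()` to `missing_parts` instead of the fake package bytes.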

This seems to be less miserable than reading the spec.