Tux / Spreadsheet-Read

Meta-Wrapper for reading spreadsheet data with perl5

Companion module: Spreadsheet::Read::Ingester #24

Closed sdondley closed 4 years ago

sdondley commented 5 years ago

I wrote a wrapper for Spreadsheet::Read to help speed repeated parsing of the same file.

Details here.

Feedback appreciated. Hopefully it's of some use to others.

Tux commented 5 years ago

If it helps you, use it. I see value in it, though for me it probably won't help: if I need to reparse data often, I convert it to CSV. Nothing is as easy to port to other machines (and other architectures) as CSV, and, using Text::CSV_XS, it is blazingly fast compared to all the other spreadsheet formats.

@sdondley I'd suggest using Sereal instead of Storable, and not only for speed. Or at least offer an option to choose.
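
A minimal sketch of what an "option to choose" between Storable and Sereal could look like for a parse cache. The helper names `save_cache` and `load_cache` and the `$format` argument are hypothetical and are not part of Spreadsheet::Read or Spreadsheet::Read::Ingester; Sereal is a non-core module and is only loaded when requested.

```perl
use strict;
use warnings;
use Storable ();

# Hypothetical helper: write a parsed data structure to a cache file.
# $format is "storable" (default) or "sereal".
sub save_cache {
    my ($file, $data, $format) = @_;
    $format //= "storable";
    if ($format eq "sereal") {
        require Sereal::Encoder;    # non-core, loaded only on demand
        open my $fh, ">:raw", $file or die "$file: $!";
        print {$fh} Sereal::Encoder->new->encode ($data);
        close $fh or die "$file: $!";
    }
    else {
        # nstore writes in network byte order, so the file can be
        # read back on a machine with a different architecture
        Storable::nstore ($data, $file);
    }
    return;
}

# Hypothetical helper: read the structure back from the cache file.
sub load_cache {
    my ($file, $format) = @_;
    $format //= "storable";
    if ($format eq "sereal") {
        require Sereal::Decoder;
        open my $fh, "<:raw", $file or die "$file: $!";
        my $blob = do { local $/; <$fh> };    # slurp the whole file
        close $fh;
        return Sereal::Decoder->new->decode ($blob);
    }
    return Storable::retrieve ($file);
}
```

Calling `save_cache ($file, $book, "sereal")` after the first parse and `load_cache ($file, "sereal")` on later runs would skip the reparse, assuming the cache is invalidated whenever the source spreadsheet changes.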

sdondley commented 5 years ago

OK, good to know. Thanks for the suggestions. I will be dealing with converting from Excel directly so it's probably a lot more useful for that.

sdondley commented 5 years ago

I think what I'll do is add an option to save as a CSV file as well as a pure data structure.

Tux commented 5 years ago

Spreadsheet::Read comes with xlsx2csv, so feel free to steal from it :) As you can imagine, xlsx2csv uses Spreadsheet::Read to read the sheets and Text::CSV_XS to write CSV, so all supported spreadsheets can be turned into CSV.

As I wrote my first reply, I realized supporting Sereal was on my TODO for Tie::Hash::DBD, so I just did it. See https://github.com/Tux/Tie-Hash-DBD/commit/a3b1d525d74a1d78677514bbff11c41e7ace2a75 for how transparently that can be done.

And I 100% agree that re-opening xlsx files with big datasets is a real PITA.
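
The read-with-Spreadsheet::Read, write-with-Text::CSV_XS approach described above can be sketched roughly as follows. This is not the actual xlsx2csv source; the function name `book_to_csv` and the `<input>.<n>.csv` output naming are assumptions made for illustration.

```perl
use strict;
use warnings;
use Spreadsheet::Read qw( ReadData rows );
use Text::CSV_XS;

# Hypothetical helper: write every sheet of a spreadsheet to its own
# CSV file and return the list of files written.
sub book_to_csv {
    my ($in) = @_;

    my $book = ReadData ($in) or die "Cannot read $in\n";
    my $csv  = Text::CSV_XS->new ({
        binary    => 1,
        auto_diag => 1,
        eol       => "\n",
    });

    my @written;
    # $book->[0] holds the metadata, sheets start at index 1
    for my $n (1 .. $book->[0]{sheets}) {
        my $out = "$in.$n.csv";    # assumed naming scheme
        open my $fh, ">:encoding(utf-8)", $out or die "$out: $!";
        # rows () returns one array reference per spreadsheet row
        $csv->print ($fh, $_) for rows ($book->[$n]);
        close $fh or die "$out: $!";
        push @written, $out;
    }
    return @written;
}
```

Because ReadData picks the parser from the input file, the same function should work for any format that Spreadsheet::Read has a parser installed for.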

sdondley commented 5 years ago

Thanks, yeah, I was just in the middle of researching the xls2csv tool as a possible way to implement it. And thanks for the tip on implementing Sereal.

Tux commented 4 years ago

Closing this issue, as there are no open issues for Spreadsheet::Read