tether / roach

A very adaptable web crawler framework. Impossible to kill.

Crawlers should support parsing XLS/XLSX #14

ekryski closed this issue 10 years ago

bredele commented 10 years ago

It's done. I'm using the libraries xls and xlsx from https://github.com/SheetJS. They look stable and pretty legit.

The handlers xls and xlsx transform a readable stream into an xls-array. It's then up to the developer to parse this data.
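For anyone wondering what such a handler might look like, here is a rough sketch. It is not roach's actual code: `collect`, `spreadsheetHandler` and the `parse` argument are hypothetical names made up for illustration. The idea is that spreadsheet files generally need to be read whole, so the stream is buffered first and then passed to whatever parser you plug in.

```javascript
const { Readable } = require('stream');

// Hypothetical helper (not part of roach): read a stream to the end and
// return its contents as a single Buffer.
async function collect(stream) {
  const chunks = [];
  for await (const chunk of stream) {
    // Streams may emit strings or Buffers; normalize to Buffer.
    chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk));
  }
  return Buffer.concat(chunks);
}

// Hypothetical handler factory: `parse` is any function that turns a
// Buffer into an array of rows. The handler only does the buffering;
// interpreting the rows is left to the caller.
function spreadsheetHandler(parse) {
  return async function handle(stream) {
    const buffer = await collect(stream);
    return parse(buffer);
  };
}
```

Assuming the current SheetJS `xlsx` package, `parse` could be something like `buf => { const wb = XLSX.read(buf, { type: 'buffer' }); return XLSX.utils.sheet_to_json(wb.Sheets[wb.SheetNames[0]], { header: 1 }); }`, which returns the first sheet as an array of row arrays.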

ekryski commented 10 years ago

Nice fucking find man! Those are the best xls libs so far

bredele commented 10 years ago

Great :) The crawler now has everything you requested (xml, html, csv, zip, xls/xlsx).