Open andylolz opened 10 years ago
I would very much like to pass in this Excel sheet [1] as the URL and then drop the nonsense headers with a datapipes transform...
@davidmiller issue is we need excel parsing in node and it doesn't seem to exist (maybe for xlsx) ...
Pass off to http://okfnlabs.org/dataconverters/ As-A-Service?
@davidmiller sure but we need that deployed "as a service" :-) (easy to do but needs a small bit of work i imagine).
> we need excel parsing in node and it doesn't seem to exist (maybe for xlsx) ...
@rgrp shameless plug: xlsjs on npm is an XLS parser (the javascript also works in-browser: http://oss.sheetjs.com/js-xls/ )
@SheetJSDev that is awesome :-) We'd love to use this if that was ok :-)
It's Apache 2.0 licensed and the source is on github ( https://github.com/SheetJS/js-xls ) so there really shouldn't be a problem.
@SheetJSDev this is absolutely fantastic. Please say what kind of credit you'd like us to have on the site.
@davidmiller would you be up for having a go at an incoming parser based on this?
Entirely possible.
What's the status of implementing all the transforms etc as fail-early streams - that was the major issue last time I was paying close attention?
@davidmiller fail-early streams is ongoing in #110, but it wouldn't be a blocker for this (I mean, we can't stream an Excel file in the true sense anyway, since IIRC you need to read the whole file before you can use it).
Also - you're less likely to get 12GB excel files - so less of an issue here one suspects.
We can turn the parsed file into a stream for downstream consumption as a reasonable compromise
e.g. this: http://datapipes.okfnlabs.org/none?url=https://github.com/okfn/messytables/raw/master/horror/simple.xls
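For what it's worth, the "read the whole file, then stream the rows" compromise discussed above could look roughly like the sketch below. This is only an illustration, not how datapipes actually does it: `readAll`, `spreadsheetToRowStream` and the `parseWorkbook` argument are hypothetical names, and `parseWorkbook` stands in for whatever Excel parser gets wired in (it is assumed to take a Buffer and return an array of rows).

```javascript
const { Readable } = require('stream');

// Hypothetical helper: buffer an entire input stream into memory.
// Needed because the spreadsheet can't be parsed chunk by chunk.
async function readAll(input) {
  const chunks = [];
  for await (const chunk of input) {
    chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk));
  }
  return Buffer.concat(chunks);
}

// Hypothetical adapter: read everything, parse once, then expose the rows
// as an object-mode stream so downstream transforms can stay stream-based.
// `parseWorkbook` is an assumption: (Buffer) => array of rows.
async function spreadsheetToRowStream(input, parseWorkbook) {
  const buffer = await readAll(input);
  const rows = parseWorkbook(buffer);
  // Readable.from iterates the array and emits one row per chunk.
  return Readable.from(rows);
}
```

The memory cost is the whole file plus the parsed rows, which seems acceptable given that very large Excel files are unlikely; everything after the parse step still sees an ordinary stream.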