SciRuby / daru-io

daru-io is a plugin gem for the existing daru gem, which aims to add support for importing DataFrames from, and exporting DataFrames to, multiple formats.
http://www.rubydoc.info/github/athityakumar/daru-io/master/
MIT License

Post GSoC: Steal like an artist #56

Open zverok opened 7 years ago

zverok commented 7 years ago

Some gems have been gaining popularity recently whose tasks could be solved (probably with more grace!) with daru+daru-io.

Let's look at them and consider what useful ideas we can borrow: sometimes for new features, sometimes for showcases. The list will probably grow!

  1. SpreadsheetArchitect -- ActiveRecord addon to export models to Excel. Could be done by daru-io in its current state. So it is probably a matter of writing a blog post demonstrating our approach to those problems ;) A really good chance to showcase daru-io, because a lot of people have been talking about the gem recently.
  2. Xport -- also an AR-to-Excel exporter. Unlike the gem above, it also allows setting cell styles, which we still can't do (but probably should?)
  3. Saxlsx from the same author -- (claims to be) a really quick Xlsx parser. Could probably be integrated into the Xlsx importer? (Before integration, some measurements should be devised and checked, to understand whether it is worth it → and generally speaking, speed tests for exporters and importers are probably an idea for another GitHub issue)
  4. Trick for fast importing CSV into Postgres (IDK if it is really useful for us, just leaving it here)
  5. Cloudxls -- cloud (?) XLS-creation service. Don't use it, just look at their examples and what they are advertising (about conversion of "messy CSS" to "pretty Excel")

athityakumar commented 7 years ago

Sure, @zverok. Feel free to expand this list. 😄

Regarding (1), (2) and (3), our Excel exporter currently supports only the xls format and could definitely use an upgrade to support the xlsx format. And yes - it's better to add benchmarks (and close #51 😉 ) before deciding which gem dependency to use for new Importers / Exporters.
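As a rough illustration of the kind of benchmark meant here, the sketch below times two candidate parsers on the same input using only Ruby's standard library. Everything in it is an assumption for illustration: `CANDIDATES`, `sample_csv` and `compare_importers` are hypothetical names, and the two lambdas merely stand in for the gems that would actually be compared.

```ruby
require 'benchmark'
require 'csv'

# Hypothetical candidates: each takes CSV text and returns an array of rows.
# In a real comparison these lambdas would wrap the gems under consideration.
CANDIDATES = {
  'stdlib CSV.parse'   => ->(text) { CSV.parse(text) },
  'naive String#split' => ->(text) { text.each_line.map { |line| line.chomp.split(',') } }
}.freeze

# Builds a synthetic CSV payload: one header line plus `rows` data lines.
def sample_csv(rows)
  header = "id,name,score\n"
  body = (1..rows).map { |i| "#{i},item#{i},#{i * 2}\n" }.join
  header + body
end

# Runs every candidate `runs` times on the same input and returns
# { name => best wall-clock time in seconds }.
def compare_importers(candidates, text, runs: 3)
  candidates.transform_values do |importer|
    Array.new(runs) { Benchmark.realtime { importer.call(text) } }.min
  end
end

if __FILE__ == $PROGRAM_NAME
  compare_importers(CANDIDATES, sample_csv(10_000)).each do |name, seconds|
    puts format('%-20s %.4fs', name, seconds)
  end
end
```

Taking the best of several runs is one simple way to reduce noise; a fuller comparison would also want realistic files (quoting, wide rows, large sizes) and memory measurements.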

I had a look at the trick in (4), and its speed of importing CSV into PostgreSQL seems to be VERY impressive. It'd be awesome if we could have a generic CSV importer that's this fast. (related to #31)

zverok commented 7 years ago

More:

  1. simple_report is a gem for building simple reports from Enumerable collections.
  2. How to reformat CSV files with Kiba ETL?

We definitely should not mimic their API, but it would be interesting to investigate: