18F / autoapi

A basic spreadsheet to api engine

Confine imports to CSV files only for now #84

Open vrajmohan opened 8 years ago

vrajmohan commented 8 years ago

Copying Slack conversation:

@vrajmohan : In autoapi, is there a real use case for parsing Excel and JSON files, or is it YAGNI?

@toolness : i'm curious about whether there's an excel use case. i find unicode in csv very confusing and i am guessing there's lots of room for error--if folks are converting their excel files to csv and then uploading, it might help reduce encoding errors if we just took in the xls/xlsx. but i have no idea if they're working with excel natively or not. (i suppose another alternative might be directly pulling from a google spreadsheet, too, if they use that...)

@gbinal : excel is a nice to have but I'd be fine with deprecating if that brought us something over just not documenting. the use case i can think of for JSON would be trying to make an API out of something more complex than can be represented in a spreadsheet

I have a few points to make:

  1. The current code fails on a very basic .xlsx file.
  2. Exporting tabular spreadsheet files to CSV is fairly trivial and can be done by the spreadsheet applications themselves. Taking on the parsing of these formats and doing it better than the spreadsheet applications themselves is ambitious.
  3. In the case of JSON, if the structure is more complicated than what can be represented in a spreadsheet, we have to specify and build code to transform these into (multiple) DB tables.
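To illustrate point 3, here is a minimal sketch of what "transform into (multiple) DB tables" could involve even for one level of nesting. The function name `split_into_tables`, the generated `id` column, and the `<parent>_<key>` table naming are all hypothetical choices made for this example, not anything autoapi does today.

```python
def split_into_tables(records, parent="records"):
    """Split a list of JSON objects into one parent table plus one
    child table per list-valued key (hypothetical naming scheme)."""
    tables = {parent: []}
    for idx, record in enumerate(records, start=1):
        row = {"id": idx}  # synthetic primary key, an assumption of this sketch
        for key, value in record.items():
            if isinstance(value, list):
                # Each list becomes rows in a child table that point back
                # to the parent row through a foreign-key column.
                child = tables.setdefault(f"{parent}_{key}", [])
                for item in value:
                    child_row = {f"{parent}_id": idx}
                    if isinstance(item, dict):
                        child_row.update(item)
                    else:
                        child_row["value"] = item
                    child.append(child_row)
            else:
                row[key] = value
        tables[parent].append(row)
    return tables
```

Even this small version leaves open questions (deeper nesting, nested objects that are not lists, key collisions, column types), which is the specification work referred to above.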

I suggest removing this feature and tackling it, if at all, at a later date.

toolness commented 8 years ago

Oh that's funny, I had no idea that autoapi already attempted to handle excel files!

The one point I might have quibbles with is (2), as I'm befuddled by the character encoding Excel uses when exporting to CSV. As long as non-ASCII characters don't get mangled when importing CSV, that's great, but this hasn't always been the case for me... Granted, I do agree that writing our own XLS importer to get around the encoding issue probably does open up a whole new can of worms!
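For what it's worth, a rough sketch of one way a CSV importer could guard against the mangling described here: try UTF-8 first (accepting a byte-order mark), and only fall back to a legacy code page if that fails. The helper name `read_csv_bytes` and the `cp1252` default are assumptions for illustration; the right fallback depends on the locale of the machine that exported the file.

```python
import csv
import io


def read_csv_bytes(raw, fallback_encoding="cp1252"):
    """Parse CSV bytes, preferring UTF-8 and falling back to a legacy
    code page (hypothetical helper, not part of autoapi)."""
    try:
        # "utf-8-sig" decodes plain UTF-8 and also strips a leading BOM.
        text = raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 came from a
        # Windows-style export in the fallback code page.
        text = raw.decode(fallback_encoding)
    return list(csv.reader(io.StringIO(text, newline="")))
```

This is a heuristic, not real detection: a file in some other legacy encoding would still decode without error under `cp1252` and come out wrong.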

gbinal commented 8 years ago

Before we affirm this, I want to try some JSON files first.