julianharty / app-store-reviews-app

Android App to analyse reviews and scripts to load historical reviews data into a SQLite database.
Apache License 2.0

Parsing CSV problematic when contents include commas #15

Closed julianharty closed 7 years ago

julianharty commented 7 years ago

The current parsing fails to distinguish between commas that separate values and commas inside a value, e.g. in a quoted string such as "this, not that". Here's the current code:

String[] nextRow() throws IOException {
    String line = reader.readLine();
    return null == line ? null : line.split(",");
}

I need to implement a more discerning parser. Time to do so about now :)
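As a rough idea of what a more discerning parser could look like, here is a minimal sketch of a quote-aware line splitter. The class and method names (`CsvLineSplitter`, `split`) are hypothetical and not part of the app's code. It assumes one record per line, so a quoted field that contains a line break is not handled, and it is lenient: an unterminated quote is silently accepted rather than reported.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not from the repository): splits one CSV line, honouring double quotes. */
final class CsvLineSplitter {

    private CsvLineSplitter() {}

    static List<String> split(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // a doubled quote inside quotes is a literal quote
                        i++;                 // skip the second quote of the pair
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);       // commas are ordinary characters inside quotes
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote
            } else if (c == ',') {
                fields.add(current.toString()); // a comma outside quotes ends the field
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());      // last field (may be empty)
        return fields;
    }
}
```

With this sketch, `CsvLineSplitter.split("a,\"this, not that\",c")` yields three fields, the second being `this, not that`. Unlike `line.split(",")`, it also keeps trailing empty fields.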

BTW: it might also be worth adding some robustness checks on the contents being read, to detect and cope with malformed CSV files. And finally, for this bug/enhancement: make the program robust so it doesn't crash when problems happen (currently the app quits, which is not what I or the user want).
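One possible shape for such a check, sketched under assumptions: scan the file first and collect problems as messages instead of throwing, so the caller can decide what to show the user. The names `CsvSanityCheck`, `hasBalancedQuotes` and `findProblems` are hypothetical. The only checks here are blank lines and an odd number of double quotes on a line; an odd count may mean a malformed record, or a legitimate quoted field that continues on the next line, which this sketch cannot tell apart.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical pre-check (not from the repository): reports suspicious lines without throwing. */
final class CsvSanityCheck {

    private CsvSanityCheck() {}

    /** True when the line contains an even number of double quotes. */
    static boolean hasBalancedQuotes(String line) {
        int quotes = 0;
        for (int i = 0; i < line.length(); i++) {
            if (line.charAt(i) == '"') {
                quotes++;
            }
        }
        return quotes % 2 == 0;
    }

    /** Reads every line and returns human-readable problems; an empty list means none were found. */
    static List<String> findProblems(BufferedReader reader) {
        List<String> problems = new ArrayList<>();
        int lineNumber = 0;
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                lineNumber++;
                if (line.trim().isEmpty()) {
                    problems.add("line " + lineNumber + ": blank line");
                } else if (!hasBalancedQuotes(line)) {
                    problems.add("line " + lineNumber + ": odd number of double quotes");
                }
            }
        } catch (IOException e) {
            // Record the failure instead of letting it propagate and end the app.
            problems.add("line " + (lineNumber + 1) + ": read failed: " + e.getMessage());
        }
        return problems;
    }
}
```

The caller could then skip or flag the reported lines and carry on loading the rest of the file.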

dakkad commented 7 years ago

There are some good CSV parsers around. Let's review a few.

On 13 Feb 2017 19:23, "Julian Harty" notifications@github.com wrote:

Closed #15 https://github.com/julianharty/app-store-reviews-app/issues/15.


julianharty commented 7 years ago

I agree. OpenCSV seems to be one option; however, I didn't want the faff of adding a large blob of code for what may be a fairly simple challenge, and the current code was just enough to let me load the files containing reviews.

How about we create a mini-test to try out some CSV parsers and compare their robustness, capabilities and footprint? Then we can replace my basic code if and when we find one worth the footprint and effort :)
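A mini-test along these lines might start as small as the sketch below. Everything in it is an assumption for illustration: the class name `CsvParserComparison`, the choice of cases, and the idea of treating each candidate as a function from one line to a list of fields. It only scores correctness on a few single-line inputs and says nothing about footprint. The naive candidate mirrors the `line.split(",")` approach from the issue description; a library-backed candidate would be wrapped in the same function shape.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical comparison harness (not from the repository). */
final class CsvParserComparison {

    private CsvParserComparison() {}

    /** Input line mapped to the fields a correct parser should return. */
    static final Map<String, List<String>> CASES = new LinkedHashMap<>();
    static {
        CASES.put("a,b,c", Arrays.asList("a", "b", "c"));
        CASES.put("a,\"this, not that\",c", Arrays.asList("a", "this, not that", "c"));
        CASES.put("a,\"say \"\"hi\"\"\",c", Arrays.asList("a", "say \"hi\"", "c"));
        CASES.put("a,,c", Arrays.asList("a", "", "c"));
    }

    /** Counts how many cases the candidate gets right; an exception counts as a failure. */
    static int countPasses(Function<String, List<String>> parser) {
        int passes = 0;
        for (Map.Entry<String, List<String>> testCase : CASES.entrySet()) {
            try {
                if (testCase.getValue().equals(parser.apply(testCase.getKey()))) {
                    passes++;
                }
            } catch (RuntimeException e) {
                // A crashing parser fails the case; keep going with the remaining cases.
            }
        }
        return passes;
    }

    /** Baseline candidate: the plain comma split described in the issue. */
    static List<String> naiveSplit(String line) {
        return Arrays.asList(line.split(","));
    }
}
```

Calling `CsvParserComparison.countPasses(CsvParserComparison::naiveSplit)` gives a baseline score, and each parser under review can be passed in the same way, for example as a lambda that wraps the library call.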

On 15 February 2017 at 21:01, Damien notifications@github.com wrote:

There are some good CSV parsers around. Let's review a few.


julianharty commented 7 years ago

Here's a useful, comprehensive example with tests. https://www.mkyong.com/java/how-to-read-and-parse-csv-file-in-java/

julianharty commented 7 years ago

And the official RFC https://tools.ietf.org/html/rfc4180

dakkad commented 7 years ago

https://google.github.io/guava/releases/snapshot/api/docs/com/google/common/base/Splitter.html

https://commons.apache.org/proper/commons-csv/