shaovoon / minicsv

C++ Minimalistic CSV Streams
MIT License
83 stars 26 forks

it is not possible to read multiline fields #3

Open bmonkey opened 7 years ago

bmonkey commented 7 years ago

Hi, it looks like you do not support multiline fields, even when the field is enclosed in quotes.

e.g.: 20170101,B,"28229 CR33 207C","Burg","FL","12345","USA",22

It happens because you use std::getline to read from the file, and it stops at the first line break. Given that, it looks like the following check in "get_delimited_str" will never be true:

    if (ch == '\r' || ch == '\n')
        break;

Do you plan to support multiline fields? If it works for you, I can submit a pull request with my fix or send the details by email.

For reference, from RFC 4180:

  1. Fields containing line breaks (CRLF), double quotes, and commas should be enclosed in double-quotes. For example:

    "aaa","b CRLF
    bb","ccc" CRLF
    zzz,yyy,xxx
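One way a reader built on std::getline could honor this rule is to keep appending physical lines while a double quote is still open. The sketch below is only an illustration under that assumption: `read_record` is a hypothetical helper, not part of minicsv, and it only reassembles the record without splitting it into fields.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (not part of minicsv): reads one logical CSV record,
// which may span several physical lines when a quoted field contains a
// line break. Returns false when there is no more input.
bool read_record(std::istream& in, std::string& record, char quote = '"')
{
    assert(quote != '\n' && quote != '\r');
    record.clear();
    std::string line;
    bool in_quotes = false; // true while a quoted field is still open
    bool got_any = false;   // true once at least one physical line was read
    while (std::getline(in, line))
    {
        // Drop the '\r' left behind by CRLF line endings.
        if (!line.empty() && line.back() == '\r')
            line.pop_back();
        // Restore the break that std::getline consumed (normalized to '\n').
        if (got_any)
            record += '\n';
        got_any = true;
        // Every quote toggles the state; an escaped quote ("") toggles twice,
        // so it leaves the state unchanged.
        for (char ch : line)
            if (ch == quote)
                in_quotes = !in_quotes;
        record += line;
        if (!in_quotes)
            break; // quotes are balanced, so the record is complete
    }
    // If the input ends inside an open quote, the partial record is returned.
    return got_any;
}
```

Called in a loop, this would return the RFC example above as two records, the first of which still contains the embedded line break.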

Thanks

Mathanraj-Sharma commented 11 months ago

@shaovoon is there any update on this? I have a row in my CSV that has a multiline field. Could you please tell me which settings I should use to read it properly?

Example:

115,1.0,1,"6f12500a98f6636f784f98d20af81b17","2016-04-29 15:08:45","Beautiful Large Renovated One Bedroom Apt on a Tree Lined Block, Williamsburg
Apt Includes:

    Large Living Room
    Large Bedroom W/ Closet & Windows
    Large Eat-In Kitchen
    Full Bathroom
    Hardwood Floors
    Air Conditioner

Located on Powers St Between Bushwick Ave & Olive St
Close to the Graham Ave & Grand Ave L train Stations.

Call/Text/Email Brandon to schedule a showing today.

","263 Powers St","","medium",40.7133,6943997,-73.9395,"44f88993b340ce5a2518a7103aaa623c",2199,"263 Powers St",1478715624,"2016-11-09 18:20:24"

shaovoon commented 11 months ago

This is how minicsv solves it. You must use minicsv to write and read.

When minicsv encounters a multiline field, it will flatten it into a single line by escaping the newlines. When reading the same field, it will unescape the newline tokens and unflatten the field back to multiline.
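As a rough illustration of that round trip, and not minicsv's actual code, the idea can be sketched as below. The token `<NL>` and the names `replace_all`, `flatten_field`, and `unflatten_field` are placeholders chosen for this example; the token minicsv really uses is not stated here.

```cpp
#include <cassert>
#include <string>

// Replaces every occurrence of `from` in `text` with `to`.
std::string replace_all(std::string text, const std::string& from, const std::string& to)
{
    assert(!from.empty()); // an empty search string would never advance
    std::string::size_type pos = 0;
    while ((pos = text.find(from, pos)) != std::string::npos)
    {
        text.replace(pos, from.size(), to);
        pos += to.size(); // continue after the inserted text
    }
    return text;
}

// Assumed placeholder token; NOT confirmed to be what minicsv writes.
const std::string kNewlineToken = "<NL>";

// Writing: collapse a multiline field into a single physical line.
std::string flatten_field(const std::string& field)
{
    return replace_all(field, "\n", kNewlineToken);
}

// Reading: turn the tokens back into real line breaks.
std::string unflatten_field(const std::string& field)
{
    return replace_all(field, kNewlineToken, "\n");
}
```

With this scheme every record stays on one physical line, so a line-based reader keeps working. The trade-off is that the file must be written with the same escaping, and a field that literally contains the token would be altered when read back.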

Since your CSV file already contains raw newlines, minicsv cannot do anything about it.