agentejo / cockpit

Add content management functionality to any site - plug & play / headless / api-first CMS
http://getcockpit.com
MIT License

csv import fails if csv file has a new line at end of file #955

Closed. raffaelj closed 4 years ago

raffaelj commented 5 years ago

When saving a spreadsheet as a CSV file with OpenOffice Calc, it automatically adds a line break at the end of the file. Importing the file into Cockpit starts fine, but the rows are split into chunks of 20, and the last chunk ends with a row that is a single empty string. That row breaks the import of the whole chunk, and the status stops at e.g. 87%.

After deleting the last empty line, everything worked.
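To illustrate the effect, here is a minimal Python sketch. It is not Cockpit's or PapaParse's code. It assumes a naive line split, the chunk size of 20 mentioned above, 22 placeholder data rows (the same count as the sample file in this report, header excluded), and progress counted as imported rows over total rows, which is only one possible explanation for the 87%.

```python
CHUNK_SIZE = 20  # chunk size mentioned in the report


def split_rows(csv_text):
    """Naive line split (an assumption, not PapaParse): a trailing
    line break produces one extra, empty row at the end."""
    return csv_text.split("\n")


def chunks(rows, size=CHUNK_SIZE):
    """Split the rows into consecutive chunks of at most `size` rows."""
    return [rows[i:i + size] for i in range(0, len(rows), size)]


# 22 placeholder data rows followed by the trailing line break.
csv_text = "\n".join(f'"row{i}",{i}' for i in range(1, 23)) + "\n"

rows = split_rows(csv_text)
parts = chunks(rows)

# If only the first chunk is imported, progress is 20 of 23 rows.
imported = len(parts[0])
progress = round(100 * imported / len(rows))

print(len(rows))                # 23
print([len(p) for p in parts])  # [20, 3]
print(repr(parts[-1][-1]))      # ''
print(progress)                 # 87
```

Under these assumptions, the empty row lands in the second chunk, and a failure of that chunk leaves the import at 20 of 23 rows.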

Steps to reproduce:

Create a collection with two fields, "text" and "number", and import the following CSV file:

"text","number"
"qwe",1
"qwe",2
"asd",3
"sxcf",4
"x",5
"cv",6
"xcvxcv",7
"hjh",8
"uzlkzukl",9
"h,",10
"gvvgc",11
"<",12
,13
"d",14
"<xyf<d ",15
"sd",16
"fdsy fdf glöiuerpw3894ß0q9 845",17
"aölk fjp49ur oöitgj dfg",18
"0934u05 9ruüaödf j",19
"ß0908p45orhsöl kkj",20
"Q)(„§!/=%§/",21
">doj<dfj>>öash ö ",22

raffaelj commented 4 years ago

I can't reproduce the breaking behavior anymore. The data is imported, and I get an empty data set at the end if I don't delete the last line break before importing the CSV.

After doing some tests and reading through the PapaParse source and issues, I discovered the option skipEmptyLines. Setting it to true skips the last empty row, but it would also skip empty rows in the middle of the data set. Setting it to "greedy" would also skip rows that contain only empty strings (""). So this setting is not a solution, because empty data sets might be the expected behavior.
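The difference between these modes can be sketched in Python on already parsed rows. This models the behavior as described in this comment and does not call PapaParse. The function names are hypothetical, and `drop_trailing_empty` is an assumed alternative that no one in this thread has implemented.

```python
def skip_empty(rows):
    """Drop every row that is exactly one empty field, wherever it occurs."""
    return [r for r in rows if r != [""]]


def skip_greedy(rows):
    """Also drop rows whose fields are all blank after trimming whitespace."""
    return [r for r in rows if "".join(r).strip() != ""]


def drop_trailing_empty(rows):
    """Hypothetical alternative: drop empty rows only at the very end."""
    end = len(rows)
    while end > 0 and rows[end - 1] == [""]:
        end -= 1
    return rows[:end]


rows = [["qwe", "1"], [""], ["", ""], ["asd", "3"], [""]]

print(skip_empty(rows))           # loses the empty row in the middle
print(skip_greedy(rows))          # loses the row of empty strings as well
print(drop_trailing_empty(rows))  # keeps both, drops only the last row
```

Only the last variant keeps intentionally empty entries in the middle of the data while removing the row caused by the trailing line break.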

There is no real standard about the trailing line break.

https://tools.ietf.org/html/rfc4180: "The last record in the file may or may not have an ending line break."

So it might be useful in a future release to pass user-defined options to the parser via UI settings. In the meantime, I use JSON files, or I remove the last line break before importing.
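The manual workaround of removing the last line break could be scripted. The following is a minimal Python sketch and not part of Cockpit. The helper name `strip_last_line_break` is hypothetical, and it removes at most one trailing line break so that intentionally empty rows before it stay in place.

```python
def strip_last_line_break(text):
    """Remove a single trailing line break (CRLF, LF or CR), if present."""
    # Check CRLF first so that a Windows line ending is removed as a whole.
    for ending in ("\r\n", "\n", "\r"):
        if text.endswith(ending):
            return text[:-len(ending)]
    return text


print(repr(strip_last_line_break('"qwe",1\n"asd",3\n')))  # '"qwe",1\n"asd",3'
print(repr(strip_last_line_break('"qwe",1\r\n')))         # '"qwe",1'
print(repr(strip_last_line_break('"qwe",1')))             # '"qwe",1'
```

When applying this to a file, opening it with `open(path, newline="")` for both reading and writing keeps the remaining line endings unchanged.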

Since I can't reproduce the failed import, I'm closing this issue.