mageddo / javascript-csv

Automatically exported from code.google.com/p/jquery-csv
MIT License

Add trim as a built-in callback #22

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
It would be very convenient to trim leading and trailing whitespace from each 
value as it is parsed.

Original issue reported on code.google.com by eric.the...@gmail.com on 20 Jan 2013 at 4:23

GoogleCodeExporter commented 8 years ago
That can be handled.

It won't be the default approach but I can add it in as an optional feature.

Just to make sure we're talking about the same thing, you mean:

    "this", "is", an, example

Should be interpreted as:

    "this","is",an,example

Technically, the whitespace is illegal but removing it shouldn't be too much of 
a problem.

Original comment by evanpla...@gmail.com on 25 Jan 2013 at 4:24

GoogleCodeExporter commented 8 years ago
Any updates on this?

I was planning to trim the values myself by running row.join("\r").replace(/[ ]*\r[ ]*/g,"\r").trim().split("\r") on each row (other whitespace characters could be added to the regex).

Note that it worked only because I was sure my CSV didn't contain any "\r" characters.

But if trimming is coming soon, I won't have to do this.

Original comment by sander...@gmail.com on 6 May 2013 at 8:20
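
For reference, the same per-row trimming can be sketched without a sentinel character, so it does not depend on the data being free of "\r". This is only an illustration: `trimRows` is a hypothetical helper, not part of jquery-csv, and it assumes the rows have already been parsed into arrays of values.

```javascript
// Hypothetical helper (not part of jquery-csv): trims every string value
// in an already-parsed array of rows and leaves non-string values untouched.
function trimRows(rows) {
  return rows.map((row) =>
    row.map((value) => (typeof value === 'string' ? value.trim() : value))
  );
}

// Example with hand-written rows standing in for parser output.
const rows = [[' this ', 'is '], [' an', 'example']];
console.log(trimRows(rows));
// → [ [ 'this', 'is' ], [ 'an', 'example' ] ]
```

Because this runs after parsing, it also trims whitespace that was inside quoted values, which may or may not be what you want.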

GoogleCodeExporter commented 8 years ago

Original comment by evanpla...@gmail.com on 9 Dec 2013 at 11:23

GoogleCodeExporter commented 8 years ago
I'm facing the same issue with whitespace. Is there an update on this?

Thanks !

Original comment by soteras....@gmail.com on 25 Feb 2014 at 2:30

GoogleCodeExporter commented 8 years ago
FWIW, I packaged this module for node and added a skipwhitespace option. It seems to work, but I only tested it briefly. See
https://github.com/olalonde/csv2array/commit/4f65b223928df7637f2df8148c764499b11ee5d9#diff-168726dbe96b3ce427e7fedce31bb0bcR174

Original comment by olalo...@gmail.com on 20 Mar 2014 at 7:53

GoogleCodeExporter commented 8 years ago
To put it simply, GIGO (Garbage-In Garbage-Out).

Skipping whitespace by default will never be added into the core because of 
Rule #4 of the CSV spec.
"Rule #4 - Spaces are considered data and entries should not contain a trailing 
comma"

According to the spec, whitespace -- even whitespace not enclosed in quotes -- 
is considered valid data.

By trimming whitespace, the parser may be unintentionally destroying valid data 
for some users.

If you need to add an intermediate step to clean your input data then you'll 
need to do it via the preprocessor callback.

Original comment by evanpla...@gmail.com on 29 May 2014 at 3:37

GoogleCodeExporter commented 8 years ago
I realize that's probably not the answer you want to hear since it's common 
practice to use extra spaces as padding in hand-written CSV. 

Fortunately, trimming whitespace is easier than it sounds.

Just write a very simple preprocessor function that counts double-quotes. If a 
space is found outside a pair of double-quotes, don't push it to the output.

Then, just attach it to the onPreParse callback so it runs before the parser 
starts.

Original comment by evanpla...@gmail.com on 29 May 2014 at 3:43
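
The quote-counting preprocessor described above might look roughly like the following. This is a minimal sketch, not code from the library: `stripUnquotedSpaces` is a hypothetical name, it treats tabs as padding too (an assumption), and how it gets attached to onPreParse depends on the library version, so the wiring is left out.

```javascript
// Hypothetical preprocessor (not part of jquery-csv): drops spaces and tabs
// that appear outside double-quoted fields.
function stripUnquotedSpaces(csv) {
  let inQuotes = false;
  let out = '';
  for (const ch of csv) {
    if (ch === '"') {
      // An escaped quote ("") toggles the flag twice, so the state is unchanged.
      inQuotes = !inQuotes;
    }
    if (!inQuotes && (ch === ' ' || ch === '\t')) {
      continue; // padding outside quotes: don't push it to the output
    }
    out += ch;
  }
  return out;
}

console.log(stripUnquotedSpaces('"this", "is", an, example'));
// → "this","is",an,example
```

Note that this follows the description literally, so spaces inside an unquoted value are removed as well (New York becomes NewYork). Values that need their internal spaces should be quoted.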