beenex / jquery-csv

Automatically exported from code.google.com/p/jquery-csv
MIT License
0 stars 0 forks source link

Firefox aint working properly #5

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. $.get('pres_data.csv'), function(data) { // file: http://d.pr/f/nQI7
2.var presData = $.csv2Array(data);

What is the expected output? What do you see instead?
an array with alot of numbers

What version of the product are you using? On what operating system?
0.6

Please provide any additional information below.
works in Chorme and Safari but i Firefox the array is all emtpy :(

console.log(data) seem fine i FF but console.log(presData) is all empty. agin 
only in FF

Original issue reported on code.google.com by thomasca...@gmail.com on 2 Jul 2012 at 12:20

GoogleCodeExporter commented 9 years ago
@thomascarlsen86

Thank you for filing the bug. From what I know, the bug is caused by Firefox 
not treating the regex global flag the same as the rest of the browsers do.

More info can be found on the issue in the following link:
http://stackoverflow.com/q/4178970/290340

As soon as I come up with a fix I'll incorporate it into the next release.

Original comment by evanpla...@gmail.com on 2 Jul 2012 at 12:49

GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
thanks for the quick reply, and sad to hear that this is't a quick fix :(

Original comment by thomasca...@gmail.com on 2 Jul 2012 at 12:55

GoogleCodeExporter commented 9 years ago
Man, I totally made this code work last night perfectly and beautifully in 
Chrome, and then this morning opened it up in Firefox and ran into exactly this 
issue. Noooooo!

http://blog.thatscaptaintoyou.com/strange-behavior-of-the-global-regex-flag/ 
might have help you. It does say "The solution to this dilemma is painfully 
simple, if the regex is global just set the “lastIndex” property to zero 
before you test.". I would do it myself, but I haven't dug into your code deep 
enough to know where to fiddle.

Original comment by kou...@gmail.com on 4 Jul 2012 at 8:30

GoogleCodeExporter commented 9 years ago
The problem is showing up for me with this CSV file. It seems that adding 
quotes (as in the first two entries there) makes firefox behave. Strange that 
the other browsers work fine even on the non-quoted ones, though meta.delimiter 
is defaulting to """.

Original comment by kou...@gmail.com on 4 Jul 2012 at 8:42

Attachments:

GoogleCodeExporter commented 9 years ago
@kousue The code is actually very simple, csv2Array and csv2Dictionary handle 
the split by line then each line is passed to csvEntry2Array which handle the 
more complex CSV parsing. csvEntry2Array is where you'll find the regexes that 
need to be modified.

If you want to submit a patch, I'll get it incorporated ASAP. Otherwise, life 
has me busy and it'll be a couple of weeks before I can get a fix working, 
tested, and released.

If you decide to clone the source you'll also find a working test runner with 
directions. I'll drop a minor release (0.61) now that has the latest.

As for the delimiter defaulting to a double quote, that's should be correct 
according to the IETF RFC for CSV files. If you use something else (ex 
backslash) it can be changed in the function call parameters.

I'm not sure why quoting all values works in Firefox, maybe that'll work as a 
temporary solution until the bug is resolved.

Original comment by evanpla...@gmail.com on 4 Jul 2012 at 10:06

GoogleCodeExporter commented 9 years ago
BTW, the parameter to change the escape char is called 'escaper' and will be 
available in release 0.61.

Original comment by evanpla...@gmail.com on 4 Jul 2012 at 10:58

GoogleCodeExporter commented 9 years ago

Original comment by evanpla...@gmail.com on 4 Jul 2012 at 10:58

GoogleCodeExporter commented 9 years ago
I spent some time messing with the code. Everywhere where a global flag could 
be a problem has been resolved. I couldn't make it more global flag proof if I 
tried.

After staring cross-eyed at the test results for a while, it appears that only 
the fields that aren't delimited don't get matched (probably could have saved 
some time by reading the comments more closely). The even stranger part is, the 
data IS passing the validation step (else it would return null for non-matches) 
it's just the str.replace matching that is broken.

I'm going to try implementing the matching using the more robust RegExp.exec 
method next. Until this issue is fixed, you'll need to have your CSV data 
delimited for the parser to work in Firefox.

If you want to help, clone the source and take a look at the code or run the 
test runner in some other browsers (ie, Safari, IE, Opera, etc) and post your 
results.

Original comment by evanpla...@gmail.com on 5 Jul 2012 at 1:28

GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
I have done this for a quick hack: 
$.csvEntry2Array = function(csv, meta) {
  // csv hack for FF
  csv = '"'+csv.replace(/,/g,'","')+'"';
.......

Original comment by thomasca...@gmail.com on 9 Jul 2012 at 12:15

GoogleCodeExporter commented 9 years ago
Any chance this will work with mixed delimited/non-delimited values in FF?  
Unfortunately I don't have the luxury of changing the source CSV I'm working 
with (it gets automatically updated).

Original comment by ee...@eecue.com on 20 Jul 2012 at 6:26

GoogleCodeExporter commented 9 years ago
My quick hack will work with a mixed csv, but I still don't think its a pretty 
solution, but it will work out for you ;)

Original comment by thomasca...@gmail.com on 20 Jul 2012 at 6:29

GoogleCodeExporter commented 9 years ago
Actually nope, it doesn't work because the delimited fields have commas inside 
of them:  

id, "delimited, field, here", other field, another field, "one more delimited 
field", etc

Original comment by ee...@eecue.com on 20 Jul 2012 at 6:34

GoogleCodeExporter commented 9 years ago
okay, this ain't prettier but you can do this:

$.csvEntry2Array = function(csv, meta) {
  // csv hack for FF
  csv = "'"+csv.replace(/"/g,"'").replace(/,/g,"','")+"'";

and then set your "delimiter" to delimiter:"'", at the top og this file

Original comment by thomasca...@gmail.com on 20 Jul 2012 at 6:41

GoogleCodeExporter commented 9 years ago
That doesn't work, probably related to the random ' characters in the CSV.

Original comment by ee...@eecue.com on 20 Jul 2012 at 6:46

GoogleCodeExporter commented 9 years ago
I have to say that you have a realy f**ked up csv file m8. :)

Original comment by thomasca...@gmail.com on 20 Jul 2012 at 6:49

GoogleCodeExporter commented 9 years ago
Any progress on this? I tried the above suggestions, but I don't think I did it 
right. They didn't help. I don't have, right now, a funky csv file. 

Original comment by Chad.R.B...@gmail.com on 13 Aug 2012 at 9:50

GoogleCodeExporter commented 9 years ago
@Chad.R.Bernier None yet, my personal life has me occupied. I'll try to take 
another shot at it over the next few days.

Original comment by evanpla...@gmail.com on 16 Aug 2012 at 3:45

GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
In the callback function for String.replace, it appears that whenever a match 
is not found for a parenthesized submatch string, Firefox (and IE <= 8) 
represents the result as an empty string, whereas Chrome (and IE 9+) uses 
undefined. Run this quick test in your browser's console to confirm this 
behavior (the regular expression rex uses the relevant part of reValue, 
assuming a separator and delimiter of "):

    var rex = /"([^""]*(?:""[^""]*)*)"|([^,""\s]*(?:\s+[^,""\s]+)*)/,
        test1 = '"Hello!"',
        test2 = 'Hello!';

    function replaceCallback(m0, m1, m2) {
        console.log('matched substring (m0): ' + m0 + '\n'
                +   'submatch 1 (m1): \n'
                +       '\t type: ' + (typeof m1) + '\n'
                +       '\t as string: ' + m1 + '\n'
                +   'submatch 2 (m2): \n'
                +       '\t type: ' + (typeof m2) + '\n'
                +       '\t as string: ' + m2 + '\n');

        return '';
    }

    test1.replace(rex, replaceCallback); // m2 is empty string in FF, undefined in Chrome
    test2.replace(rex, replaceCallback); // m1 is empty string in FF, undefined in Chrome

In jQuery-CSV, this difference in behavior causes undesirable results in 
Firefox, due to the conditions on lines 88 and 93 of version 0.61 which expect 
a value of "undefined" for submatches that are not found. Here is revision to 
those two conditions that accounts for both undefined and empty string 
possibilities:

    csv.replace(reValue, function(m0, m1, m2) {
      // Remove backslash from any delimiters in the value
      if(typeof m1 === 'string' && m1.length) {        // Fix: evaluates to false for both empty strings and undefined
        var reDelimiterUnescape = /ED/g;
        reDelimiterUnescape = RegExp(reDelimiterUnescape.source.replace(/E/, escaper), 'g');
        reDelimiterUnescape = RegExp(reDelimiterUnescape.source.replace(/D/, delimiter), 'g');
        output.push(m1.replace(reDelimiterUnescape, delimiter));
      } else if(typeof m2 === 'string' && m2.length) { // Fix: evaluates to false for both empty strings and undefined
        output.push(m2);
      }
      return '';
    });

With these changes in place, the plugin is working perfectly for my needs on 
all tested browsers :D. I hope this addresses the Firefox issues that some of 
you encountered.

Original comment by skarrm...@gmail.com on 27 Aug 2012 at 10:01

GoogleCodeExporter commented 9 years ago
Comment #21 is probably a good lead to understanding the problem, but it breaks 
Chrome, as shown by the test suite.

In particular, empty strings are perfectly legitimate values, and need to be 
handled.

Original comment by r...@acm.org on 4 Sep 2012 at 2:49

GoogleCodeExporter commented 9 years ago
Ah. It passes the test if you accept #21's first change, and not his second.

The problem boils down to -- if the input really IS the null string value, you 
don't want to reject it from both clauses.

But you also want to add a:
  if (csv === "") {
      return [""];
  }

before the validity check that returns null, as an empty string is logically 
one empty value (just as a comma is logically two empty values). This appears 
to be consistent with the ABNF in the RFC.

You can also get rid of 5 constructions of RegExp's along the way -- you don't 
need to construct each intermediate partially-substituted RegExp. Perhaps you 
did it this way to make bugs show up better? I don't know if constructing a 
RegExp is expensive enough to worry about or not, in the grand scheme of 
things. It does work to remove them.

I constructed a bunch more test cases, with finer granularity, using jasmine, 
when I imported this into my system. Those were helpful.

The git repository does not appear to be up-to-date!

But I'm attaching patches from what was distributed as jquery.csv-0.61.js

0001 fixes some lint
0002 fixes the bugs
0003 removes the extraneous RegExps.

The patches are independent; you can apply any subset in any order.

Also, I notice that the existing test cases don't cover embedded quotes in 
strings; that's a case that ought to be covered. I'll add it to my suite, and 
report it as a separate issue if it fails.

Original comment by r...@acm.org on 4 Sep 2012 at 9:19

Attachments:

GoogleCodeExporter commented 9 years ago
For your patching convenience -- I reconstructed in my clone of your public 
repository, what should be the missing push, so I could construct patches you 
should be able to apply directly.

We'll see what kind of chaos I git when I eventually pull the missing commit. :)

I also added an additional patch. This one removes the Unicode Byte-Order-Mark 
(BOM) from the start of the file. This confuses some tools when jsmin goes and 
sticks it in the middle of the file, which is deprecated in Unicode 3.2 (and 
completely defeats any purpose it might have had!).

I made it separate in case you really have a need for a BOM -- or if your tools 
will just stick it back anyway.

You might need to supply --ignore-space-change to 'git am' to apply these, but 
they should apply OK on top of your 0.61 release point.

I'm removing the patches from my previous comment to avoid confusion. Note that 
the numbering has changed:

0001 removes BOM
0002 fixes some lint
0003 fixes the bugs
0004 removes the extraneous RegExps.

Original comment by r...@acm.org on 4 Sep 2012 at 8:59

Attachments:

GoogleCodeExporter commented 9 years ago
@rwk@acm.org Thank you for your contribution. You're right that I didn't push 
the last changes. I had some issues on my dev VM when Debian pulled in Gnome 3. 
The release was stuck on 0.60 even though the src said 0.51. Everything is 
fixed now.

Nice catch on the BOM issue. I currently don't run tests on the minified src, 
I'll add that to the TODO list. Looks like I need a need to get Notepad++ 
running under WINE too. I really wish somebody would fork and port NPP to *nix.

Not too clear about 0004. You use 3 different types of replace on the regex 
source. Is there a performance reason for doing it that way.

I think comment #14 might still need to be addressed. We'll see.

I'll post an update as soon as I have the changes pushed to the remote origin.

Original comment by evanpla...@gmail.com on 5 Sep 2012 at 6:34

GoogleCodeExporter commented 9 years ago
@rwk@acm.org I'm going to hold off on 0004 for now. First, because I want to 
get this bugfix texted and verified first. Second, because I had something else 
in mind to address the performance issues. See bug 6 for more details.

Original comment by evanpla...@gmail.com on 5 Sep 2012 at 5:23

GoogleCodeExporter commented 9 years ago
Thank you to everybody who participated in this bugfix. 

If you have the time, please run your data through the parser to verify that 
there aren't any other issues.

If nobody posts any other problems, I'd like to mark this bug as 'verified' and 
move on.

Original comment by evanpla...@gmail.com on 5 Sep 2012 at 5:52

GoogleCodeExporter commented 9 years ago
[I didn't hit Submit on this, so this reply is appearing out-of-sequence. Since 
you're heading in another direction anyway, it's just FYI...]

Re: 0004 -- no, just one kind of replace. There are just three contexts.

In the first case, you're pulling it from a supplied RegExp:

reValue = reValue.source.replace(/X/g, separator);

In the second case, reValue is now a string, so there's no .source. We just 
replace.
reValue = reValue.replace(/Y/g, delimiter);

In the third case, we again have a string, but we want a RegExp, so we re-wrap 
our final string in a RegExp again:

reValue = RegExp(reValue.replace(/Z/g, escaper), 'g');

Perhaps it would be more clear as:

var reValueStr = reValue.source;
reValueStr = reValueStr.replace(/X/g, separator);
reValueStr = reValueStr.replace(/Y/g, delimiter);
reValueStr = reValueStr.replace(/Z/g, escaper);
reValue = RegExp(reValueStr, 'g');

BTW, I tested with fields with commas; they work fine with these changes.

Original comment by r...@acm.org on 5 Sep 2012 at 11:35

GoogleCodeExporter commented 9 years ago
Gotcha, I prefer the second form.

I'll add that last change, run the tests, and if everything's good, drop 0.62 
tonight.

I want to get this bug fix released asap before anybody else downloads the 0.61 
source.

Nice work.

Original comment by evanpla...@gmail.com on 6 Sep 2012 at 12:31

GoogleCodeExporter commented 9 years ago

Original comment by evanpla...@gmail.com on 11 Oct 2012 at 4:07

GoogleCodeExporter commented 9 years ago

Original comment by evanpla...@gmail.com on 15 Oct 2012 at 10:39