fjcapelo / jquery-csv

Automatically exported from code.google.com/p/jquery-csv
MIT License

Enhancement: Parse CSV in a thread using settimeout #33

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Parsing a huge CSV file triggers the browser's long-running-script alert, which 
offers to stop the script. 

I am trying to see where I can make changes so that I can parse a few thousand 
rows, then use setTimeout to resume parsing the remaining data. Has anyone done 
this? Any pointers on how it can be done?

Thanks

Original issue reported on code.google.com by infocap...@gmail.com on 14 Nov 2013 at 7:47

GoogleCodeExporter commented 9 years ago
It depends how you define 'huge'.

The limit will be based on JavaScript's memory limit for holding the input 
string in memory. Typically, the JS memory limit is only a few megabytes, so 
it's no surprise that a larger data set will exceed that limit.

The only workaround I can think of involves loading the CSV data in chunks. The 
hard part about that is deciding where to put the chunk boundary. Ideally, the 
boundary will be at the end of a record (i.e. a newline char that isn't inside 
a quoted value).

It should be relatively easy to create a very basic pre-parser that counts 
quotes and finds the character position of the last newline before the parser 
hits a pre-determined chunk boundary.
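
A minimal sketch of such a pre-parser, under a few assumptions: the delimiter 
is the double quote, records end with a newline, and each scan starts at the 
beginning of a record. The names findChunkBoundary and splitIntoChunks are 
hypothetical and not part of jquery-csv.

```javascript
// Hypothetical helper, not part of jquery-csv.
// Scans at most `maxLength` characters of `text`, beginning at `start`
// (assumed to be the start of a record), and returns the index just past
// the last newline that is not inside quotes, or -1 if there is none.
function findChunkBoundary(text, start, maxLength) {
  var end = Math.min(start + maxLength, text.length);
  var inQuotes = false;
  var boundary = -1;
  for (var i = start; i < end; i++) {
    var ch = text.charAt(i);
    if (ch === '"') {
      // An escaped quote ("") toggles twice, so the parity stays correct.
      inQuotes = !inQuotes;
    } else if (ch === '\n' && !inQuotes) {
      boundary = i + 1;
    }
  }
  return boundary;
}

// Splits `text` into pieces of at most `maxLength` characters that each
// end on a record boundary. If no complete record fits in the window,
// the remainder is returned as one piece.
function splitIntoChunks(text, maxLength) {
  var chunks = [];
  var start = 0;
  while (text.length - start > maxLength) {
    var cut = findChunkBoundary(text, start, maxLength);
    if (cut === -1) {
      break;
    }
    chunks.push(text.slice(start, cut));
    start = cut;
  }
  if (start < text.length) {
    chunks.push(text.slice(start));
  }
  return chunks;
}
```

Each piece then holds only whole records, so it could in principle be passed 
to $.csv.toArrays on its own. A header row would need separate handling.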

Counting the number of characters the parser has processed is a little 
trickier. There's a state variable that can be used to see how many entries and 
rows have been parsed but the nature of the regex lexer makes it difficult to 
track on a character-by-character basis.

You'll also have to figure out how to partially load the dataset using HTML5. 
In theory, HTML5 should have the facilities to do so but I have never tried.
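
One way this might look with the HTML5 File API is sketched below, using 
Blob.slice and FileReader.readAsText. The helper names chunkRanges and 
readSlice are hypothetical. Note that a slice cut at an arbitrary byte offset 
can split a record (or a multi-byte character), so the tail of each slice 
would still have to be carried over to the next one.

```javascript
// Hypothetical helper: computes byte ranges that cover a file of
// `totalSize` bytes in steps of `chunkSize` bytes.
function chunkRanges(totalSize, chunkSize) {
  var ranges = [];
  for (var start = 0; start < totalSize; start += chunkSize) {
    ranges.push({ start: start, end: Math.min(start + chunkSize, totalSize) });
  }
  return ranges;
}

// Hypothetical helper (browser only): reads one byte range of a File,
// such as one taken from an <input type="file"> element, as text.
function readSlice(file, range, onText, onError) {
  var reader = new FileReader();
  reader.onload = function () {
    onText(reader.result);
  };
  reader.onerror = function () {
    if (onError) {
      onError(reader.error);
    }
  };
  // Blob.slice does not copy the data; only this range is read.
  reader.readAsText(file.slice(range.start, range.end));
}
```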

If you make any progress on this, don't hesitate to share. Feel free to ask 
lots of questions if you attempt to create a solution. 

Original comment by evanpla...@gmail.com on 21 Nov 2013 at 1:32

GoogleCodeExporter commented 9 years ago
For example, I tried loading a 12 MB file. It loaded fine, but the CSV parsing 
took a while, as expected. This triggered the browser's "do you wish to 
continue or stop the script" message. After a few clicks on "continue" it 
eventually finished.

I actually tried using setTimeout but was not sure whether the regex 
maintains its state after coming back from the timeout. I will try counting 
the quotes to determine the end of a row. I had not thought of this, so it 
gives me some food for thought.
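
For reference, a rough sketch of the setTimeout approach. On the state 
question: a global RegExp keeps its position in its lastIndex property, so that 
much survives between timer callbacks as long as the same object is reused. 
Whether the jquery-csv parser itself can be paused this way is not confirmed 
here. The function name processInBatches and its schedule parameter are 
hypothetical.

```javascript
// Hypothetical helper: calls handleItem for every element of `items`,
// `batchSize` elements at a time, and yields between batches so the
// browser can handle other events. `schedule` defaults to a zero-delay
// setTimeout; it is a parameter only so another scheduler can be swapped in.
function processInBatches(items, batchSize, handleItem, onDone, schedule) {
  schedule = schedule || function (fn) {
    setTimeout(fn, 0);
  };
  var index = 0; // captured by the closure, so it survives each timeout

  function runBatch() {
    var end = Math.min(index + batchSize, items.length);
    for (; index < end; index++) {
      handleItem(items[index], index);
    }
    if (index < items.length) {
      schedule(runBatch); // resume with the next batch later
    } else {
      onDone();
    }
  }

  runBatch();
}
```

Here items could be record-aligned pieces of the CSV text, with handleItem 
passing each piece to $.csv.toArrays and appending the rows to a result array.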

Will definitely share once I find a solution.

Original comment by infocap...@gmail.com on 21 Nov 2013 at 3:28

GoogleCodeExporter commented 9 years ago

Original comment by evanpla...@gmail.com on 9 Dec 2013 at 11:33