leerssej / google-refine

Automatically exported from code.google.com/p/google-refine

Can't create project from big JSON file #231

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Launch Refine
2. In "Create a New Project", choose a big JSON file (29.4 MB in my case) as the data file.
3. Click "Create Project"

What is the expected output? 
A new project with imported data. It works fine with smaller data sets.
What do you see instead?
Nothing. The browser does nothing but seems busy. 

What version of the product are you using? 
Version 2.0 [r1836]

On what operating system?
OSX 10.6

Please provide any additional information below.
The structure of  the file is:

{"total_rows":200885,"offset":0,"rows":[
{"id":"36ae432498bd38e7fe0d319b7902ffa4","key":[2010,9,15],"value":{"service":"cs","segment":"sms","app":"fc","lang":"en","device":"ipad"}},
/* ... plus lot of rows */
{"id":"36ae432498bd38e7fe0d319b79030730","key":[2010,9,17],"value":{"service":"cs","segment":"phone","app":"fc","lang":"en","device":"iphone4"}}
]}

I've tried increasing the JVM memory as per your instructions. In that case 
it's worse, because the whole system slows down.

Original issue reported on code.google.com by julian.r...@gmail.com on 17 Nov 2010 at 3:47

GoogleCodeExporter commented 9 years ago

Original comment by iainsproat on 17 Nov 2010 at 3:51

GoogleCodeExporter commented 9 years ago
How much physical memory do you have on your system?  You need enough to hold 
the entire database in memory, plus some room left over for Refine code, 
system processes, I/O buffers, etc.

I haven't looked at how the JSON importer is structured, but you may need X * 
original file size + Y * result data size, where both X and Y are > 1.  E.g., 
for a 30 MB JSON file resulting in, say, 15 MB of data, I could easily see you 
needing 100 MB or more.  Having said that, 100 MB isn't very much these days, 
and if it actually needs more than 500 MB or so, then something definitely 
needs optimization.

Original comment by tfmorris on 17 Nov 2010 at 6:05
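
The rule of thumb in the comment above (X * original file size + Y * result data size) can be written out as a small calculation. This is only a sketch: the function name `estimate_heap_mb` and the multipliers X = 2.5 and Y = 2.0 are assumptions chosen for illustration, not values measured from Refine's JSON importer.

```python
def estimate_heap_mb(file_mb, result_mb, x=2.5, y=2.0):
    """Rough memory estimate: X * original file size + Y * result data size.

    x and y are illustrative guesses; the comment only says both are > 1.
    """
    if x <= 1 or y <= 1:
        raise ValueError("the rule of thumb assumes both X and Y are > 1")
    return x * file_mb + y * result_mb


# The 30 MB file / 15 MB result case from the comment, with the assumed multipliers.
estimate = estimate_heap_mb(30, 15)
print(f"estimated memory needed: {estimate} MB")
```

With these assumed multipliers the estimate lands near the "100 MB or more" figure mentioned in the comment; different multipliers would move it accordingly.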

GoogleCodeExporter commented 9 years ago
3 GB of physical memory. Even tried with all programs closed.

Original comment by julian.r...@gmail.com on 17 Nov 2010 at 6:15

GoogleCodeExporter commented 9 years ago
That should be plenty for a data file of this size.  How big did you make your 
heap?  I'd increase it to at least 1 GB (-Xmx1024m) and you can probably go to 
1.5 or 2 GB without crowding the rest of your system.

Any chance you can provide the full file to test with?

Original comment by tfmorris on 17 Nov 2010 at 6:25
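
If the original file cannot be shared, a synthetic file with the same shape as the structure in the report may be enough to test with. The sketch below is an assumption-laden stand-in: `write_sample` is a hypothetical helper, the ids are random hex strings, and the field values are drawn only from the two sample rows shown in the report, so the real data's distribution is not reproduced.

```python
import json
import random


def write_sample(path, n_rows, seed=0):
    """Write a JSON file shaped like the one in the report, with made-up rows."""
    rng = random.Random(seed)
    segments = ["sms", "phone"]      # values seen in the two sample rows
    devices = ["ipad", "iphone4"]    # values seen in the two sample rows
    with open(path, "w", encoding="utf-8") as f:
        f.write('{"total_rows":%d,"offset":0,"rows":[\n' % n_rows)
        for i in range(n_rows):
            row = {
                "id": "%032x" % rng.getrandbits(128),  # random 32-char hex id
                "key": [2010, 9, rng.randint(15, 17)],
                "value": {
                    "service": "cs",
                    "segment": rng.choice(segments),
                    "app": "fc",
                    "lang": "en",
                    "device": rng.choice(devices),
                },
            }
            # One row per line, comma-separated, no comma after the last row.
            sep = ",\n" if i < n_rows - 1 else "\n"
            f.write(json.dumps(row, separators=(",", ":")) + sep)
        f.write("]}\n")
```

Calling `write_sample("sample.json", 200885)` uses the row count from the report and should produce a file in the same size range as the one described, though the exact size depends on the generated values.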