jindw / xmldom

A pure JS, W3C standard-based (XML DOM Level 2 Core) DOMParser and XMLSerializer.

Out of memory error while parsing large string #252

Open ghost opened 5 years ago

ghost commented 5 years ago

When I try to parse strings read from files of 100 MB or more, I get an out-of-memory error. I don't get this error when parsing files smaller than roughly 100 MB.

Code:

const fs = require('fs');
const { DOMParser } = require('xmldom'); // no `new` needed before require()

let fromFile = './file';
let toFile = './result.json';

console.log('Reading file...');
let file = fs.readFileSync(fromFile);

console.log('Converting file data to string...');
let fileString = file.toString();

console.log('Parsing file data...');
let data = new DOMParser().parseFromString(fileString); // I get the error here

Output:

D:\Dev\OpenSource\osm-to-geojson>node ./osmtojson.js
Reading file...
Converting file data to string...
Parsing file data...

<--- Last few GCs --->

[5048:000000000029D660]   128270 ms: Mark-sweep 1398.2 (1417.6) -> 1397.6 (1417.6) MB, 3048.8 / 0.0 ms  (average mu = 0.177, current mu = 0.121) allocation failure scavenge might not succeed
[5048:000000000029D660]   131798 ms: Mark-sweep 1398.3 (1417.6) -> 1397.8 (1418.1) MB, 3379.0 / 0.0 ms  (average mu = 0.111, current mu = 0.042) allocation failure scavenge might not succeed

<--- JS stacktrace --->

==== JS stack trace =========================================

    0: ExitFrame [pc: 00000369B69DC5C1]
    1: StubFrame [pc: 00000369B69CE100]
Security context: 0x03285af1e6e1 <JSObject>
    2: replace [000003285AF105E1](this=0x00faf4844a49 <String[6]: 120198>,0x00faf4844a71 <JSRegExp <String[7]: &#?\w+;>>,0x0392d8ad5b79 <JSFunction entityReplacer (sfi = 00000392D8ADF681)>)
    3: parseElementStartPart [00000392D8AD5BD9] [D:\Dev\OpenSource\osm-to-geojson\node_modules\xmldom\sax.js:~213] [pc=00000369B...

FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
 1: 000000013F9EECE5
 2: 000000013F9C8196
 3: 000000013F9C8BA0
 4: 000000013FC58D5E
 5: 000000013FC58C8F
 6: 00000001401969D4
 7: 000000014018D137
 8: 000000014018B6AC
 9: 0000000140194627
10: 00000001401946A6
11: 000000013FD37767
12: 000000013FDCF44A
13: 00000369B69DC5C1

Node version: 10.13.0
OS: Windows 7, 64-bit
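
For anyone reproducing this: the GC log above shows the heap stuck at about 1.4 GB, which looks like V8's configured heap limit being reached rather than the machine running out of physical memory. Below is a minimal sketch for checking that limit before parsing, using Node's built-in v8 module. The helper name heapHeadroom and the 100 MB figure are only illustrative assumptions, not part of xmldom or of the script above.

```javascript
const v8 = require('v8');

// Hypothetical helper (not part of xmldom): compares an input size in bytes
// with the heap space V8 still has available in this process.
function heapHeadroom(inputBytes) {
  const MB = 1024 * 1024;
  const stats = v8.getHeapStatistics();
  const freeBytes = stats.heap_size_limit - stats.used_heap_size;
  return {
    limitMB: Math.round(stats.heap_size_limit / MB), // configured heap ceiling
    usedMB: Math.round(stats.used_heap_size / MB),   // heap currently in use
    fileMB: Math.round(inputBytes / MB),             // size of the input
    ratio: freeBytes / inputBytes,                   // free heap per input byte
  };
}

// Illustrative input size: 100 MB, roughly where the failures start.
console.log(heapHeadroom(100 * 1024 * 1024));
```

A DOM tree usually needs several times the size of the source text, so a low ratio here suggests the parse may not fit. If the limit turns out to be the bottleneck, it can be raised with Node's --max-old-space-size flag (value in MB), for example node --max-old-space-size=4096 ./osmtojson.js. That only postpones the problem for larger files and does not address the parser's memory use.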

JonathanRowell commented 5 years ago

You don't need a large string. Something moderately complex, like the attached file, is enough to trigger it:

$ node test-xmldom.js > jr.txt
FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory
1: 01228A2E node::MakeCallback+3982
2: 0185BDE2 v8::internal::Heap::MaxHeapGrowingFactor+8146

marc21.xml.gz