coaperator opened this issue 2 years ago (Open)
Hi, can you provide me with a link to that xml? (shared in drive or similar)
On Mon, May 16, 2022 at 11:48, Алекс @.***> wrote:
Hello, I am trying to process a 100 megabyte XML file (~900 thousand lines) in streaming mode:
const fs = require('fs')
const XmlReader = require('xml-reader')

const objShop = {}
const objPrice = []

const file = fs.createReadStream('public/prices/middle-price_list.yml', 'utf8')
const parser = XmlReader.create({ stream: true })

file.on('data', (chunk) => {
  parser.parse(chunk)
})

// Tried another option from the documentation:
// file.on('data', (chunk) => {
//   chunk.split('').forEach((char) => parser.parse(char))
// })
// file.split('').forEach((char) => parser.parse(char))

parser.on('tag:yml_catalog', (data) => {
  objShop.date = data.attributes.date
})

parser.on('tag:shop', (data) => {
  objShop.name = data.children.filter((el) => el.name == 'name')[0]?.children[0]?.value
  objShop.company = data.children.filter((el) => el.name == 'company')[0]?.children[0]?.value
  objShop.url = data.children.filter((el) => el.name == 'url')[0]?.children[0]?.value
})

parser.on('tag:offer', (data) => {
  objPrice.push({
    title: data.children.filter((el) => el.name == 'name')[0]?.children[0]?.value,
    url: data.children.filter((el) => el.name == 'url')[0]?.children[0]?.value,
    picture: data.children.filter((el) => el.name == 'picture')[0]?.children[0]?.value,
  })
})

parser.on('done', (data) => {
  console.log('objShop', objShop)
  console.log('objPrice', objPrice.length)
})
The file is processed normally, but during processing more than 1.5 gigabytes of RAM are occupied. What am I doing wrong?
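One thing worth ruling out is accumulation on the caller's side: objPrice keeps every offer in memory until the end. A minimal sketch (my addition, assuming the offers only need to be persisted somewhere) that writes each offer to an NDJSON file as it is emitted instead of pushing it onto an array; the output file name and the childValue helper are illustrative, not part of the original code:

const fs = require('fs')
const XmlReader = require('xml-reader')

// Hypothetical variant of the code above: stream each offer straight
// to disk so the objPrice accumulator is ruled out as the source of
// memory growth. 'offers.ndjson' is a placeholder name.
const parser = XmlReader.create({ stream: true })
const input = fs.createReadStream('public/prices/middle-price_list.yml', 'utf8')
const output = fs.createWriteStream('offers.ndjson')

// Illustrative helper: first text value of the named child element
const childValue = (node, name) =>
  node.children.find((el) => el.name === name)?.children[0]?.value

let count = 0
parser.on('tag:offer', (data) => {
  count++
  output.write(JSON.stringify({
    title: childValue(data, 'name'),
    url: childValue(data, 'url'),
    picture: childValue(data, 'picture'),
  }) + '\n')
})

parser.on('done', () => {
  output.end()
  console.log('offers written:', count)
})

input.on('data', (chunk) => parser.parse(chunk))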
I can't give you exactly the one I used, but you can take the sample below and repeat the offer block 80-100 thousand times in a loop to get a large file (a generator sketch follows the sample):
<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE yml_catalog SYSTEM "shops.dtd">
<yml_catalog date="2022-05-14 17:17">
<shop>
<name>Test</name>
<company>Test</company>
<url>https://test.com/</url>
<currencies>
<currency id="USD" rate="1"></currency>
</currencies>
<delivery-options>
<option cost="390" days="0-1"></option>
</delivery-options>
<categories>
<category id="1">Test category</category>
</categories>
<offers>
<offer id="832599" available="true">
<url>https://test.com/intel_core_i9-11900_832599.html</url>
<price>37677</price>
<currencyId>USD</currencyId>
<categoryId>1</categoryId>
<picture>https://img.test.com/82/b4/6065c6482b404261875729_500.jpg</picture>
<pickup>true</pickup>
<delivery>true</delivery>
<delivery-options>
<option cost="390" days="0" order-before="11"></option>
</delivery-options>
<name>Intel Core i9-11900 CM8070804488245 Rocket Lake 8C/16T 2.5-5.3GHz</name>
<vendor>Intel</vendor>
<model>Core i9-11900</model>
<vendorCode>CM8070804488245</vendorCode>
<description>Rocket Lake 8C/16T 2.5-5.3GHz (LGA1200, L3 16MB, 14nm, UHD Graphics 750 1.3GHz, 65W)</description>
<sales_notes>test</sales_notes>
<manufacturer_warranty>true</manufacturer_warranty>
<cpa>1</cpa>
<weight>0.1</weight>
</offer>
</offers>
</shop>
</yml_catalog>
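A sketch of such a generator, assuming the sample above is saved as sample.xml (the file names and the 100,000 count are illustrative):

const fs = require('fs')

// Repeat the <offer> block from the sample above ~100,000 times to
// produce a large test file, as suggested. sample.xml and
// big-price-list.xml are placeholder names.
const sample = fs.readFileSync('sample.xml', 'utf8')
const start = sample.indexOf('<offer ')
const end = sample.indexOf('</offer>') + '</offer>'.length
const offer = sample.slice(start, end)

const out = fs.createWriteStream('big-price-list.xml')
out.write(sample.slice(0, start))
for (let i = 0; i < 100000; i++) {
  // vary the id so each repeated offer is distinct
  out.write(offer.replace('id="832599"', `id="${832599 + i}"`) + '\n')
}
out.write(sample.slice(end))
out.end()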
Hi, I'm dealing with the same issue. I have a 740 MB XML file (confidential information) and I'm getting "FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory".
This is my parser code
I have checked the byte size of every single variable, and all of them are correctly garbage collected after each iteration. I've also commented out parts of the code in case something other than the parser was triggering the error, but the code only runs to completion when I comment out the entire parser, meaning parser.parse.
I've tried looking at the source code of both the reader and the lexer, but I can't seem to figure out what's causing the issue.
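One cheap check that might narrow it down further: log heap usage as chunks are fed in, to confirm the growth really happens inside parser.parse even with no event handlers attached. A sketch (the input file name is a placeholder):

const fs = require('fs')
const XmlReader = require('xml-reader')

// Log heap usage every 100 chunks while parsing; if the numbers climb
// steadily with no handlers registered, the retained memory is inside
// the parser itself rather than in user code.
const parser = XmlReader.create({ stream: true })
const file = fs.createReadStream('big-price-list.xml', 'utf8')

let chunks = 0
file.on('data', (chunk) => {
  parser.parse(chunk)
  if (++chunks % 100 === 0) {
    const mb = (process.memoryUsage().heapUsed / 1024 / 1024).toFixed(1)
    console.log(`after ${chunks} chunks: ${mb} MB heap used`)
  }
})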