segmentio / xml-parser

simple non-compliant xml parser for nodejs

Doesn't seem to work with veeeery long XML #16

Open felixfbecker opened 9 years ago

felixfbecker commented 9 years ago

I have XML from an API like this:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<ftexport xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://api.festivalticker.de/xml/export4.xsd">
    <title>Export</title>
    <url>http://www.festivalticker.de/</url>
    <event>
        <id>35964</id>
        <datum>15.09.29</datum>
        <enddatum></enddatum>
        <begin>1494352800</begin>
        <name>Genetikk</name>
        <genre>Rap</genre>
        <location>Turbinenhalle</location>
        <zip>46047</zip>
        <city>Oberhausen</city>
        <street>Im Lipperfeld 23</street>
        <land>de</land>
        <descript><![CDATA[]]></descript>
        <artists>Genetikk</artists>
        <price>VVK 28.15 €</price>
        <website>http%3A%2F%2Fwww.genetikk.de%2F</website>
        <fturl>http://www.festivalticker.de/konzerte/genetikk_35964/</fturl>
        <timestamp>1426993206</timestamp>
        <what>Konzert</what>
    </event>
    <event>
        <id>356...

The list of events goes on for 26000 lines. After parsing, however, root.children contains only 3 elements (title, url, and the first event), and that event's own children stop at descript:

{ declaration: { attributes: { version: '1.0', encoding: 'UTF-8', standalone: 'yes' } },
  root:
   { name: 'ftexport',
     attributes:
      { 'xmlns:xsi': 'http://www.w3.org/2001/XMLSchema-instance',
        'xsi:noNamespaceSchemaLocation': 'http://api.festivalticker.de/xml/export4.xsd' },
     children:
      [ { name: 'title', attributes: {}, children: [], content: 'Export' },
        { name: 'url', attributes: {}, children: [], content: 'http://www.festivalticker.de/' },
        { name: 'event',
          attributes: {},
          children:
           [ { name: 'id', attributes: {}, children: [], content: '35964' },
             { name: 'datum', attributes: {}, children: [], content: '15.09.29' },
             { name: 'enddatum', attributes: {}, children: [], content: '' },
             { name: 'begin', attributes: {}, children: [], content: '1494352800' },
             { name: 'name', attributes: {}, children: [], content: 'Genetikk' },
             { name: 'genre', attributes: {}, children: [], content: 'Rap' },
             { name: 'location', attributes: {}, children: [], content: 'Turbinenhalle' },
             { name: 'zip', attributes: {}, children: [], content: '46047' },
             { name: 'city', attributes: {}, children: [], content: 'Oberhausen' },
             { name: 'street', attributes: {}, children: [], content: 'Im Lipperfeld 23' },
             { name: 'land', attributes: {}, children: [], content: 'de' },
             { name: 'descript', attributes: {}, children: [], content: '' } ],
          content: '' } ],
     content: '' } }

What could be the reason? Is it because the tags have the same name? Because the file is so long?

felixfbecker commented 9 years ago

OK, I just tested it with only 3 events, and it still picks up only the first one.
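
One way to make the mismatch concrete is to compare the number of `<event>` opening tags in the raw string with the number of `event` children in the parsed tree. The sketch below is only an illustration: `countOpeningTags` and `countChildren` are hypothetical helpers, not part of xml-parser, and the three-event sample is made up to mirror the reduced test case. It assumes the parsed node shape shown in the output above (`name`, `attributes`, `children`, `content`).

```javascript
// Hypothetical helpers for diagnosing the issue; not part of xml-parser.

// Count opening tags with the given name in the raw XML string.
// Assumes `name` contains no regex metacharacters.
function countOpeningTags(xml, name) {
  var re = new RegExp('<' + name + '(\\s|>|/)', 'g');
  var matches = xml.match(re);
  return matches ? matches.length : 0;
}

// Count direct children with the given name in a parsed node of the
// shape shown above: { name, attributes, children, content }.
function countChildren(node, name) {
  return node.children.filter(function (child) {
    return child.name === name;
  }).length;
}

// Made-up three-event sample; the first event carries an empty CDATA
// section, like <descript> in the original export.
var sample = [
  '<ftexport>',
  '<event><id>1</id><descript><![CDATA[]]></descript><artists>A</artists></event>',
  '<event><id>2</id></event>',
  '<event><id>3</id></event>',
  '</ftexport>'
].join('\n');

console.log('event tags in source:', countOpeningTags(sample, 'event'));
```

With the library installed, the parsed side would presumably come from `var parse = require('xml-parser')` followed by `countChildren(parse(sample).root, 'event')`. If the two counts differ even for a small sample like this, file length is unlikely to be the cause. In the output above, the first event's children end at `descript`, the element holding the empty CDATA section, so that element may be worth isolating next.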