assistunion / xml-stream

XML stream parser based on Expat. Made for Node.
MIT License

EINVAL, Incomplete character sequence #14

Open seanhess opened 12 years ago

seanhess commented 12 years ago

When parsing the data below, the first <p> node parses, but the second fails with the following error. I can't tell what the difference between the two nodes is. The library looks great otherwise!

/Users/seanhess/itv/telus/node_modules/xml-stream/lib/xml-stream.js:478
      data = self._encoder.convert(data);
                           ^
Error: EINVAL, Incomplete character sequence.
    at /Users/seanhess/itv/telus/node_modules/xml-stream/lib/xml-stream.js:478:28
    at [object Object].<anonymous> (/Users/seanhess/itv/telus/node_modules/xml-stream/lib/xml-stream.js:488:7)
    at [object Object].emit (events.js:67:17)
    at [object Object]._emitData (fs.js:1155:10)
    at afterRead (fs.js:1137:10)
    at Object.wrapper [as oncomplete] (fs.js:254:17)

<p id="391492" t="Rumeurs" rt="Rumeurs" d="Esther se réconcilie avec Vincent, mais elle se demande si elle a pris la bonne décision." rd="Esther se réconcilie avec Vincent, 
mais elle se demande si elle a pris la bonne décision." et="Ma vie est un téléroman">
<f id="2"/>
<k id="1" v="532931"/>
<k id="2" v="19"/>
<k id="6" v="20030409"/>
<k id="10" v="Program"/>
<c id="403"/></p>
<p id="391493" t="Rumeurs" rt="Rumeurs" d="Esther a décidé de faire une pause dans sa relation avec Vincent; Clara accepte de poser nue pour Charles." rd="Esther fait une pause dans sa relation; Clara accepte de poser nue." et="Ma vie est un téléroman">
<f id="2"/>
<k id="1" v="532931"/>
<k id="2" v="20"/>
<k id="6" v="20030416"/>
<k id="10" v="Program"/>
<c id="403"/></p>

Artazor commented 12 years ago

Hi, I'm afraid this is due to a limitation of node-iconv. AFAIK it does not currently support streaming conversion of stateful encodings. You can try to encourage Ben Noordhuis to add this feature to node-iconv: https://github.com/bnoordhuis/node-iconv/issues/7
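
For readers hitting the same error, here is a minimal sketch of how this class of failure can arise. It assumes the cause is a read chunk ending in the middle of a multibyte character, which the stack trace suggests but the thread does not confirm. It uses Node's built-in `StringDecoder` rather than node-iconv, purely to illustrate the difference between converting each chunk in isolation and converting with state carried across chunks; the variable names and the split position are made up for the example.

```javascript
const { StringDecoder } = require('string_decoder');

// "é" is two bytes in UTF-8 (0xC3 0xA9). Split the buffer after the
// first of those two bytes, as a file read boundary might.
const whole = Buffer.from('décision', 'utf8');
const chunkA = whole.subarray(0, 2); // 'd' plus the first byte of 'é'
const chunkB = whole.subarray(2);    // second byte of 'é' plus 'cision'

// Stateless: each chunk is decoded on its own, so the split character
// is lost and replaced with U+FFFD in both halves.
const naive = chunkA.toString('utf8') + chunkB.toString('utf8');

// Stateful: the decoder holds back the incomplete byte sequence from
// chunkA and completes it when chunkB arrives.
const decoder = new StringDecoder('utf8');
const streamed = decoder.write(chunkA) + decoder.write(chunkB) + decoder.end();

console.log(naive === 'décision');    // false
console.log(streamed === 'décision'); // true
```

A strict converter would typically raise an error on `chunkA` instead of substituting a replacement character, which is consistent with the "Incomplete character sequence" message above.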

punund commented 11 years ago

I have opened a pull request that removes an unneeded invocation of node-iconv, so if your document is in UTF-8, parsing might work with that change.
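
The idea of skipping conversion for UTF-8 input could look roughly like the sketch below. This is not the code from the pull request: `needsConversion` and `toUtf8Chunk` are hypothetical helpers invented for illustration, and `convert` stands in for whatever converter the caller would otherwise use.

```javascript
// Hypothetical helper, not part of xml-stream or node-iconv.
// Returns false when the declared encoding is already UTF-8.
function needsConversion(encoding) {
  const normalized = String(encoding || 'utf8')
    .toLowerCase()
    .replace(/[^a-z0-9]/g, ''); // 'UTF-8' and 'utf_8' both become 'utf8'
  return normalized !== 'utf8';
}

// Hypothetical wrapper: only calls the converter when it is needed,
// so UTF-8 chunks pass through untouched.
function toUtf8Chunk(data, encoding, convert) {
  return needsConversion(encoding) ? convert(data) : data;
}

// Usage: the converter is never invoked for UTF-8 input.
const chunk = Buffer.from('réconcilie', 'utf8');
const passthrough = toUtf8Chunk(chunk, 'UTF-8', () => {
  throw new Error('converter should not run for UTF-8');
});
console.log(passthrough === chunk); // true
```

Passing UTF-8 bytes through unchanged leaves any character split across chunks for the downstream parser to handle, rather than for a per-chunk converter.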

AdrianTudC commented 9 years ago

@punund I still encounter this issue. Has it been solved?