tdecaluwe / node-edifact

JavaScript stream parser for UN/EDIFACT documents.
https://www.npmjs.com/package/edifact
Apache License 2.0

Error: Invalid character _ at position 29 #22

Closed mkjsix closed 7 years ago

mkjsix commented 7 years ago

I'm trying to parse a DATEX I EDIFACT message, but I receive the following error:

node_modules/edifact/parser.js:198
        throw Parser.errors.invalidCharacter(chunk.charAt(index), index);
        ^

Error: Invalid character _ at position 29
    at Object.Parser.errors.invalidCharacter (node_modules/edifact/parser.js:223:12)
    at EventEmitter.Parser.write (node_modules/edifact/parser.js:198:29)
    at Object.<anonymous> (/Users/mturatti/src/softinstigate/ART/datex1/main.js:12:8)
    at Module._compile (module.js:409:26)
    at Object.Module._extensions..js (module.js:416:10)
    at Module.load (module.js:343:32)
    at Function.Module._load (module.js:300:12)
    at Function.Module.runMain (module.js:441:10)
    at startup (node.js:139:18)
    at node.js:990:3

I'm using node v4.6.2. Here is my little main.js script:

var Parser = require('edifact/parser.js');
var Validator = require('edifact/validator.js');

var fs = require('fs');

var doc = fs.readFileSync('data/ITVIA714337.MSG', 'utf8');

var validator = new Validator();
var parser = new Parser(validator);

parser.encoding('UNOA');
parser.write(doc);
parser.end();

I suspect the parser doesn't like the underscore found in ART_T, in this line:

UNB+UNOC:3+ITVIA+ART_T+160215:0000+10714337

Unfortunately the source format is not under my control, so I need to parse these as they are.

tdecaluwe commented 7 years ago

The _ character is not part of the UNOA character set. Your UNB segment declares UNOC, so use that encoding instead:

parser.encoding('UNOC');

The parser doesn't switch encodings automatically when it encounters a UNB segment. If you need to handle multiple encodings, set the encoding in a segment listener.
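
For a single-interchange file like the one above, another option is to read the syntax identifier from the UNB header before writing the document to the parser. What follows is a minimal sketch, not part of the edifact package: detectSyntaxIdentifier is a hypothetical helper, and it assumes the document starts with an optional UNA service string followed directly by the UNB segment.

```javascript
// Hypothetical helper (not part of the edifact package): returns the syntax
// identifier declared in the UNB header, e.g. 'UNOC', or null if the document
// does not start with a UNB segment.
function detectSyntaxIdentifier(doc) {
  // Default separators, used when no UNA service string is present.
  var componentSep = ':';
  var elementSep = '+';
  var text = doc.replace(/^\s+/, '');

  // An optional UNA segment is exactly 9 characters long: 'UNA' followed by
  // six service characters. The first two are the component and data
  // element separators.
  if (text.slice(0, 3) === 'UNA') {
    componentSep = text.charAt(3);
    elementSep = text.charAt(4);
    text = text.slice(9).replace(/^\s+/, '');
  }

  if (text.slice(0, 3) !== 'UNB' || text.charAt(3) !== elementSep) {
    return null;
  }

  // The syntax identifier is the first component of the first element,
  // so it ends at whichever separator comes first.
  var rest = text.slice(4);
  var end = rest.length;
  [componentSep, elementSep].forEach(function (sep) {
    var i = rest.indexOf(sep);
    if (i !== -1 && i < end) {
      end = i;
    }
  });
  return rest.slice(0, end);
}
```

The result could then be passed to parser.encoding() before calling parser.write(doc), assuming the parser supports the identifier the file declares.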

mkjsix commented 7 years ago

Thank you @tdecaluwe