RovoMe / ts-edifact

Typescript port of the node-edifact project
Apache License 2.0

Differences between ts-edifact and edifact libraries #1

Closed Misery42 closed 3 years ago

Misery42 commented 4 years ago

I am trying to parse this document as in the example from the edifact JavaScript library, but it does not work:

at Object.missingSegmentStart (node_modules/ts-edifact/src/main/validator.ts:416:20)

What is wrong?

let document: string = "";
document += "UNA:+.? \'";
document += "UNH+ME000001+IFTMIN:D:01B:UN:EAN004\'";
document += "BGM+610+569952+9\'";
document += "DTM+137:20020301:102\'";
document += "DTM+2:200203081100:203\'";
document += "CNT+11:1\'";
document += "RFF+CU:TI1284\'";
document += "TDT+20++30+31\'";
document += "DTM+133:200203051100:203\'";
document += "LOC+9+5412345678908::9\'";
document += "NAD+CZ+5412345123453::9\'";
document += "NAD+CA+5411234512309::9\'";
document += "NAD+CN+5411234444402::9\'";
document += "NAD+DP+5412345145660::9\'";
document += "GID+1+1:09::9+14:PK\'";
document += "HAN+EAT::9\'";
document += "TMP+2+000:CEL\'";
document += "RNG+5+CEL:-5:5\'";
document += "MOA+44:45000:EUR\'";
document += "PIA+5+5410738377117:SRV\'";
document += "MEA+AAE+X7E+KGM:250\'";
document += "PCI+33E\'";
document += "GIN+BJ+354123450000000014\'";
document += "UNT+23+ME000001\'";
    const enc: string = 'UNOA';
    const doc: string = document;

    const validator: Validator = new ValidatorImpl();
    const parser: Parser = new Parser(validator);

    let result: ResultType[];
    let elements: string[][];
    let components: string[];

    result = [];
    elements = [];
    components = [];

    parser.onOpenSegment = (segment: string): void => {
        // Started a new segment.
        elements = [];
        result.push({ name: segment, elements: elements });
    };

    parser.onCloseSegment = () => { };

    parser.onElement = (): void => {
        // Parsed a new element.
        components = [];
        elements.push(components);
    };

    parser.onComponent = (value: string): void => {
        // Got a new component.
        components.push(value);
    };

    parser.encoding(enc);
    parser.write(doc);
    parser.end();

Misery42 commented 4 years ago

OK, I figured out some differences: in the original you can provide an initial validator. Here it works when I instead create the parser with new Parser().

And the EDIFACT file needs "UNA" for the terminal symbols!
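
For readers unfamiliar with the UNA service string advice: it is the optional nine-character header (for example UNA:+.? ') that declares the delimiters used in the rest of the interchange. The sketch below shows how those six characters map to their roles. It is only an illustration; readDelimiters and the Delimiters interface are hypothetical names and not part of ts-edifact or edifact.

```typescript
// Delimiters declared by a UNA service string advice such as "UNA:+.? '".
interface Delimiters {
    componentSeparator: string;
    elementSeparator: string;
    decimalMark: string;
    releaseCharacter: string;
    segmentTerminator: string;
}

// Standard defaults, used here when the document does not start with UNA.
const DEFAULT_DELIMITERS: Delimiters = {
    componentSeparator: ":",
    elementSeparator: "+",
    decimalMark: ".",
    releaseCharacter: "?",
    segmentTerminator: "'",
};

// Hypothetical helper (not a library API): reads the delimiters from the UNA header.
function readDelimiters(doc: string): Delimiters {
    if (doc.slice(0, 3) !== "UNA" || doc.length < 9) {
        return DEFAULT_DELIMITERS;
    }
    return {
        componentSeparator: doc.charAt(3),
        elementSeparator: doc.charAt(4),
        decimalMark: doc.charAt(5),
        releaseCharacter: doc.charAt(6),
        // doc.charAt(7) is a reserved position, normally a space.
        segmentTerminator: doc.charAt(8),
    };
}
```

Calling readDelimiters("UNA:+.? 'UNH+ME000001+IFTMIN:D:01B:UN:EAN004'") would report ' as the segment terminator and ? as the release character.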

JavaScript Sample:

'use strict'

var edifact = require('edifact');

var validator = new edifact.Validator();
var parser = new edifact.Parser(validator);

validator.define(require('edifact/segments.js'));
validator.define(require('edifact/elements.js'));

var document = '';

document += 'UNB+UNOA:1+005435656:1+006415160:1+060515:1434+00000000000778\'';
document += 'UNH+00000000000117+INV\n\rOIC:D:97B:UN\'';
document += 'BGM+380+342459+9\'';
document += 'DTM+3:20060515:102\'';
document += 'RFF+ON:521052\'';
document += 'NAD+BY+792820524::16++CUMMINS MID-RANGE ENGINE PLANT\'';
document += 'NAD+SE+005435656::16++GENERAL WIDGET COMPANY\'';
document += 'CUX+1:USD\'';
document += 'LIN+1++157870:IN\'';
document += 'IMD+F++:::WIDGET\'';
document += 'QTY+47:1020:EA\'';
document += 'ALI+US\'';
document += 'MOA+203:1202.58\'';
document += 'PRI+INV:1.179\'';
document += 'LIN+2++157871:IN\'';
document += 'IMD+F++:::DIFFERENT WIDGET\'';
document += 'QTY+47:20:EA\'';
document += 'ALI+JP\'';
document += 'MOA+203:410\'';
document += 'PRI+INV:20.5\'';
document += 'UNS+S\'';
document += 'MOA+39:2137.58\'';
document += 'ALC+C+ABG\'';
document += 'MOA+8:525\'';
document += 'UNT+23+00000000000117\'';
document += 'UNZ+1+00000000000778\'';

var result;
var elements;
var components;

parser.on('opensegment', function (segment) {
    elements = [];
    result.push({ name: segment, elements: elements });
});

parser.on('closesegment', function () { });

parser.on('element', function () {
    components = [];
    elements.push(components);
});

parser.on('component', function (value) {
    components.push(value);
});

result = [];

parser.encoding('UNOA');
parser.write(document);
parser.end();

result;

RovoMe commented 4 years ago

@Misery42 Hi and thanks for the report. This project is more or less a 1:1 port of the node-edifact project, with minor tweaks. I'm quite new to TypeScript and the JavaScript environment in general, so chances are that I messed something up. I'm not sure the current state should be used in production right now, TBH. I mainly ported it to TypeScript to get more familiar with the language and to make customization easier. I'm aware that you can patch new features into existing libraries and the like, though as mentioned I'm rather new to this whole ecosystem. I initially worked on just some type definitions for the node-edifact project, which seemed to work, but I ended up starting the port because I will probably add some customization and hopefully fixes as well. The node-edifact project itself no longer seems to be in active development, so I hope I can bring a bit of fresh air to this domain.

In general, I prefer using the Reader class, as it already provides default implementations of the respective parser methods, which should be enough for most parsing tasks. I also added an encoding(string): void method that lets you specify the character set used in the document on the reader itself. In the node-edifact project you had to obtain the reader's parser and set the encoding on the parser.

Regarding the missing segment start failure, this usually indicates that the segment and/or element table lacks definitions for some of the segments or elements found in the parsed document. I recently tweaked the error messages a bit to give more insight into the actual problem. In your case, with the provided document string, the library should now return an error message like

No segment definition found for segment name GID.

which states that the segmentTable does not yet define this segment. Further, TMP and RNG segments are also missing.
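
To illustrate the kind of check behind that message, here is a minimal sketch that lists the segment names in a document that a given set of known names does not cover. It is not how the validator is implemented; findUndefinedSegments and the known set are hypothetical, and the splitting is deliberately naive (it assumes the default ' terminator and + separator and ignores released terminators).

```typescript
// Hypothetical helper (not a library API): returns the segment names used in
// `doc` that are not contained in `known`, in order of first appearance.
function findUndefinedSegments(doc: string, known: Set<string>): string[] {
    const missing: string[] = [];
    // Naive split: assumes "'" terminates segments and never appears escaped as "?'".
    const segments = doc.split("'");
    for (let i = 0; i < segments.length; i++) {
        const segment = segments[i].trim();
        // Skip empty trailing pieces and the UNA service string advice.
        if (segment.length === 0 || segment.slice(0, 3) === "UNA") {
            continue;
        }
        // The segment name is everything before the first element separator.
        const name = segment.split("+")[0];
        if (!known.has(name) && missing.indexOf(name) === -1) {
            missing.push(name);
        }
    }
    return missing;
}
```

With a known set containing only UNH, BGM and DTM, a document holding GID, TMP and RNG segments would yield those three names.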

Regarding the encoding, EDIFACT supports different character sets. For the parser to know which characters are admissible, it needs to be told which encoding to use. Invalid characters should lead to a rejection of the document with a hint on the position. Currently, parsing with UNOA fails on the RNG+5+CEL:-5:5' segment, probably because the parser does not recognize the - character correctly. While the tests I've added to parser.spec suggest that the minus/hyphen symbol is handled correctly, the first example indicates a tokenizer issue that I have yet to investigate.
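
As a rough illustration of what a character set check does, the sketch below tests text against the UNOA (level A) repertoire: upper-case letters, digits, space and a fixed set of punctuation, which includes the hyphen/minus sign. This is a standalone sketch, not the tokenizer used by the library; UNOA_CHARS and firstInvalidUnoaIndex are hypothetical names, and line breaks are treated as invalid here for simplicity.

```typescript
// Characters admitted by UNOA (level A character set).
const UNOA_CHARS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 .,-()/='+:?!\"%&*;<>";

// Hypothetical helper (not a library API): returns the index of the first
// character outside UNOA, or -1 if every character is admissible.
function firstInvalidUnoaIndex(text: string): number {
    for (let i = 0; i < text.length; i++) {
        if (UNOA_CHARS.indexOf(text.charAt(i)) === -1) {
            return i;
        }
    }
    return -1;
}
```

Under this check the segment RNG+5+CEL:-5:5' is entirely admissible, which is consistent with the failure being a tokenizer issue rather than a character set restriction, while lower-case letters would be rejected at their position.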

RovoMe commented 4 years ago

@Misery42 I've now pushed v0.0.4, which should include the missing segment and element definitions as well as a fix for the negative number issue. I've also updated the README to showcase the usage of the Reader class, which should be a bit easier to deal with.

Misery42 commented 4 years ago

Thank you, I will try it out later.