aaronhuggins / node-x12

ASC X12 parser, generator, query engine, and mapper; now with support for streams.
https://aaronhuggins.github.io/node-x12/
MIT License
49 stars 14 forks source link

TypeError: Cannot read property 'range' of undefined when processing lots of EDI files in parallel #36

Open pomSense opened 2 years ago

pomSense commented 2 years ago

Description

Hey @aaronhuggins, first of all, amazing package. I've been following this and the progress of js-edi, and currently using node-x12 in production.

I was upgrading our current implementation of node-x12 to use a stream processor where we continuously stream process edi files received over a file server.

I am running into this strange error and was trying to see if you had any idea. When I am running 5 edi files, works fine. Then I double that to 10, using the same files, still good. I continue doubling to check the performance of the same files, and I start getting strange behavior after 30 files. Sometimes it works up to 150 files, but then getting this error randomly:

 currentElement.range.end = new Positioning_1.Position(l, c - 1);
                               ^
TypeError: Cannot read property 'range' of undefined
    at X12Parser._parseSegments (/Users/pom/Code/chaine/packages/pi-esg-ftp/.webpack/service/src/ingress-trigger/test-trigger.js:252310:32)
    at X12Parser._consumeChunk (/Users/pom/Code/chaine/packages/pi-esg-ftp/.webpack/service/src/ingress-trigger/test-trigger.js:252512:43)
    at X12Parser._transform (/Users/pom/Code/chaine/packages/pi-esg-ftp/.webpack/service/src/ingress-trigger/test-trigger.js:252539:14)
      at X12Parser.Transform._read (internal/streams/transform.js:205:10)
      at X12Parser.Transform._write (internal/streams/transform.js:193:12)

From the node_modules, it is this exact line:

            else if (edi[i] === segmentTerminator) {
             //  👇 trying to access currentElement but it is undefined
                currentElement.range.end = new Positioning_1.Position(l, c - 1); 
                currentSegment.elements.push(currentElement);
                if (currentSegment.tag === 'IEA' && currentSegment.elements.length === 2) {
                    currentSegment.elements[1].value = `${parseInt(currentSegment.elements[1].value, 10)}`;
                }
                currentSegment.range.end = new Positioning_1.Position(l, c);
                segments.push(currentSegment);
                currentSegment = new X12Segment_1.X12Segment();
                tagged = false;
                if (segmentTerminator === '\n') {
                    l++;
                    c = -1;
                }
                // element delimiter
            }

Looking at the source code, it seems currentSegment.range is throwing in _parseSegments, but doesn't make sense since it is instantiating new X12Segment() and within it new Range().

I was certain it is due to something I am doing wrong, but wanted to get your take on it.

Here is how the stream processor is implemented using Highland:

import _ from 'highland'

// takes an array of strings with filePaths on a file server
const ediProcessor = async(ediFilePaths: string[]) =>{
return _(ediFilePaths)
.through(getFile)
.through(convertEDIToObject({ parallel: 100 }))
.sequence()
.through(publishToEventBus)
}

Functions

import _ from 'highland'
import { X12Parser } from 'node-x12'

// Publishes an event with the result from the parser to an event bus
// Consumes the stream
const publishToEventBus = async()=> {
... // removed for brevity
}

// Gets the edi file from a file server, and returns a nodejs ServerResponse, in which I am returning only the IncomingMessage
const getFile = (filePath: string): IncomingMessage {
... // removed for brevity
}

const convertEDIToObject = ({
  parallel = 8
}) => {

  const parseEDI = (file: IncomingMessage) => {
    const parser = new X12Parser()
    const promise = new Promise((resolve, reject) => {
        _(file.Body)
          .through(parser)
          .toArray((data) => {
            // toArray consumes a stream
            let interchange: any = parser.getInterchangeFromSegments(data)

           // In some production EDI files, some people were sending blank new lines on the first line which was causing interchange.toJSON() to throw
            if (!interchange) return reject(_([{error: true, message: 'Bad EDI file format' }]))
            const {
              header,
              functionalGroups,
              options
            }: { header: X12Segment; functionalGroups: X12FunctionalGroup[]; options: X12SerializationOptions } =
              interchange.toJSON()

            const result = functionalGroups
              .map((group: X12FunctionalGroup) => {
                const functionalHeader = group.header
                return group.transactions.map((transaction: X12Transaction) => {
                  return {
                    result: {
                      transactionHeader: header,
                      functionalHeader: functionalHeader,
                      transaction,
                      options,
                      ediID: (transaction as any).header[0]
                    }
                  }
                })
              })
              .flat()
            resolve(_(result))
          })

    })
      .then((result) => {
        return result
      })
      .catch(e => console.log(e))
    return _(promise)
  }
  return (s: any) => s.map(parseEDI).parallel(parallel)
}

I've been beating my head at this for few days now and just wanted to see if you noticed anything off, or is there something with the X12Parser? Any help would be appreciated!

Thanks again so much for making this package!

aaronhuggins commented 2 years ago

First of all, sorry for the very late reply. X12 has ended up low priority due to some unexpected life circumstance stuff.

Second, this is REALLY weird for a bug.

I'm going to take a close look at the parser and see what's going on.

aaronhuggins commented 2 years ago

@pomSense Do you have an example of the dataset that's triggering the error? The parser is expecting to encounter an element delimiter to instantiate currentElement; if no delimiter is encountered, or the segemnt is malformed in some other way, then currentElement won't get assigned. This might explain the bug.

Regardless, I'm going to give this project a little love over the next month. js-edi is not mature enough and I'd like to get an X12 parser working on Deno.

aaronhuggins commented 2 years ago

Hey, @pomSense! Your issue might be related to this PR: #26

All code from the PR should be in the beta: npm install node-x12@beta

Please give that a try and let me know if it fixes your issue.

https://www.npmjs.com/package/node-x12/v/1.7.0-beta2

aaronhuggins commented 2 years ago

@pomSense The Deno migration may have uncovered the source of the bug you are reporting.

I think I need to change https://github.com/aaronhuggins/node-x12/blob/a1d2515faca8ae98c5a7d131f03fc04b43c33865/src/X12Parser.ts#L202-L205

to definitely assign currentElement, like:

    let currentSegment: X12Segment = new X12Segment()
    let currentElement: X12Element = new X12Element()

Looking at the original x12 codebase, I believe this would be as old as the library before I took it over.

Over the next week, I'll be further along enough to say for sure. If it gets fixed in the Deno migration effort, I'll back-port it to the 1.7.x release and publish before I merge the Deno changes.

pomSense commented 2 years ago

First of all, sorry for the very late reply. X12 has ended up low priority due to some unexpected life circumstance stuff.

Second, this is REALLY weird for a bug.

I'm going to take a close look at the parser and see what's going on.

Hey no problem man, life happens!

@pomSense Do you have an example of the dataset that's triggering the error?

The data was using the following 4 files repeated again and again up to 150+ times. These are variations of actual prod data (with sensitive info changed) that I was trying to ramp up to test the performance of the aforementioned stream processor.

Please give that a try and let me know if it fixes your issue.

https://www.npmjs.com/package/node-x12/v/1.7.0-beta2

Do you still want me to give this a try or wait for port back to 1.7.x?

Let me know and I'll try what you need to see if it works. Thanks again @aaronhuggins

aaronhuggins commented 2 years ago

Hey @pomSense! Got my attention while I'm kid-free for a few minutes!

Do you still want me to give this a try or wait for port back to 1.7.x?

I've decided not to backport, because there were a few bugs I caught during migration due to some of Deno's stricter TypeScript defaults. It was a little unwieldy trying to keep track of the changes for each file so that I could cherry-pick.

If you're up for it, I'm literally 5 minutes away from releasing another prerelease 1.8.0-1. I'd like it if you gave that a try to see if it fixes your issue.

Or maybe it blows up something for you! I'm not 100% sure yet. I'd like to get my test coverage a little higher before an official minor version bump.

aaronhuggins commented 2 years ago

Build action is running... image

aaronhuggins commented 2 years ago

Pre-released! image

pomSense commented 2 years ago

Hey @pomSense! Got my attention while I'm kid-free for a few minutes!

Do you still want me to give this a try or wait for port back to 1.7.x?

I've decided not to backport, because there were a few bugs I caught during migration due to some of Deno's stricter TypeScript defaults. It was a little unwieldy trying to keep track of the changes for each file so that I could cherry-pick.

If you're up for it, I'm literally 5 minutes away from releasing another prerelease 1.8.0-1. I'd like it if you gave that a try to see if it fixes your issue.

Or maybe it blows up something for you! I'm not 100% sure yet. I'd like to get my test coverage a little higher before an official minor version bump.

I'll give this a try later tonight. Thanks so much for getting this out. I'll run few scenarios as well