Parser does not return null value for some data instead returns value from the next segment

Mohmedvaid commented 3 years ago

Thank you for creating an awesome library. Its really great.

I am facing a little issue when parsing the following file. This file contains total 4 PO1 lines(separated by line break for better visual) I am trying to map the data from edi to a javascript object. Now, the third PO1 line does not have PID and Author name. Is there a way I can make parser return null value for that line?

My code---- const ediStream = fs.readFileSync(${__dirname}/${fileName}, 'utf8') const interchange = parser.parse(ediStream)

{ author: FOREACH(PO1)=>PO1-PO109:PO103['UN']}

interchange.functionalGroups.forEach(group => { group.transactions.forEach(transaction => { items.push(transaction.toObject(mapItems)) }) })

Expected results: [ { product_id: '9780446360265', author: 'NEUFELDT, VICTORIA' }, { product_id: '9780767928427', author: 'CLAPTON, ERIC' }, { product_id: '9780099599531', author: 'null' }, { product_id: '9780446360272', author: 'NOT APPLICABLE' } ]

Actual Results: [ { product_id: '9780446360265', author: 'NEUFELDT, VICTORIA' }, { product_id: '9780767928427', author: 'CLAPTON, ERIC' }, { product_id: '9780099599531', author: 'NOT APPLICABLE' }, { product_id: '9780446360272', author: 'FOREACH(PO1)=>PO109' } ]

File:

ISA00 00 ZZ1556150 ZZ123MILL 2010151534U004010000000050P>~ GSPR1556150123MILL2020101515345X004010~ ST8550005~ BAK00ADPOTEST111220201015**60169126*20201015~ CURBTUSD~ REFZZZORDER PROCESSED~ CSH**123TEST 0001~ N1STEDI TEST ACCOUNT-SUFFIX15123TEST 0001~ N3REGULAR EDI ACCOUNTATTN ACQ SERVS~ N37615 DISALLE BOULEVARD~ N4FORT WAYNEIN46825~

PO113UN4.5PEEN9780446360265B6NEUFELDT, VICTORIAB300B4P~ PIDFDIC WEBSTERS NEW WORLD~ ACKIA3*UN****BIACKAC~ SCH3UNWHMO08020041231*000~

PO122UN15.99PEEN9780767928427B6CLAPTON, ERICB300~ PIDFCLAPTON THE AUTOBIOGRAPHY~ ACKIR0*UN****BIACKCW~ SCH0UNWHMO08020071009*AD~

PO131UN0PEEN9780099599531~ ACKIR0UN****BIACKKK~ SCH0UNWHCO08020041231*****NFC~

PO142UN4.5PEEN9780446360272B6NOT APPLICABLEB300B4P~ PIDF****WEBSTERS NEW WORLD THESAURUS~

ACKIA2*UN****BIACKAC~ SCH2UNWHMO08020041231****000~ CTT45~ SE260005~ GE15~ IEA1*000000005~

ahuggins-nhs commented 3 years ago

That's really interesting behavior.

Not sure if it should deserialize to null or blank string.

The file got mangled in the copy paste. Would you mind adding it as an attached TXT file? IT should be reproducible with what you sent, and that should help track down the bug.

Makes me wonder if this is impacting our production use of the library.

Mohmedvaid commented 3 years ago

Absolutely, I have created a formatted the file and added line breaks so it readable. I have attached both formatted and non formatted file testFormatted.txt testFormatted.txt

ahuggins-nhs commented 3 years ago

@Mohmedvaid I think I've identified the issue, it is reproducible. Thinking through how to fix it.

ahuggins-nhs commented 3 years ago

@Mohmedvaid It occurs to me the element is actually missing, as in PO109:PO103['UN'] in that third occurrence there is no PO109 element. I'm not sure how that should be solved. If there were an empty PO109 element there, I think that it would deserve to be an undefined or empty string.

I could use help understanding what to expect here when the data element is completely missing from the document. It's simply not going to be queryable at that point, so we should discuss appropriate behavior of the element selector search implementation.

ahuggins-nhs commented 3 years ago

And it's a good thing to discuss, because this behavior could impact re-implementations of the element selector language, as in the js-edi project.

Mohmedvaid commented 3 years ago

Great! I would say returning null would be an option here when the data is missing here. e.g I would expected following code { author: FOREACH(PO1)=>PO109:PO103['UN']} should return [{author:A1},{author:A2}, {author:null}, {author:A4}] when the third PO1 line is missing 'Author'

In order to work around this issue i have implemented following code.

First i get all the products id NOTE: this is an example when theres only one document in the edi file. let getAllProductId = { product_id: 'FOREACH(PO1)=>PO107' } let allProducts = [] interchange.functionalGroups.forEach(group => { group.transactions.forEach(transaction => { allProducts.push(transaction.toObject(getAllProductId)) }) })

which return four objects pushed into allProducts array [{product_id:101}, {product_id:102}, {product_id:103}, {product_id:104}]

then for each product allProducts.forEach(doc => { doc.forEach(product => { mapProductData = { author: 'PO109:PO107['${product.product_id}']', } interchange.functionalGroups.forEach(group => { group.transactions.forEach(transaction => { let parsedData = transaction.toObject(mapProductData) products.push(parsedData) }) }) }) })

ahuggins-nhs commented 3 years ago

@Mohmedvaid That's a great work-around, nice code.

I'm not sure I would agree it should return anything. If I look to other selection languages, like XPath or CSS selectors, and query an XML or HTML document, I think I'd get a similar result to what's already happening with the element selector query.

Essentially what the query is saying is "For every PO1 segment, return me the existing element 09 where it's immediate sibling element 03 equals 'UN'". If we were querying a table and swapped the words for "row" and "column" or we did XML and swapped it for "node" and "value", given a completely missing value we would end up in the same place we are now, simply no data for the missing.

This is probably solvable if we can agree on what the algorithm should be. I believe the following questions need more input from other users, as this is potentially a breaking change.

Should we instead be returning results by segment instead of by element, regardless of existence, and marking them somehow?
Should this behavior apply only to transformation contexts and not to straight queries executed intentionally in code?
What should be returned in the place of a real value if no element exists?

My thoughts on # 1 are no, we should not be looking at results at the segment level. The element selector language, I think, should be just a language for selecting elements. For # 2, I think signaling that transformations are different and have different considerations than just a single, straight query could be a graceful way to handle it.

Regarding # 3, I'm also not happy with null in general. I'm super-biased on this, I think null is a garbage value that has infected JavaScript APIs and made it confusing for developers, especially when it comes to Web APIs. It's an object, but it's not, TypeScript depending on settings is totally okay with null and undefined being the same thing, but hey, you can't guarantee type safety with null. Wherever I can return undefined instead of null, I do. If no real value exists, I would be happier with returning any other suggestion than null because of my biases. Sorry for the rant.

Mohmedvaid commented 3 years ago

Apologies on the late reply

Totally makes sense to returning result based on elements as you have provided a good point
For the straight queries, I think the current behavior should be fine as it returns null if something is missing but could be updated to return undefined
and yes I agree, null will be polluting the results. undefined can be a better option here.

ahuggins-nhs commented 3 years ago

@Mohmedvaid Nope, that's not a late reply. 😄 THIS is a late reply. I'm working on a fix.

ahuggins-nhs commented 3 years ago

@Mohmedvaid I think I got it fixed in a new PR. I've decided to pass an empty string because it was easier that way. Not sure if I'll do that exact thing over in js-edi, but I probably will as it actually works pretty nicely.

ahuggins-nhs commented 3 years ago

@Mohmedvaid you can also try it out to be sure, I published it as a beta. https://www.npmjs.com/package/node-x12/v/beta

Mohmedvaid commented 3 years ago

Thank you @ahuggins-nhs! Its looking great. I tested on my end and the empty strings are looking a little better then the query string that was passed in. Great work!

aaronhuggins commented 3 years ago

@mohmedvaid may I include the EDI doc you provided as part of the test suite under the license for this library?

Mohmedvaid commented 3 years ago

definitely go for it.

aaronhuggins / node-x12

Parser does not return null value for some data instead returns value from the next segment #23