jf-tech / omniparser

omniparser: a native Golang ETL streaming parser and transform library for CSV, JSON, XML, EDI, text, etc.
MIT License

EDI Parsing #190

Closed bjordan2010 closed 1 year ago

bjordan2010 commented 1 year ago

Hello. I have a particular EDI case that, if resolvable, will add another unique example to your docs. In the example file below, a unique entity runs from the INS segment to the DTP segment(s). Therefore, there are 2 unique entities below.

There can be many DTP segments, and they are usually the last segments in the entity, after the HD segment. However, as the first entity in the example below shows, there is an extra DTP segment after the REF*1L segment. Is it possible to define a schema with an optional DTP segment (0 or 1) after REF*1L and also DTP segment(s) (0 or more) following the HD segment?

For the DTP after REF*0F I use { "name": "DTP", "min": 0, "max": 1 }. For the DTP after HD I use the declaration below. But I get the error: Error: input '' at segment no.1 (char[1,1]): segment 'ISA' needs min occur 1, but only got 0. If I remove the first DTP segment from the schema and remove the first entity from the input, then it works. So it doesn't like it when that first DTP segment is missing, even though I specified min 0 and max 1.

{
    "name": "DTP",
    "min": 0,
    "max": -1,
    "elements": [
        {
            "name": "dateTimeQualifier", "index": 1
        },
        {
            "name": "dateTimePeriod",
            "index": 3
        }
    ]
}
ISA*00*          *00*          *30*Entitled FTI     *30*631157085      *200214*0933*^*00501*000000001*1*P*:~
GS*BE*Entitled FTI*631157085*20200214*09333565*1*X*005010X220A1~
ST*834*0001*005010X220A1~BGN*00*ENTITLED LLC*20200214*09333595****4~
REF*38*00000~DTP*007*D8*20200214~N1*P5*ENTITLED LLC*FI*770594061~
N1*IN*Prescription Benefits Inc*FI*631157085~
INS*Y*18*030*XN*A***FT~
REF*0F*555446666~
REF*1L*B196~
DTP*336*D8*20131210~
NM1*IL*1*James*Fredericks*G***34*555446666~
PER*IP**CP*9183333339~
N3*14752 Zoo Avenue~
N4*Lincolnville*OK*76234~
DMG*D8*20000101*M*S~
HD*030**PDG*ENTE0100*EMP~
DTP*348*D8*20200101~
INS*N*19*030*XN*C****N*N~
REF*0F*555446666~
REF*1L*B196~
NM1*IL*1*Jackson*Pollock*S***34*123446666~
N3*142 Bumpkin Road~
N4*Lakeville*SD*79004~
DMG*D8*20000101*F~
HD*030**PDG*ENTA8888~
DTP*348*D8*20200401~
DTP*344*D8*20200401~
jf-tech commented 1 year ago
  1. Can you either post a complete input and your current testing schema here, or email them to me at jf.tech.llc at gmail? It'd be a lot easier for me to debug and make suggestions.

  2. I don't know your specific EDI specification and its hierarchical and loop structure, so it's a bit of a guessing game and hard to be concrete. But from what you said:

    a) After INS there are multiple REF segments. If a particular REF has a value of 1L (REF*1L), it can optionally be followed by a DTP segment.
    b) After the HD segment, there can be zero or more DTP segments.

    My questions are:
    1) Are those two REFs (one with 0F and one with 1L) optional or mandatory? Can these REFs repeat beyond the two instances (0F and 1L)?
    2) Is the HD segment mandatory (min=max=1) or optional?
    3) Are any of the segments after REF/DTP and before HD mandatory, such as NM1, N3, DMG, etc.?
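
    For reference, the layout described in a) and b) can be sketched as an ordered list of declarations with min/max occurrence counts. The Go snippet below is a toy greedy matcher, not omniparser's implementation; the decl type, the match and memberDecls functions, and the min/max values for every segment other than the two DTP declarations are assumptions, since the questions above are still open.

```go
package main

import "fmt"

// decl is a hypothetical stand-in for one segment declaration: a segment
// name plus min/max occurrence counts (max -1 means unbounded).
type decl struct {
	name     string
	min, max int
}

// match walks decls in order, greedily consuming consecutive segments whose
// name equals the declaration's name. It returns how many segments were
// consumed, or an error when a declaration's min count is not met.
func match(decls []decl, segs []string) (int, error) {
	pos := 0
	for _, d := range decls {
		n := 0
		for pos < len(segs) && segs[pos] == d.name && (d.max < 0 || n < d.max) {
			pos++
			n++
		}
		if n < d.min {
			return pos, fmt.Errorf("segment '%s' needs min occur %d, but only got %d", d.name, d.min, n)
		}
	}
	return pos, nil
}

// memberDecls returns the assumed layout of one INS entity.
func memberDecls() []decl {
	return []decl{
		{"INS", 1, 1},
		{"REF", 0, -1},
		{"DTP", 0, 1}, // optional DTP after the REF segments
		{"NM1", 0, 1}, {"PER", 0, 1}, {"N3", 0, 1}, {"N4", 0, 1}, {"DMG", 0, 1},
		{"HD", 0, 1},
		{"DTP", 0, -1}, // zero or more DTP after HD
	}
}

func main() {
	// Segment names of the two entities in the sample input.
	first := []string{"INS", "REF", "REF", "DTP", "NM1", "PER", "N3", "N4", "DMG", "HD", "DTP"}
	second := []string{"INS", "REF", "REF", "NM1", "N3", "N4", "DMG", "HD", "DTP", "DTP"}
	for _, segs := range [][]string{first, second} {
		n, err := match(memberDecls(), segs)
		fmt.Println(n == len(segs), err)
	}
}
```

    Under these assumptions both entities are consumed in full, which suggests that two DTP declarations in one entity are not contradictory in themselves as long as another segment type sits between them.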

jf-tech commented 1 year ago

We're following up offline, but want to post progress/updates here for future reference:

Issues we've identified so far:

Issue remaining to be clarified/resolved:

jf-tech commented 1 year ago

Final update: the last issue has been resolved. It turns out we only need to extract date data from the HD/DTP segment whose index 1 element value is 348, so a simple xpath query, HD/DTP[qualifier = 348]/date, returns the unique segment for extraction.