imsweb / x12-parser

A Java parser for ANSI ASC X12 documents.

Unable to parse huge x12 documents #43

Open nddipiazza opened 3 years ago

nddipiazza commented 3 years ago

What are the biggest files you have been able to parse with this parser?

I need to parse 837 files with thousands of claims in them.

To get unblocked, I added a split837 method to the X12Reader that does a map-reduce-style pass over the huge X12 file and splits it into chunks. I split at the child loops of the DETAIL loop.

Once the 837 is split into chunks, I just operate on those chunks separately using the normal parse method.
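For anyone finding this later, here is a minimal sketch of what such a splitter could look like. This is not the library's API: the `split837` name, the chunk size parameter, and cutting at billing-provider `HL` loops (HL03 = 20) are my own assumptions based on the description above. A real implementation would also need to rewrite the SE/GE/IEA segment counts and control numbers in each chunk, which is omitted here.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class X12Splitter {

    /**
     * Hypothetical splitter along the lines described above: cuts the 837 at
     * billing-provider HL loops (HL03 = 20) so each subtree stays whole, and
     * copies the envelope header and trailer segments into every chunk.
     * Rewriting SE/GE/IEA counts and control numbers is left out for brevity.
     */
    public static List<String> split837(Path file, int maxSubtreesPerChunk) throws IOException {
        // assumes '~' segment terminators and '*' element separators
        String[] segments = Files.readString(file).split("~");

        List<String> header = new ArrayList<>();
        List<String> trailer = new ArrayList<>();
        List<List<String>> subtrees = new ArrayList<>();
        List<String> current = null;

        for (String raw : segments) {
            String seg = raw.trim();
            if (seg.isEmpty())
                continue;
            String[] elements = seg.split("\\*", -1);
            String id = elements[0];
            if ("SE".equals(id) || "GE".equals(id) || "IEA".equals(id)) {
                trailer.add(seg);            // envelope trailers, copied into every chunk
            } else if ("HL".equals(id) && elements.length > 3 && "20".equals(elements[3])) {
                current = new ArrayList<>(); // start of a new billing-provider subtree
                current.add(seg);
                subtrees.add(current);
            } else if (current == null) {
                header.add(seg);             // ISA/GS/ST/BHT etc. before the first HL
            } else {
                current.add(seg);            // segment belongs to the current subtree
            }
        }

        // assemble chunks: header + up to maxSubtreesPerChunk subtrees + trailer
        List<String> chunks = new ArrayList<>();
        for (int i = 0; i < subtrees.size(); i += maxSubtreesPerChunk) {
            StringBuilder sb = new StringBuilder();
            header.forEach(s -> sb.append(s).append('~'));
            for (List<String> tree : subtrees.subList(i, Math.min(i + maxSubtreesPerChunk, subtrees.size())))
                tree.forEach(s -> sb.append(s).append('~'));
            trailer.forEach(s -> sb.append(s).append('~'));
            chunks.add(sb.toString());
        }
        return chunks;
    }
}
```

Each chunk can then be written to a temp file (or wrapped in a Reader, if your X12Reader version accepts one) and handed to the normal parse path.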

Anyone else have anything similar they had to do?

angelaszek commented 3 years ago

We haven't run into this issue since we don't process huge X12 documents. Handling large files better is something we would like to look into implementing. Unfortunately, it will be a little while before I will be able to work on that.

nddipiazza commented 3 years ago

OK, I have a solution in place for the time being, but it is specific to 837s. It finds the detail loop, which is the loop that can have thousands of child loops, and splits the file into smaller files. Then you can use the normal parse on the smaller files, and when you are done processing them you can reassemble the results if you need to.

Could also use a MapDB disk-based map instead of the heap data structures, but that will be slower.
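For reference, a minimal sketch of the MapDB idea, assuming MapDB 3.x; the file name, map name, and keying by claim ID are illustrative assumptions, not anything from this library.

```java
import org.mapdb.DB;
import org.mapdb.DBMaker;
import org.mapdb.HTreeMap;
import org.mapdb.Serializer;

public class DiskBackedClaims {
    public static void main(String[] args) {
        // disk-backed store so the parsed claim data does not have to fit on the heap
        DB db = DBMaker.fileDB("claims.db")
                .fileMmapEnableIfSupported() // memory-mapped I/O where the platform allows it
                .make();

        // map of claim ID -> raw claim segments, spilled to disk by MapDB
        HTreeMap<String, String> claims = db
                .hashMap("claims", Serializer.STRING, Serializer.STRING)
                .createOrOpen();

        claims.put("CLM-0001", "CLM*ABC123*500***11:B:1~"); // illustrative entry
        System.out.println(claims.get("CLM-0001"));

        db.close();
    }
}
```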

bitbythecron commented 11 months ago

> OK, I have a solution in place for the time being, but it is specific to 837s. It finds the detail loop, which is the loop that can have thousands of child loops, and splits the file into smaller files. Then you can use the normal parse on the smaller files, and when you are done processing them you can reassemble the results if you need to.
>
> Could also use a MapDB disk-based map instead of the heap data structures, but that will be slower.

Hi @nddipiazza, just curious: what size documents were you processing when you ran into this issue? What errors were you seeing? And do you remember any details of how you chunked the files into multiple smaller ones? Thanks in advance for any and all help here!

thesammiller commented 6 months ago

Hello @bitbythecron, were you able to learn anything about this issue? I am processing 837s with multiple claims in each file, but I am only getting one claim loop per file, and I am otherwise seeing errors. Were you able to make any progress on chunking?
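Not one of the original posters, but one way to sanity-check how many 2300 (claim-level) loops the parser actually produced is a recursive walk over the loop tree. This is a sketch only: the package names, `FileType` constant, and `getLoops()`/`getId()` methods are my reading of the project README, so adjust to the version you are on.

```java
import java.io.File;
import java.util.List;

import com.imsweb.x12.Loop;
import com.imsweb.x12.reader.X12Reader;

public class ClaimLoopCounter {

    public static void main(String[] args) throws Exception {
        // parse the file the normal way (constructor per the project README)
        X12Reader reader = new X12Reader(X12Reader.FileType.ANSI837_5010_X222,
                new File(args[0]));

        int claims = 0;
        for (Loop loop : reader.getLoops())
            claims += countLoops(loop, "2300"); // 2300 = claim-level loop in an 837

        System.out.println("claim (2300) loops found: " + claims);
    }

    // recursively count loops with the given ID anywhere under a parent loop
    private static int countLoops(Loop loop, String id) {
        int count = id.equals(loop.getId()) ? 1 : 0;
        for (Loop child : loop.getLoops())
            count += countLoops(child, id);
        return count;
    }
}
```

If this reports one claim when you expect many, the file may be failing schema validation partway through, in which case the reader's error/fatal-error lists are worth checking.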