crs4 / hl7apy

Python library to parse, create and handle HL7 v2 messages.
http://crs4.github.io/hl7apy/
MIT License
215 stars, 85 forks

Question: How to deal with MSH missing delimiters #109

Closed: 1hamzaiqbal closed this issue 1 year ago

1hamzaiqbal commented 1 year ago

Here is the example message

message = """MSH|^~\&|SendApp|SendFac|RecApp|RecFac|20050121155132||ORU^R01|3-1|P|2.7 PID|1||007801|123456789|Last Name^First Name^I||19611215|F|||Address 1^Address 2^City^CA^91355||8007060266|||||Account#"""

Error message:

Traceback (most recent call last):
  File "//hl7_convert/hl7_parse_test.py", line 39, in <module>
    m = parser.parse_message(contents)
  File "/hl7_convert/hparse/lib/python3.9/site-packages/hl7apy/parser.py", line 81, in parse_message
    m = Message(name=message_structure, reference=reference, version=version,
  File "/hl7_convert/hparse/lib/python3.9/site-packages/hl7apy/core.py", line 1919, in __init__
    super(Message, self).__init__(name, None, reference, version,
  File "/hl7_convert/hparse/lib/python3.9/site-packages/hl7apy/core.py", line 1790, in __init__
    super(Group, self).__init__(name, parent, reference, version, validation_level, traversal_parent)
  File "/hl7_convert/hparse/lib/python3.9/site-packages/hl7apy/core.py", line 630, in __init__
    check_version(version)
  File "/hl7_convert/hparse/lib/python3.9/site-packages/hl7apy/__init__.py", line 90, in check_version
    raise UnsupportedVersion(version)
hl7apy.exceptions.UnsupportedVersion: The version 2.7 PID is not supported

By adding pipe delimiters to the end of the MSH segment (below)

message = """MSH|^~\&|SendApp|SendFac|RecApp|RecFac|20050121155132||ORU^R01|3-1|P|2.7|||| PID|1||007801|123456789|Last Name^First Name^I||19611215|F|||Address 1^Address 2^City^CA^91355||8007060266|||||Account#"""

The message is parsed correctly. How do I get the PID segment parsed the way I want, ignoring the trailing MSH fields that we aren't provided and that have no delimiters? Should I handle this with a message profile? The MSH segment is the only one that doesn't seem to conform to the default HL7 2.7 message format.

svituz commented 1 year ago

Hi @1hamzaiqbal,

TL;DR: you need to separate the segments with the correct separator (\r).

message = """MSH|^~\&|SendApp|SendFac|RecApp|RecFac|20050121155132||ORU^R01|3-1|P|2.7\rPID|1||007801|123456789|Last Name^First Name^I||19611215|F|||Address 1^Address 2^City^CA^91355||8007060266|||||Account#"""

Long explanation: In the first example, an exception is thrown because you're assigning an illegal value (2.7 PID) to the MSH.12 (version) field, which is not a valid version value (2.1, 2.2, 2.7, etc.).

In the second example, you don't get the same error because you added field separators, so the version value 2.7 is OK, but the behavior is not what you expect. You're not creating a PID segment: the message contains only the MSH segment, with the value " PID" assigned to MSH.16.

Indeed, if you run:

from hl7apy.parser import parse_message

message = """MSH|^~\&|SendApp|SendFac|RecApp|RecFac|20050121155132||ORU^R01|3-1|P|2.7|||| PID|1||007801|123456789|Last Name^First Name^I||19611215|F|||Address 1^Address 2^City^CA^91355||8007060266|||||Account#"""
m = parse_message(message)
print(m.children)
print(m.msh.msh_16.value)

you get

[<Segment MSH>]
 PID
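The field positions described above can be checked by hand, without the library. The following is only a rough sketch using plain string splitting, not how hl7apy parses internally; `msh_field` is a hypothetical helper, and the PID segment is shortened for illustration.

```python
def msh_field(message: str, n: int) -> str:
    """Return MSH-n from a raw pipe-delimited string (hypothetical helper)."""
    first_segment = message.split("\r")[0]  # segments end with a carriage return
    field_sep = first_segment[3]            # MSH-1 is the character right after "MSH"
    if n == 1:
        return field_sep
    parts = first_segment.split(field_sep)
    # parts[0] is "MSH" and parts[1] is MSH-2, so MSH-n sits at parts[n - 1]
    return parts[n - 1] if n - 1 < len(parts) else ""


header = "MSH|^~\\&|SendApp|SendFac|RecApp|RecFac|20050121155132||ORU^R01|3-1|P|2.7"
pid = "PID|1||007801"  # shortened PID segment, for illustration only

no_separator = header + " " + pid   # first example: a space instead of \r
padded = header + "|||| " + pid     # second example: extra pipes added
with_cr = header + "\r" + pid       # segments separated by \r

print(repr(msh_field(no_separator, 12)))  # '2.7 PID'
print(repr(msh_field(padded, 16)))        # ' PID'
print(repr(msh_field(with_cr, 12)))       # '2.7'
```

Without the \r, everything up to the next pipe belongs to the MSH segment, which is why "PID" ends up inside an MSH field in both of the first two examples.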

Correct version:

message = """MSH|^~\&|SendApp|SendFac|RecApp|RecFac|20050121155132||ORU^R01|3-1|P|2.7\rPID|1||007801|123456789|Last Name^First Name^I||19611215|F|||Address 1^Address 2^City^CA^91355||8007060266|||||Account#"""
m = parse_message(message)
print(m.children)
print(m.ORU_R01_PATIENT_RESULT.ORU_R01_PATIENT.PID.value)

you get

[<Segment MSH>, <Group ORU_R01_PATIENT_RESULT>]
PID|1||007801|123456789|Last Name^First Name^I||19611215|F|||Address 1^Address 2^City^CA^91355||8007060266|||||Account#

1hamzaiqbal commented 1 year ago

That makes a lot of sense. What I hadn't understood is that segments must be separated by \r when an HL7 message is represented as a string, and that the same separators must be added when reading an HL7 v2.x message from a (.txt) file.

For example, reading chem_sample.txt like this throws the same error I had before, when I hadn't added the \r:

with open('chem_sample.txt') as f:
    contents = f.read()

but this works in the same way as you had represented in the correct string representation:

with open('chem_sample.txt', 'r') as file:
    lines = file.readlines()
    contents = '\r'.join(lines)

I know this works within the example case that I have, but is this the correct way to read in txt files that I want to parse? Or am I overcomplicating this?

Sample file: chem_sample.txt
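One detail of the `readlines()` approach is worth seeing in isolation: each line keeps its trailing "\n", so the joined string contains "\n\r" pairs, not bare "\r" separators. A small sketch, using an in-memory stand-in for the file with made-up segment content:

```python
import io

# Stand-in for the open file; the two short segments are made up for illustration.
fake_file = io.StringIO("MSH|1\nPID|1\n")

lines = fake_file.readlines()   # each element keeps its trailing "\n"
joined = "\r".join(lines)
print(repr(joined))             # 'MSH|1\n\rPID|1\n'

# Stripping the line ending first leaves only the carriage returns.
clean = "\r".join(line.rstrip("\r\n") for line in lines)
print(repr(clean))              # 'MSH|1\rPID|1'
```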

svituz commented 1 year ago

Either that or

with open('chem_sample.txt', 'r') as file:
    m = file.read()
    # str.replace returns a new string, so the result must be assigned back
    m = m.replace('\r\n', '\r').replace('\n', '\r')

but yes, that's the correct way
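Both approaches in this thread can be folded into one small helper. This is a minimal sketch using only the standard library; `normalize_segment_separators` and `read_hl7_file` are hypothetical names, and dropping blank lines is an assumption about what the input file looks like.

```python
import re


def normalize_segment_separators(text: str) -> str:
    """Turn every line ending (\\r\\n, \\n or \\r) into a single carriage return."""
    lines = re.split(r"\r\n|\n|\r", text)
    # Assumption: blank lines carry no segment and can be dropped.
    return "\r".join(line for line in lines if line.strip())


def read_hl7_file(path: str) -> str:
    """Read a text file and return its content with \\r between segments."""
    # newline="" stops Python from translating line endings while reading.
    with open(path, "r", newline="") as f:
        return normalize_segment_separators(f.read())
```

The returned string can then be passed to `parse_message`, as in the examples earlier in the thread.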