crs4 / hl7apy

Python library to parse, create and handle HL7 v2 messages.
http://crs4.github.io/hl7apy/
MIT License

Support of Escape Sequences IN TEXT FIELDS (Formatting codes, Formatted text, multiple character sets, Special character etc) #96

Open shamim40 opened 2 years ago

shamim40 commented 2 years ago

Does hl7apy support handling of escape sequences in text fields? If yes, please point me to the reference docs. I am working on an ORU^R01 HL7 message to generate narrative text to include in a DICOM SR (Structured Report).

Below are the descriptions as per the HL7 standard, Chapter 2 (section 2.7).

Formatted text:

.sp <number> End current output line and skip <number> vertical spaces. <number> is a positive integer or absent.

.br Begin new output line. Set the horizontal position to the current left margin and increment the vertical position by 1.

.fi Begin word wrap or fill mode. This is the default state. It can be changed to a no-wrap mode using the .nf command.

.nf Begin no-wrap mode.

.in <number> Indent <number> of spaces, where <number> is a positive or negative integer. This command cannot appear after the first printable character of a line.

.ti <number> Temporarily indent <number> of spaces, where <number> is a positive or negative integer. This command cannot appear after the first printable character of a line.

.sk <number> Skip <number> spaces to the right.

.ce End current output line and center the next line.

Formatting codes:

\H\ start highlighting
\N\ normal text (end highlighting)
\F\ field separator
\S\ component separator
\T\ subcomponent separator
\R\ repetition separator
\E\ escape character
\Xdddd...\ hexadecimal data
\Zdddd...\ locally defined escape sequence

Escape sequences supporting multiple character sets:

\Cxxyy\ single-byte character set escape sequence with two hexadecimal values, xx and yy, that indicate the escape sequence defined for one of the character repertoires supported for the current message (i.e., ISO-IR xxx).
\Mxxyyzz\ multi-byte character set escape sequence with three hexadecimal values, xx, yy and zz. zz is optional.

svituz commented 2 years ago

Hi @shamim40, hl7apy escapes special HL7 characters when your value includes them. It also supports text highlighting.

Here are some examples:

from hl7apy.core import Field

f = Field("ODT_3")
f.value = "|important value" # escapes field separator
f.to_er7() 
>>> '\\F\\important value'

f = Field("ODT_3")
f.value = "^important value" # escapes component separator
f.to_er7() 
>>> '\\S\\important value'

f = Field("ODT_3")
f.value = "&important value" # escapes subcomponent separator
f.to_er7() 
>>> '\\T\\important value'

f = Field("ODT_3")
f.value = "~important value" # escapes repetition separator
f.to_er7() 
>>> '\\R\\important value'

f = Field("ODT_3")
f.value = "#important value" # escapes truncation separator
f.to_er7() 
>>> '\\L\\important value'

As mentioned, it also supports highlighting, like this:

f = Field("ODT_3")
f.value = ST('HIGHLIGHTEDTEXTIMPORTANT', highlights=((0, 11), (18, 24)))
f.to_er7() 
>>> '\\H\\HIGHLIGHTED\\N\\TEXTIMP\\H\\ORTANT\\N\\'

Formatting sequences such as .sp and .nf are simply treated as part of the string:

f = Field("ODT_3")
f.value = '.br new line .sp '
f.to_er7() 
>>> '.br new line .sp '

Currently, it does not support the other escape sequences; their backslashes are escaped like any other escape character:

f = Field("ODT_3")
f.value = '\\Xdddd\\'
f.to_er7()
>>> '\\E\\Xdddd\\E\\'

Formatting the strings for rendering, according to the Chapter 2 specification, is the responsibility of your application.
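
As a rough illustration of what that application-side handling could look like, here is a minimal sketch using only the Python standard library. It is not part of hl7apy: the names `render_plain_text` and `decode_escape` are hypothetical, the delimiters are assumed to be the HL7 defaults, formatting commands are assumed to arrive wrapped in the escape character (for example `\.br\`), and hexadecimal data is assumed to be Latin-1. Only a few sequences are handled; anything else is left untouched.

```python
import re

# Assumption: default HL7 v2 delimiters. A real message declares its own in MSH-1/MSH-2.
DELIMITER_ESCAPES = {"F": "|", "S": "^", "T": "&", "R": "~", "E": "\\"}

# Matches one escape sequence: escape char, body without escape char, escape char.
ESCAPE_RE = re.compile(r"\\([^\\]*)\\")


def decode_escape(match):
    """Translate a single escape sequence into plain text (hypothetical helper)."""
    body = match.group(1)
    if body in DELIMITER_ESCAPES:
        return DELIMITER_ESCAPES[body]
    if body in ("H", "N"):
        # Highlighting has no plain-text equivalent, so it is dropped here.
        return ""
    if body == ".br":
        return "\n"
    if body.startswith(".sp"):
        count = body[3:].strip()
        # End the current line, then skip <number> lines (one if absent).
        return "\n" * (1 + (int(count) if count.isdigit() else 1))
    if body.startswith("X"):
        try:
            # Assumption: the hex payload is Latin-1 encoded text.
            return bytes.fromhex(body[1:]).decode("latin-1")
        except ValueError:
            return match.group(0)
    # Unknown or unsupported sequence (.in, .ti, \Z..\, \C..\, \M..\): keep as-is.
    return match.group(0)


def render_plain_text(value):
    """Replace the supported escape sequences in an ER7-encoded field value."""
    return ESCAPE_RE.sub(decode_escape, value)


print(render_plain_text(r"Finding\.br\Size 3\S\4 mm \H\urgent\N\ review"))
```

Because `re.sub` scans left to right and never rescans its own output, an escaped escape character (`\E\`) does not accidentally start a new sequence.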

Hope that helps

alexyuisingwu commented 2 years ago

I'm not sure if something's changed since this comment, but it looks like &, \n, and \r are no longer escaped.

For example, I tried running the example you provided above, and got this:

f = Field("ODT_3")
f.value = "&important value" # escapes subcomponent separator
f.to_er7() 
>>> '&important value'

Testing a few other special characters shows that while some are escaped, &, \n, and \r are not:

b = Component()
b.value = 'a|b~c\\d^e&f\ng\rh\fi\\'
b.to_er7()
>>> 'a\\F\\b\\R\\c\\E\\d\\S\\e&f\ng\rh\fi\\E\\'

svituz commented 2 years ago

@alexyuisingwu It seems you're right, and this is probably also related to #100. I'll need to figure out what's happening.

alexyuisingwu commented 2 years ago

@svituz Thanks! If it helps, I discovered the issue while trying to add a component that happened to contain an & character. I got an error about adding too many children, and it looked like the component value was being parsed into subcomponents delimited by the & rather than treated as text.
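
To illustrate why the escaping matters here, the following is a minimal standard-library sketch, not hl7apy's implementation. The helper name `escape_er7` is hypothetical and the delimiters are assumed to be the HL7 defaults. It shows that a raw `&` splits the value into two subcomponents, while the escaped form stays in one piece.

```python
# Assumption: default HL7 v2 delimiters and escape character.
ESCAPES = {
    "\\": "\\E\\",  # escape character
    "|": "\\F\\",   # field separator
    "^": "\\S\\",   # component separator
    "&": "\\T\\",   # subcomponent separator
    "~": "\\R\\",   # repetition separator
}


def escape_er7(text):
    """Escape delimiters one character at a time (hypothetical helper).

    Mapping per character avoids re-escaping the backslashes that the
    replacements themselves introduce.
    """
    return "".join(ESCAPES.get(ch, ch) for ch in text)


raw = "a&b"
escaped = escape_er7(raw)

print(raw.split("&"))      # ['a', 'b']
print(escaped.split("&"))  # ['a\\T\\b']
```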

alexyuisingwu commented 2 years ago

@svituz The error in question:


from hl7apy import VALIDATION_LEVEL
from hl7apy.core import Message, Component, Field

m = Message('SIU_S12', validation_level=VALIDATION_LEVEL.STRICT)
m.sch.sch_7.sch_7_1 = 'a&b'

hl7apy.exceptions.MaxChildLimitReached: Cannot add <SubComponent ST>: max limit (1) reached for <Component CE_1 (IDENTIFIER) of type ST>