Open bradenneal1 opened 4 years ago
I'm not sure about this one. Is there somewhere in the spec that says it's OK to have newlines in these locations and not others? Your suggested changes are simple enough, and making them does indeed mean that you can parse a message with newlines in it, but I'm not clear on whether the MT103 message in question is valid with newlines in it, or whether your suggested placements for newlines cover all the cases where this would be a problem. Do you have a spec I can reference for confirmation?
I ask because the placement of the \s* bits seems strangely arbitrary. You've got one after every section except 5, and they only appear after a header but not between sections.
If this is valid:
{1:F01ASDFJK20AXXX0987654321}
{2:I103ASDFJK22XXXXN}
{4: :20:20180101-ABCDEF :23B:GHIJ :32A:180117CAD5432,1 :33B:EUR9999,0 :50K:/123456-75901 SOMEWHERE New York 999999 GR :53B:/20100213012345 :57C://SC200123 :59:/201001020 First Name Last Name a12345bc6d789ef01a23 Nowhere NL :70:test reference test reason payment group: 1234567-ABCDEF :71A:SHA :77B:Test this
-}
Is this not?
{1:F01ASDFJK20AXXX0987654321}
{2:I103ASDFJK22XXXXN}
{
4: :20:20180101-ABCDEF :23B:GHIJ :32A:180117CAD5432,1 :33B:EUR9999,0 :50K:/123456-75901 SOMEWHERE New York 999999 GR :53B:/20100213012345 :57C://SC200123 :59:/201001020 First Name Last Name a12345bc6d789ef01a23 Nowhere NL :70:test reference test reason payment group: 1234567-ABCDEF :71A:SHA :77B:Test this
-}
Might it be better to just call message.replace("\n", "") before parsing it, or is that likely to break things elsewhere? Until I'm certain, I'm not keen on making this change. If you have something I can reference to be sure, that'd go a long way toward helping me figure this out.
I don't have a specification to provide unfortunately.
I was initially using message.replace("\n", ""), but came unstuck when parsing tags that contain more than one component. For example, if the above message were formatted as:
{1:F01ASDFJK20AXXX0987654321}
{2:I103ASDFJK22XXXXN}
{4:
:20:20180101-ABCDEF
:23B:GHIJ
:32A:180117CAD5432,1
:33B:EUR9999,0
:50K:/123456-75901
SOMEWHERE
New York
999999
GR
:53B:/2010021301234
:57C://SC200123
:59:/201001020
First Name Last Name
a12345bc6d789ef01a23
Nowhere
NL
:70:test reference
test reason
payment group:
1234567-ABCDEF
:71A:SHA
:77B:Test this
-}
Both the 50K and 59 tags follow a format of Account, Name1, Name2, Address, City/Postal Code. With the newline characters removed, there is no way to determine where "Account" finishes and "Name1" starts, etc. Keeping the newlines (and making the parser newline-insensitive) allows message.ordering_customer.split('\n') to identify the individual components.
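To illustrate the point about losing component boundaries, here is a minimal sketch using the 50K field body from the example message. The helper name split_components is hypothetical and not part of any library; the sketch only assumes the field body arrives as a single string with its line breaks intact.

```python
# Field 50K body from the example message, with line breaks preserved.
ordering_customer = "/123456-75901\nSOMEWHERE\nNew York\n999999\nGR"


def split_components(field_body):
    """Hypothetical helper: one component per non-empty line."""
    return [line.strip() for line in field_body.splitlines() if line.strip()]


components = split_components(ordering_customer)
account, details = components[0], components[1:]

# Stripping newlines first merges every component into one string,
# so the boundaries between them can no longer be recovered.
flattened = ordering_customer.replace("\n", "")

print(account)    # the account line
print(details)    # the remaining name/address lines
print(flattened)  # one undifferentiated string
```

With the newlines kept, each component can be addressed by position; once they are removed, there is nothing left to split on.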
Regarding "You've got one after every section except 5": that's an oversight on my part. I would consider a message with trailing whitespace still valid (I have simply never seen one).
Alright, I've had a conversation with some more financially minded people (as opposed to software people like me), and it looks like line breaks are common in a message, so I'm going to make this change.
Do you perhaps have a few test messages I can use to ensure that everything works as expected? All of the messages I have access to have no line breaks.
The regular expression MESSAGE_REGEX does not allow whitespace (or newlines) between each header. For example, if the test MESSAGE_1 is defined as:
It does not parse:
Redefining the regex to accept whitespace characters between headers:
solves the issue.
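As a rough illustration of the idea (not the library's actual MESSAGE_REGEX, which is not reproduced in this thread), the sketch below uses a simplified, hypothetical pattern named SKETCH_REGEX that tolerates whitespace after every block, including block 5, and inside the start and end of block 4. The group names and the block structure are assumptions made for the example.

```python
import re

# Hypothetical, simplified pattern: NOT the library's real MESSAGE_REGEX.
# A \s* after each block tolerates spaces or newlines between blocks.
SKETCH_REGEX = re.compile(
    r"^\s*"
    r"(\{1:(?P<basic_header>[^}]+)\})?\s*"
    r"(\{2:(?P<application_header>[^}]+)\})?\s*"
    r"(\{3:(?P<user_header>(\{[^}]*\})*)\})?\s*"
    r"(\{4:\s*(?P<text>.+?)\s*-\})?\s*"
    r"(\{5:(?P<trailer>(\{[^}]*\})*)\})?\s*"
    r"$",
    re.DOTALL,  # let the text block span multiple lines
)

# A shortened version of the multi-line message from this thread.
message = (
    "{1:F01ASDFJK20AXXX0987654321}\n"
    "{2:I103ASDFJK22XXXXN}\n"
    "{4:\n"
    ":20:20180101-ABCDEF\n"
    ":23B:GHIJ\n"
    "-}\n"
)

match = SKETCH_REGEX.match(message)
print(match.group("basic_header"))
print(match.group("text").split("\n"))

# The same pattern still accepts a message with no line breaks at all.
compact = "{1:F01ASDFJK20AXXX0987654321}{2:I103ASDFJK22XXXXN}{4::20:X-}"
compact_match = SKETCH_REGEX.match(compact)
```

Because the newlines inside block 4 are preserved in the text group, multi-line fields such as 50K and 59 can still be split into their components afterwards.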