wolph / mt940

A library that parses MT940 files and returns smart Python collections for statistics and manipulation.
https://mt940.readthedocs.org/en/latest/
BSD 3-Clause "New" or "Revised" License

Transaction details parsing error if content is unfortunately wrapped #56

Closed dr-duplo closed 6 years ago

dr-duplo commented 6 years ago

Hi guys,

I discovered a problem with the parsing method of the transaction model. This error actually prevents me from fetching transactions with fints. In the (obfuscated) example below, the transaction details (":86:") contain an ISO timestamp that is unfortunately wrapped onto the next line. The ":26:" is not a tag but the minutes of the ISO timestamp.

:61:000000D8,47NMSCNONREF
:86:106?000000/661?20EREF+VZ0000000000000000?21MREF+000000?22CRED+XX0 000000000000000?23ABCDEFGHIJKLMNOPQRSTUVW?24/PL 12-09-2014T16
:26:37 Fo?25lgenr. 007

The parser raises an exception because it doesn't know the tag ":26:" (which doesn't exist anyway). The root cause, I guess, is the method used to extract the tags (models.py:333). Basically, it searches for lines starting with ':XX:', which also matches non-existent 'tags'.
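To illustrate the failure mode described above, here is a minimal sketch. The pattern `GENERIC_TAG_RE` is a simplified assumption, not the regex actually used in models.py, and `SAMPLE` is a shortened version of the obfuscated data with the line breaks placed where the report says the wrap occurred.

```python
import re

# Hypothetical, simplified pattern: anything of the form ':NN:' (optionally
# with one letter) at the start of a line is treated as a tag.
GENERIC_TAG_RE = re.compile(r'^:(?P<full_tag>[0-9]{2}[A-Z]?):', re.MULTILINE)

# Shortened sample; the timestamp "16:26:37" is wrapped after "T16".
SAMPLE = (
    ':61:000000D8,47NMSCNONREF\n'
    ':86:106?000000/661?20EREF+VZ0000000000000000?24/PL 12-09-2014T16\n'
    ':26:37 Fo?25lgenr. 007\n'
)

# Collect every "tag" the generic pattern finds.
found = [m.group('full_tag') for m in GENERIC_TAG_RE.finditer(SAMPLE)]
print(found)  # ['61', '86', '26']
```

The third match is the minutes field of the timestamp, which a tag lookup would then fail to resolve.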

A better way may be to search only for known tags by listing them in the regex, like: r'^:(?P<full_tag>61|86|...|...):'

This would still lead to possible parsing errors for tags below 60, since a wrapped minutes value (00 to 59) could coincide with a real tag number, so maybe another parsing strategy is more appropriate.
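The known-tags idea and its limitation can be sketched as follows. The tag list in `KNOWN_TAGS` is an illustrative subset chosen for this example, not the list the library uses, and `find_tags` is a hypothetical helper.

```python
import re

# Illustrative subset of MT940 tags (an assumption for this sketch).
KNOWN_TAGS = ('20', '21', '25', '28C', '60F', '60M',
              '61', '62F', '62M', '64', '65', '86')

# Only lines starting with one of the listed tags are treated as tag lines.
KNOWN_TAG_RE = re.compile(
    r'^:(?P<full_tag>%s):' % '|'.join(KNOWN_TAGS),
    re.MULTILINE,
)


def find_tags(data):
    """Return the known tags found at the start of lines in `data`."""
    return [m.group('full_tag') for m in KNOWN_TAG_RE.finditer(data)]


# Timestamp 16:26:37 wrapped after the hour: ':26:' is not a known tag.
wrapped_at_26 = ':61:000000D8\n:86:106?24/PL 12-09-2014T16\n:26:37 Fo\n'
# Timestamp 16:20:37 wrapped the same way: ':20:' IS a known tag.
wrapped_at_20 = ':61:000000D8\n:86:106?24/PL 12-09-2014T16\n:20:37 Fo\n'

print(find_tags(wrapped_at_26))  # ['61', '86']
print(find_tags(wrapped_at_20))  # ['61', '86', '20']
```

The second case shows why listing known tags narrows the problem without eliminating it.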

dr-duplo commented 6 years ago

I just discovered that this line could be the cause.

https://github.com/WoLpH/mt940/blob/c8e094fdab81fa4435a681db3f7275b290d42199/mt940/models.py#L306

If I comment it out, the whole thing works again.

wolph commented 6 years ago

While that might work for this case, the strip is correct. The :26: should be part of the timestamp, so it does need to be stripped. In this case though... perhaps an lstrip('\n') might be needed too.
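For reference, this is how the two standard string methods mentioned here differ in plain Python. The `chunk` value is made up for illustration; how either call fits into the library's code at the linked line is not shown in this thread.

```python
# Made-up fragment with leading newlines and trailing whitespace.
chunk = '\n\n:26:37 Fo?25lgenr. 007 \n'

# str.strip() removes all leading and trailing whitespace.
stripped = chunk.strip()

# str.lstrip('\n') removes only leading newline characters and leaves
# the end of the string untouched.
left_only = chunk.lstrip('\n')

print(repr(stripped))   # ':26:37 Fo?25lgenr. 007'
print(repr(left_only))  # ':26:37 Fo?25lgenr. 007 \n'
```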

dr-duplo commented 6 years ago

58