railslove / cmxl

your friendly MT940 SWIFT file parser for bank statements
http://railslove.com
MIT License

Chargeback transfers fail to be parsed #18

Closed irrenhaus closed 6 years ago

irrenhaus commented 6 years ago

In the case of chargebacks, the :61: fields contain additional information (OCMT and CHGS data). Because of that additional data, the current regular expression fails to parse these :61: lines at all.

Changing the regular expression of Cmxl::Fields::Transaction to /^(?<date>\d{6})(?<entry_date>\d{4})?(?<storno_flag>R?)(?<funds_code>[CD]{1})(?<currency_letter>[a-zA-Z])?(?<amount>\d{1,12},\d{0,2})(?<swift_code>(?:N|F).{3})(?<reference>NONREF|.{0,16})((?:\/\/)(?<bank_reference>[^\r\n]*))?((?:[\r\n])?((?:\/OCMT\/)(?<ocmt>[^\/]*)(?:\/)(?:\/CHGS\/)(?<chgs>[^\/]*)(?:\/)))?/i fixes that and additionally gives the OCMT and CHGS fields.

The changed part is: the whole bank reference group is now optional and can contain any characters except for CR-LF. After that there may be an additional block separated by CR-LF containing /OCMT/3a15num with an optional slash at the end followed by /CHGS/3a15num with an optional slash at the end.
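A quick way to try the proposed pattern outside of Cmxl, as a rough sketch. The input is the second chargeback line quoted later in this thread, with the :61: tag removed and a single LF assumed as the line separator; the constant and variable names are made up for the example. As written, the pattern expects a slash after the CHGS value, so the sample with the trailing slash is used here.

```ruby
# Pattern copied verbatim from the comment above.
PROPOSED_PARSER = /^(?<date>\d{6})(?<entry_date>\d{4})?(?<storno_flag>R?)(?<funds_code>[CD]{1})(?<currency_letter>[a-zA-Z])?(?<amount>\d{1,12},\d{0,2})(?<swift_code>(?:N|F).{3})(?<reference>NONREF|.{0,16})((?:\/\/)(?<bank_reference>[^\r\n]*))?((?:[\r\n])?((?:\/OCMT\/)(?<ocmt>[^\/]*)(?:\/)(?:\/CHGS\/)(?<chgs>[^\/]*)(?:\/)))?/i

# Assumption: the ":61:" tag is already stripped and the two physical
# lines are separated by exactly one "\n".
field_content = "1803090309DR185,67NRTINONREF\n/OCMT/EUR183,77//CHGS/EUR1,90/"

match = PROPOSED_PARSER.match(field_content)

puts match[:amount]                 # 185,67
puts match[:ocmt]                   # EUR183,77
puts match[:chgs]                   # EUR1,90
puts match[:bank_reference].inspect # nil, because the line has no "//" part
```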

bumi commented 6 years ago

Thanks for reporting that issue. Do you have an example MT940 file? And would you like to open a PR with your change? That would be great; then we can test it there and merge it.

irrenhaus commented 6 years ago

Sorry, neither can I give you an example MT940 file (these are bank statements of our clients) nor can I put in the time to do an actual PR, since I'm not allowed to do that during my work time. However, I can give the single line in question:

:61:1803080308DR75,99NRTINONREF
/OCMT/EUR70,13//CHGS/EUR5,86
:61:1803090309DR185,67NRTINONREF
/OCMT/EUR183,77//CHGS/EUR1,90/

(Both seen in real-world MT940 files)

I managed to fix the issue for myself using

require 'cmxl' # load the gem first so the classes below are reopened, not newly defined

module Cmxl
  module Fields
    class Transaction
      # Fixed regex to match the OCMT & CHGS fields so that direct debit chargebacks can be parsed
      self.parser = /^(?<date>\d{6})(?<entry_date>\d{4})?(?<funds_code>[a-zA-Z])(?<currency_letter>[a-zA-Z])?(?<amount>\d{1,12},\d{0,2})(?<swift_code>(?:N|F).{3})(?<reference>NONREF|.{0,16})((?:\/\/)(?<bank_reference>[^\/]*))?((?:\/OCMT\/)(?<ocmt>[^\/]+)(?:[\/]?))?((?:\/CHGS\/)(?<chgs>[^\/]+)(?:[\/]?))?/i
    end
  end
end

as a monkey patch. It works because Cmxl strips the newline. Be aware, however, that this changes how the bank_reference field behaves: before this patch the field always contained either an empty string or the actual bank reference; after this patch the value is either nil or the actual bank reference. The reason is that the original regex always matched the bank reference group, whereas the whole group is now optional.
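The behaviour described above can be checked without loading the gem, as a minimal sketch. The pattern is the one from the monkey patch; the newline removal via `String#delete` only imitates what Cmxl is said to do and is an assumption, as are the constant name and the third input line with a made-up bank reference.

```ruby
# Pattern copied verbatim from the monkey patch above.
PATCHED_PARSER = /^(?<date>\d{6})(?<entry_date>\d{4})?(?<funds_code>[a-zA-Z])(?<currency_letter>[a-zA-Z])?(?<amount>\d{1,12},\d{0,2})(?<swift_code>(?:N|F).{3})(?<reference>NONREF|.{0,16})((?:\/\/)(?<bank_reference>[^\/]*))?((?:\/OCMT\/)(?<ocmt>[^\/]+)(?:[\/]?))?((?:\/CHGS\/)(?<chgs>[^\/]+)(?:[\/]?))?/i

raw_fields = [
  "1803080308DR75,99NRTINONREF\n/OCMT/EUR70,13//CHGS/EUR5,86",
  "1803090309DR185,67NRTINONREF\n/OCMT/EUR183,77//CHGS/EUR1,90/",
  "1803090309DR185,67NRTINONREF//BANKREF123" # hypothetical line with a bank reference
]

results = raw_fields.map do |raw|
  # Assumption: drop CR and LF characters before matching.
  match = PATCHED_PARSER.match(raw.delete("\r\n"))
  {
    amount: match[:amount],
    ocmt: match[:ocmt],
    chgs: match[:chgs],
    bank_reference: match[:bank_reference]
  }
end

results.each { |result| p result }
```

For the two chargeback lines, bank_reference comes back as nil rather than an empty string, so calling code that does something like `bank_reference.empty?` would need a nil check after applying the patch.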

For information on the OCMT / CHGS fields, see https://www.kontopruef.de/mt940s.shtml (in German), section :61:, subfield 9 "Ursprungsbetrag und Gebührenbetrag" (original amount and charges amount).

bumi commented 6 years ago

OK, thanks for the specific line of the statement; that's already helpful. Can you tell us which bank it is?

If you get time in the future to work on this during your work hours, let me know. Then we can add tests for that case, and if your change makes it into a new release, you can be sure your version stays upgradeable.

Thanks for reporting. If anyone else sees similar statements, please comment on this issue.

Uepsilon commented 6 years ago

fixed ✔️