martinohansen / ynabber

Ynabber reads bank transactions from GoCardless (formerly Nordigen) and writes them into YNAB
GNU General Public License v3.0
21 stars, 11 forks

Possibility to filter uncleared transactions #75

Closed by ingvarso 6 months ago

ingvarso commented 7 months ago

One of my banks is Sparebank 1 Østlandet. They are making uncleared transactions available, but I would like to filter those out, since they don't contain proper payee information. It seems like the "debtorAccount" is missing on the uncleared transactions, so if it could be made configurable to omit transactions without "debtorAccount", that would be a great feature.

hpernu commented 7 months ago

I guess a feature to skip transactions without debtorAccount could work. The problem is (I think) that if there is a creditorAccount, the debtorAccount is missing regardless. In addition, there might be internal bank transactions (such as banking fees) which might not have this information either. Can you view the actual JSON via the GoCardless web interface and see whether any other field is clearly present in these unfinished transactions but absent from everything else, or vice versa? I am looking for a more reliable way to distinguish between them.

In a pinch, you could perhaps share the full json so somebody else can look at it, but this will, of course, reveal personal information.

ingvarso commented 7 months ago

I have checked a bit more in the GoCardless Web Interface, and here are some more details.

Transactions with missing payee information and transactions with full payee information are both listed as "booked" in the JSON structure, so I guess they are not really pending, just missing payee information.

Here is the transaction with missing payee information:

            {
                "transactionId": "enc!!NLDznR9ZxojdKQrb6",
                "bookingDate": "2024-05-08",
                "valueDate": "2024-05-08",
                "transactionAmount": {
                    "amount": "5.00",
                    "currency": "NOK"
                },
                "creditorAccount": {
                    "iban": "NO60181011XXXXX",
                    "bban": "181011XXXXX"
                },
                "remittanceInformationUnstructured": "Lønn",
                "remittanceInformationUnstructuredArray": [
                    "Lønn"
                ],
                "proprietaryBankTransactionCode": "R_013",
                "internalTransactionId": "ba2174db82cafdec810be63adea1a406"
            },

And here is one with normal payee information:

            {
                "transactionId": "enc!!aGRGjj4qLb6a3=",
                "bookingDate": "2024-05-03",
                "valueDate": "2024-05-03",
                "transactionAmount": {
                    "amount": "500.00",
                    "currency": "NOK"
                },
                "creditorAccount": {
                    "iban": "NO60181011XXXX",
                    "bban": "181011XXXX"
                },
                "debtorAccount": {
                    "bban": "18134891358"
                },
                "remittanceInformationUnstructured": "Informasjon om betalingen",
                "remittanceInformationUnstructuredArray": [
                    "Informasjon om betalingen"
                ],
                "proprietaryBankTransactionCode": "R_197",
                "internalTransactionId": "f8a883f75680cca3d6de6b09d051d00c"
            },

I have scrambled all personal details.

These are both inflows. I don't currently have any outflows with missing payee info. I can add those later.

The reason I would like to filter out these transactions is that when imported to YNAB they are:

  1. Showing up with the last assigned payee that had the same information. So when someone sends money by VIPPS, the remittanceInformationUnstructured says "Straksbetaling", and YNAB then automatically replaces it with the last name/payee that was used with "Straksbetaling"
  2. The transactions tend to be duplicated (not sure if this is always the case) when the payee information is updated at the bank

On second thought, maybe there is a better solution to this issue than filtering out the transactions... I see in config.go that it is possible to set the environment variable "NORDIGEN_PAYEE_SOURCE", but I haven't played with that yet. Maybe leaving the payee field empty when no payee info is available would solve the problem.

hpernu commented 7 months ago

Yes, I think I actually implemented the alternative NORDIGEN_PAYEE_SOURCE because of a similar issue: in my case the JSON structure actually contains a valid creditor/debtor name. The original algorithm in ynabber was useless for me. And yes, sometimes the payee information turns out empty for me as well for some transactions.

FYI: YNAB itself applies some kind of heuristics on its end, so we cannot make everything automatic. Quite often it guesses the payee and often gets it wrong. So you still have to validate the imported transactions but, considering this is personal finance, there usually are not that many. I usually import my transactions within an hour anyway and have close to 20 accounts, some of which are credit cards. The credit card transactions in particular generally lag by a couple of days, i.e. the pending transactions are not imported before they are settled, but this may differ between banks. We really cannot anticipate every conceivable usage at the importer level anyway. So it is better to just import them and then edit them in the YNAB interface. The same transaction should not be imported twice anyway, even if you delete one completely (unless we change the ID generation logic).

martinohansen commented 7 months ago

@ingvarso it sounds like your bank provides you the payee details in the remittanceInformationUnstructured field since you are not provided a CreditorName field. The fact that you see duplicate entries indicates that the bank changes the remittanceInformationUnstructured field once the transaction has become "booked" aka got debtorAccount assigned. Is that correct?

If so, I think we need to create a mapper just for your bank that blocks transactions from being handled until they have a debtorAccount. We can do that, but first I would like to confirm whether the remittanceInformationUnstructured does indeed change over time.

ingvarso commented 6 months ago

I have been observing the issue a bit more closely for some days now. In the GoCardless API, it looks like the transactions are more or less always in the "booked" section, even if they are not fully cleared by the bank. The "uncleared" transactions are missing either debtorAccount (on incoming transactions) or creditorAccount (on outgoing transactions). The transactionId changes, but the internalTransactionId is consistent. remittanceInformationUnstructured is also updated, but this field contains the "memo" field on outgoing transactions, not payee info. From my perspective, a good solution could be to ignore transactions unless both debtorAccount and creditorAccount are present.

hpernu commented 6 months ago

See this:

            // PayeeSource is a list of sources for Payee candidates, the first method
            // that yields a result will be used. Valid options are: unstructured, name
            // and additional.
            //
            // * unstructured: uses the RemittanceInformationUnstructured field
            // * name: uses either the debtorName or creditorName field
            // * additional: uses the AdditionalInformation field
            PayeeSource []string `envconfig:"NORDIGEN_PAYEE_SOURCE" default:"unstructured,name,additional"`

What happens if you simply set the payee source to what I am using, i.e. 'name'? Will it skip the transactions until there is a payee, or import them with a blank payee?

To be honest, considering how YNAB is supposed to be used, these transactions should perhaps be imported anyway but perhaps with uncleared status and some information missing unless you enter them manually.

ingvarso commented 6 months ago

> What happens if you simply set the payee source to what I am using, i.e. 'name'? Will it skip the transactions until there is a payee, or import them with a blank payee?

There are only a very few transactions where the creditorName is present, and I found none with debtorName present. There does not seem to be any consistent source of info. The remittanceInformationUnstructured field is the most consistent source, even if it is not fully consistent.

> To be honest, considering how YNAB is supposed to be used, these transactions should perhaps be imported anyway but perhaps with uncleared status and some information missing unless you enter them manually.

Yes, I do manual entry of all transactions. That is quite easy for expenses, but incoming money is a bit more unpredictable.

To make it even more complex, I see now that all debit card transactions are also missing creditorAccount information, so excluding all transactions that are missing either debtorAccount or creditorAccount would also exclude all card transactions...

So I guess my next step should rather be to investigate why the transactions get duplicated even if the InternalTransactionId is consistent. My config for this bank includes NORDIGEN_TRANSACTION_ID=InternalTransactionId

On the other hand, this is in my opinion a bug/weakness in how the bank is presenting data to the GoCardless service, but I'm not sure how to move forward to report the bug or if they would ever prioritize to fix it...

hpernu commented 6 months ago

Good luck with presenting this as a bug to the bank.

I think it is rather unlikely they do anything about it. Although PSD2 is a standard, and every bank is obliged to implement it, they are free to choose much of the details themselves. As you can see, this is not very consistent. Furthermore, it is not in their business interests to actually make them more compatible. In fact, they're likely to put their tweaks anywhere they can in order to keep clients stuck with their bank.

I have used Ynabber with several Finnish banks and they all have some different quirks. But then I also implemented some of the features to be able to support these, for example getting the payee from name.

Even after that, I still have to manually edit some stuff. There is automation on the YNAB side as well, and what we put in the payee is merely a suggestion. Often YNAB gets it wrong anyway, but it is easy to correct as these transactions are waiting for approval. Getting it right every time is not feasible: sometimes the payee is the same but the purpose is different. For example, you might get a refund for something from your employer but also a salary.

But this is personal finance. I doubt the number of transactions is very high. As I am continually importing transactions, I simply fix these manually. As long as the amount, direction (inflow/outflow) and account are right, this is enough information in the majority of cases to correct these. The payee name is helpful but, as mentioned, not always right, and you still have to categorize everything.

If you need this handled automatically, try to get a consistent picture of what is happening, but keep in mind that not every case can be automated. As long as you get the date, account and amount right, and are continuously importing, you should get everything working with minimal manual input. Any extra data you get is somewhat unreliable and only helps you to categorize but you can't avoid some manual fixing.

martinohansen commented 6 months ago

> So I guess my next step should rather be to investigate why the transactions get duplicated even if the InternalTransactionId is consistent. My config for this bank includes NORDIGEN_TRANSACTION_ID=InternalTransactionId

That is odd. We generate an import ID that YNAB uses to avoid duplicate transactions at import time. To generate this ID we use the account IBAN, the transaction ID (in your case InternalTransactionId), the transaction date (BookingDate), and the amount. Do any of those fields change over time for you? If so, that is why you get duplicates.
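To show why a change in any one of those fields produces a duplicate, here is a minimal sketch of an ID derived from the four fields. The hashing scheme, the separator, the truncation to 32 characters, and the `importID` function itself are assumptions made for illustration and are not ynabber's actual code.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// importID derives a deterministic ID from the four fields (hypothetical).
// The same inputs always give the same ID; changing any input changes it.
func importID(iban, transactionID, bookingDate, amount string) string {
	sum := sha256.Sum256([]byte(iban + "|" + transactionID + "|" + bookingDate + "|" + amount))
	// Keep the first 32 hex characters so the ID stays short.
	return hex.EncodeToString(sum[:])[:32]
}

func main() {
	first := importID("NO60181011XXXXX", "ba2174db82cafdec810be63adea1a406", "2024-05-08", "5.00")
	again := importID("NO60181011XXXXX", "ba2174db82cafdec810be63adea1a406", "2024-05-08", "5.00")
	// Same transaction, but the bank has moved the booking date by one day.
	moved := importID("NO60181011XXXXX", "ba2174db82cafdec810be63adea1a406", "2024-05-09", "5.00")

	fmt.Println("unchanged fields give same ID:", first == again)
	fmt.Println("changed booking date gives same ID:", first == moved)
}
```

Because the ID is a pure function of its inputs, a bank that later rewrites the booking date or the transaction ID makes the updated transaction look like a new one.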

Please let me know if we can handle any of your fields better in another way and I will make a mapper for your bank which should handle it better.

ingvarso commented 6 months ago

> Do any of those fields change over time for you? If so, that is why you get duplicates.

Yes, I caught one today, and both the InternalTransactionId and the booking date were updated.

> Please let me know if we can handle any of your fields better in another way and I will make a mapper for your bank which should handle it better.

Thank you. I cannot see any way at the moment, so I'm closing this.

martinohansen commented 6 months ago

Last idea 🙈

What if we make a feature to delay imports X days? It will make everything slower but for some it might be needed.

hpernu commented 6 months ago

> Last idea 🙈
>
> What if we make a feature to delay imports X days? It will make everything slower but for some it might be needed.

Or a conditional delay? I.e. if a transaction is missing important information (such as payee) do not import it until it is X days old.

This should be determined from the final to-be-imported transaction though, as the payee may come from different places.