daniel-hauser / moneyman

Automatically save transactions from all major Israeli banks and credit card companies, using GitHub actions (or a self hosted docker image)

Breaking Change: transaction hash #268

Closed daniel-hauser closed 2 months ago

daniel-hauser commented 2 months ago

The current hash algorithm was designed to be compatible with the caspion hash, but this hash causes duplications for subsequent transactions with the same date, price, and description.

Since most scrapers (at least the ones I use) have an identifier, the new hash will be "date_companyId_accountNumber_chargedAmount_identifier". If the identifier is falsy, "description_memo" will be used as a fallback.

The new hash will be available on the uniqueId field of the transaction object.
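For illustration, here is a minimal sketch of how such a hash could be assembled. The ScrapedTransaction shape and the function name are assumptions for the example, not the actual moneyman code:

```typescript
// Minimal transaction shape for this sketch. The field names follow the
// israeli-bank-scrapers transaction object, but this is illustrative only.
interface ScrapedTransaction {
  date: string; // transaction date as returned by the scraper
  chargedAmount: number;
  description: string;
  memo?: string;
  identifier?: string | number;
}

// New hash: date_companyId_accountNumber_chargedAmount_identifier,
// falling back to description_memo when the scraper provides no identifier.
function newTransactionHash(
  tx: ScrapedTransaction,
  companyId: string,
  accountNumber: string,
): string {
  const identifierPart = tx.identifier
    ? String(tx.identifier)
    : `${tx.description}_${tx.memo ?? ""}`;
  return [tx.date, companyId, accountNumber, tx.chargedAmount, identifierPart].join("_");
}
```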

Strategy

This change might require manual deduping of transactions; therefore, the new hash will be opt-in via a TRANSACTION_HASH_TYPE env var with the value "moneyman".
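As a sketch of the opt-in, reusing the illustrative names from the example above (legacyCaspionHash is a hypothetical stand-in for the existing caspion-compatible hash; only the TRANSACTION_HASH_TYPE=moneyman value comes from this issue):

```typescript
// Opt in to the new hash by setting TRANSACTION_HASH_TYPE=moneyman,
// e.g. in the GitHub Actions workflow env or the docker container env.
const useNewHash = process.env.TRANSACTION_HASH_TYPE === "moneyman";

// Hypothetical stand-in for the existing caspion-compatible hash.
declare function legacyCaspionHash(tx: ScrapedTransaction, companyId: string): string;

function transactionUniqueId(
  tx: ScrapedTransaction,
  companyId: string,
  accountNumber: string,
): string {
  return useNewHash
    ? newTransactionHash(tx, companyId, accountNumber)
    : legacyCaspionHash(tx, companyId);
}
```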

Since the default scrape window is 10 days, we will add a deprecation message to the sent messages with a link to this issue. After at least 30 days, the default hash will be changed to the new hash.

Impacted scrapers

The scrapers that currently use the old hash are:

roisec commented 2 months ago

@daniel-hauser If I scrape for 100 days manually, will there be duplicate transactions? I do it quarterly because I have international transactions that I pay for quarterly. Is there any way to use the old hash?

daniel-hauser commented 2 months ago

@roisec that's a good point. Currently the new hash is opt-in; the problem will arise when I make it the default. If you are using the sheets storage, I can keep the old hash only for the duplicate check - this will skip adding transactions that already exist with the old hash. The other option is to keep the hash type permanently configurable, but I prefer not to do that unless we must.

roisec commented 2 months ago

What do you mean by "only for the duplicate check"? You could add your hash in addition to the current one, and if the hash is the same, use the new hash. I just want to keep supporting the old one without duplicate transactions, including old transactions over long periods; I don't want to remove transactions manually.

daniel-hauser commented 1 month ago

What do you mean by "only for the duplicate check"?

The sheets storage loads all existing hashes, then saves only transactions that pass the duplicate check. This means new transactions are written only if neither their old hash nor their new hash is found, so no duplicates will be written.
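Roughly, that duplicate check could look like this sketch (again reusing the illustrative names from the earlier examples; not the actual moneyman implementation):

```typescript
// Sketch of the sheets-storage dedup described above: load every hash already
// in the sheet, then keep only transactions whose old AND new hashes are both
// absent, so nothing is written twice during the transition.
function filterNewTransactions(
  transactions: ScrapedTransaction[],
  existingHashes: Set<string>,
  companyId: string,
  accountNumber: string,
): ScrapedTransaction[] {
  return transactions.filter(
    (tx) =>
      !existingHashes.has(legacyCaspionHash(tx, companyId)) &&
      !existingHashes.has(newTransactionHash(tx, companyId, accountNumber)),
  );
}
```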

roisec commented 1 month ago

Looks great. Do you need to add a fix?

roisec commented 1 month ago

@daniel-hauser

daniel-hauser commented 1 month ago

@roisec - No, this is the current implementation

roisec commented 1 month ago

Can you disable the deprecation message?

roisec commented 1 month ago

@daniel-hauser