Closed daniel-hauser closed 2 months ago
@daniel-hauser If I scrape 100 days manually, will I get duplicate transactions? I do this quarterly because I have international transactions and I pay for them quarterly. Is there any way to keep using the old hash?
@roisec that's a good point. Currently the new hash is opt-in; the problem will come when I make it the default. If you are using the sheets storage, I can keep the old hash for the duplicate check only: this will skip adding transactions that already exist with the old hash. The other option is to make the hash type permanently configurable, but I'd prefer not to do that unless we must.
What do you mean by duplicate only? You could add your hash in addition to the current one. If the old hash matches, then use the new hash. I just want the old one to be supported without duplicate transactions, including old transactions over long periods. I don't want to remove transactions manually.
What do you mean by duplicate only?
The sheets storage loads all hashes, then saves only the transactions that pass the duplicate test. This means that I write a new transaction only if neither its old hash nor its new hash is found, so no duplicates will be written.
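As a rough illustration of the duplicate test described above, here is a minimal sketch. It is not the actual moneyman code: the `HashedTransaction` shape and the `selectNewTransactions` helper are hypothetical, and it assumes each scraped transaction carries the old hash in `hash` and the new hash in `uniqueId`.

```typescript
// Hypothetical shape: only the fields needed for the duplicate check.
interface HashedTransaction {
  hash: string; // old, caspion-compatible hash
  uniqueId: string; // new hash
  description: string;
}

// Keep a scraped transaction only if NEITHER of its hashes is already stored.
// `existingHashes` stands for all hashes loaded from the sheet (old or new).
function selectNewTransactions(
  existingHashes: ReadonlySet<string>,
  scraped: readonly HashedTransaction[],
): HashedTransaction[] {
  return scraped.filter(
    (txn) => !existingHashes.has(txn.hash) && !existingHashes.has(txn.uniqueId),
  );
}

// Usage: one row was saved under its old hash, one under its new hash.
const stored = new Set<string>(["old-a", "new-b"]);
const fresh = selectNewTransactions(stored, [
  { hash: "old-a", uniqueId: "new-a", description: "already saved (old hash)" },
  { hash: "old-b", uniqueId: "new-b", description: "already saved (new hash)" },
  { hash: "old-c", uniqueId: "new-c", description: "not saved yet" },
]);
```

With this check, rows written before the switch (old hash only) and rows written after it (new hash) are both recognized, so a long manual scrape should not write them again.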
Looks great. Do you need to add a fix?
@daniel-hauser
@roisec - No, this is the current implementation
Can you disable the deprecation message?
@daniel-hauser
The current hash algorithm was designed to be compatible with the caspion hash, but this hash causes duplicates for subsequent transactions with the same date, price and description.
Since most scrapers (at least the ones I use) have an identifier, the new hash will be "date_companyId_accountNumber_chargedAmount_identifier". If identifier is falsy, "description_memo" will be used as a fallback.
The new hash will be available on the uniqueId field of the transaction object.
Strategy
This change might require manual deduping of transactions, therefore the new hash will be opt-in using a TRANSACTION_HASH_TYPE env var with the value "moneyman".
Since the default scrape window is 10 days, we will add a deprecation message in the sent messages with a link to this issue. After at least 30 days, the default hash will be changed to the new hash.
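The hash format and the opt-in switch described above can be sketched roughly as follows. This is an illustration under assumptions, not the project's implementation: the `Transaction` interface, the `moneymanHash` and `useNewHash` names, the date being an already formatted string, and treating a missing memo as an empty string are all hypothetical choices. The environment is passed in as a plain object so the snippet has no Node-specific dependencies.

```typescript
// Hypothetical transaction shape; field names follow the hash format above.
interface Transaction {
  date: string; // assumed to be already formatted, e.g. "2024-01-15"
  companyId: string;
  accountNumber: string;
  chargedAmount: number;
  identifier?: string | number;
  description: string;
  memo?: string;
}

// Build "date_companyId_accountNumber_chargedAmount_identifier".
// When identifier is falsy, fall back to "description_memo".
function moneymanHash(txn: Transaction): string {
  const suffix = txn.identifier
    ? String(txn.identifier)
    : `${txn.description}_${txn.memo ?? ""}`; // missing memo -> "" (assumption)
  return [txn.date, txn.companyId, txn.accountNumber, txn.chargedAmount, suffix].join("_");
}

// Opt-in check: the new hash is used only when TRANSACTION_HASH_TYPE is "moneyman".
function useNewHash(env: Record<string, string | undefined>): boolean {
  return env.TRANSACTION_HASH_TYPE === "moneyman";
}

const withIdentifier = moneymanHash({
  date: "2024-01-15",
  companyId: "max",
  accountNumber: "1234",
  chargedAmount: -50,
  identifier: 987,
  description: "Coffee",
});
// withIdentifier === "2024-01-15_max_1234_-50_987"

const withoutIdentifier = moneymanHash({
  date: "2024-01-15",
  companyId: "max",
  accountNumber: "1234",
  chargedAmount: -50,
  description: "Coffee",
  memo: "to go",
});
// withoutIdentifier === "2024-01-15_max_1234_-50_Coffee_to go"
```

In a Node process the switch would be read with something like `useNewHash(process.env)`. Because the identifier is part of the hash, two purchases with the same date, price and description no longer collide as long as the scraper supplies distinct identifiers.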
Impacted scrapers
The scrapers that currently use the old hash are:
hash field as the import_id