simonmichael / hledger

Robust, fast, intuitive plain text accounting tool with CLI, TUI and web interfaces.
https://hledger.org
GNU General Public License v3.0

Importing CSV: Creating complex `*.csv.rules` needs clarity #2208

Closed alecStewart1 closed 1 day ago

alecStewart1 commented 3 days ago

This could be a mixture of an issue and also a help request, but the result from this could be useful for other people.

House cleaning

hledger --version: hledger 1.34, mac-x86_64

Editor: Emacs with ledger-mode

Issue

I thought I could set up some fairly complex *.csv.rules for importing transaction data from my bank with Tiller.

With the CSV rules I have (I'll provide a sample down below), I wind up with a lot of entries that look like so from hledger -f csv:my-tiller-data.csv print:

0024-06-26 Garbled transaction name from Tiller dataset ; obviously the year is wrong
    ; has one tag but not the others, and you can see the double $
    expenses:unknown          $$7.99  ; kind:excess
    income:unknown           $$-7.99

Steps to reproduce

This is a bit odd, as I don't really know if I can provide a sample CSV, but I can give a sample of the *.csv.rules that I have. Let's go with a few items that one might have:

# Basic CSV rules
#

skip 1
newest-first

# the skipped field is "Category" because that whole column in Tiller dataset is empty for me
fields Date, Description, _, Amount, Account

date %Date
date-format %-m/%-d/%Y # not sure how this ends up printing the years as (example) 0024

currency $

# Get past stupid negative parsing crap, as transactions listed in the Tiller dataset are 
# inverted from what we want with a ledger system (withdraws are negative, deposits positive)
# Ex. if I spend $2.50 on a coffee somewhere, that transaction is -$2.50 in the Amount column in the Tiller dataset.

if %Amount ^-\$
   amount (%Amount)

if %Amount ^$
   amount %Amount

# Simple catchall, get all transactions that match the regex .[cC]offee*, tag the transaction as an excess expense, 
# and have account1 be expenses:foodstuff:coffee
# This doesn't work.

if %Description .[cC]offee*
   comment1 kind:excess
   account1 expenses:foodstuff:coffee

# Let's do some matchers for descriptions -> payee that seem to work
# However, unless account2

## If you have a card with your bank that has some cashback program
if %Description .Des\:cashreward*
   description Some Bank CashBack # payee name
   account1 revenue:rewards programs
   account2 assets:checking

## Kroger is a grocery store.
if %Description ^Kroger*
   description Kroger Grocers
   comment1 kind:necessity
   account1 expenses:foodstuff:groceries

if %Description ^Chevron*
   description Chevron | Gas
   comment1 kind:necessity
   account1 expenses:car:gas

# Let's follow up with the coffee example by adding a description -> payee
# This doesn't work
#

## Let's have a made up coffee shop that uses Square
## Square transactions seem to start with "Sq", then " *shop Name".
if %Description ^Sq\s\*ventti\sCoffee*
  description Ventti Coffee # I would assume this "compounds" with the earlier .[cC]offee* matcher

# Now let's try to set the accounts to zero out our transactions
# These don't work, but I'm again assuming rules will be merged with previous ones
# Ex. For Kroger one, since the credit card will be used for this, I assume the 2nd matcher in the
# set of the following will set the 2nd account to be liabilities:credit. It does not.

if %Account ^Checking\sAccount$
   account2 assets:checking
   comment2 bank:my bank  # tag so you can filter by bank

if %Account ^Some\sstupid\slong\sname\sfor\scredit\scard$
   account2 liabilities:credit
   comment2 bank:my bank

if %Account ^Savings\sAccount$
   account2 liabilities:savings
   comment2 bank:my bank

Expected output

I would expect, as an example:

2024-06-30 Kroger Grocers
    expenses:foodstuff:groceries $100.00   ; kind:necessity, bank:my bank
    liabilities:credit                        -$100.00

Conclusion

Maybe I've misread or missed some stuff in the docs, but from initial reading it seems like stuff like the example above would work.

simonmichael commented 2 days ago

Thanks for the report @alecStewart1. Can you add sample csv records, eg for the rules that don't work (and the headings).

alecStewart1 commented 2 days ago

Here are some samples, with some details fudged for privacy purposes. Records 1 and 3 don't work; records 2 and 4 do.

Date,Description,Category,Amount,Account,Account #,Institution,Month,Week,Transaction ID,Account ID,Check Number,Full Description,Date Added,,
6/28/24,"Big Company Name Des:payroll, Ln:Stewart, Fn:Alec, EmId:XXXX",,"$3,041.21",Checking Account,xxxxXXXX,Bank of America,6/1/24,6/24/24,XXXXXXXXXX,XXXXXXXXXX,,"BIG COMPANY NAME DES:PAYROLL, LN:STEWART, FN:ALEC, EMID:XXXX",7/1/24,,
6/3/24,"Movement City 1, City 2, XX",,-$98.04,Really Long Credit Card Name,xxxxXXXX,Bank of America,6/1/24,6/3/24,XXXXXXXXXX,XXXXXXXXXXX,,MOVEMENT CITY 1           CITY 2    XX,7/1/24,,
5/23/24,"Sq *coffee Coffee Co, CITY, XX",,-$5.81,Really Long Credit Card Name,xxxxXXXX,Bank of America,5/1/24,5/20/24,XXXXXXXXXX,XXXXXXXXXX,,SQ *COFFEE COFFEE CO      CITY   XX,7/1/24,,
5/2/24,"Kroger #XXXX, City, XX",,-$80.11,Really Long Credit Card Name,xxxxXXXX,Bank of America,5/1/24,4/29/24,XXXXXXXXXX,XXXXXXXXXX,,KROGER #XXXX             CITY    XX,7/1/24,,

simonmichael commented 2 days ago

Your date-format should use %y, not %Y, since these dates have two-digit years (manual -> https://hackage.haskell.org/package/time-1.14/docs/Data-Time-Format.html#v:formatTime). Also, avoid same-line comments in rules files; they are currently not supported.

The currency rule is not needed since these Amount values already include a currency symbol.
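As a rough sketch, the top of the rules file might look like this with those three points applied. It assumes the dates are always month/day/two-digit-year, as in the sample records, and keeps the other directives from the original rules unchanged:

```rules
skip 1
newest-first

# The skipped field is "Category", which is empty in this dataset.
fields Date, Description, _, Amount, Account

date %Date

# Tiller dates look like 6/28/24: a two-digit year, so %y rather than %Y.
# The comment sits on its own line, not after the rule.
date-format %-m/%-d/%y

# No "currency $" rule here: the Amount values already carry the $ symbol,
# which is presumably where the doubled $$ in the printed entries came from.
```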

Simple catchall, get all transactions that match the regex .[cC]offee*, tag the transaction as an excess expense, and have account1 be expenses:foodstuff:coffee. This doesn't work.

It's working here.

I would assume this "compounds" with the earlier .[cC]offee* matcher

It doesn't, the last assignment wins (https://hledger.org/dev/hledger.html#how-csv-rules-are-evaluated).
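To illustrate, here is a minimal sketch with two blocks that both match the "Sq *coffee Coffee Co, CITY, XX" sample record and both assign account1. The account name expenses:foodstuff:cafe and the payee "Coffee Co" are made up for the example. As I read the linked docs, when a field is assigned more than once, only the last matching assignment is kept:

```rules
# Matches any description containing "coffee".
if %Description coffee
   comment1 kind:excess
   account1 expenses:foodstuff:coffee

# Also matches the Square record. It comes later in the file, so its
# account1 value replaces the one assigned by the block above.
if %Description ^Sq \*coffee
   description Coffee Co
   account1 expenses:foodstuff:cafe
```

So the order of blocks matters: putting the general catchall first and the more specific matchers after it lets the specific ones override it.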

Do these explain what you're seeing?

alecStewart1 commented 2 days ago

It explains a lot, yes.

I have an issue though. None of these work:

if %Account ^Checking\sAccount$
   account2 assets:checking
   comment2 bank:my bank 

if %Account ^Some\sstupid\slong\sname\sfor\scredit\scard$
   account2 liabilities:credit
   comment2 bank:my bank

if %Account ^Savings\sAccount$
   account2 liabilities:savings
   comment2 bank:my bank

I'm no regexp wizard, but all of these regexps match the actual account names I want to match when I test them in a Node.js REPL.

simonmichael commented 2 days ago

Ah, \s is not supported (https://hledger.org/dev/hledger.html#matchers). A space worked for me.
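For example, the first two account matchers rewritten with literal spaces in place of \s (a sketch using the placeholder account names from the rules above, not the real ones):

```rules
if %Account ^Checking Account$
   account2 assets:checking
   comment2 bank:my bank

if %Account ^Some stupid long name for credit card$
   account2 liabilities:credit
   comment2 bank:my bank
```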

alecStewart1 commented 1 day ago

Ah, yup. That'll do it.

Thanks Simon!

I have a few other CSV rules to iron out and I'll be good to go! Appreciate the help!