manishrjain / into-ledger

Most efficient and accurate tool to categorize and import expenses into ledger
MIT License
107 stars 19 forks source link

More descriptive errors #6

Closed dudelson closed 7 years ago

dudelson commented 7 years ago

Thanks for this tool! It's been making using ledger a lot easier. Although I think it's easy to use once you have it set up, getting it set up was a bit of a pain because there were several points where I couldn't figure out what I was doing wrong. I encountered the following issues during configuration of into-ledger, and I think the error messages could do a better job of helping the user along in each of these cases:

1.

$ into-ledger -a citizens -csv ~/Downloads/EXPORT.CSV
Opening file: /home/david/.into-ledger/shortcuts.yaml for reading key mappings
No config found in /home/david/.into-ledger/config.yaml. Using all flags
  -a string
        Name of bank account transactions belong to.
  -c string
        Set currency if any.
  -conf string
        Config directory to store various into-ledger configs in. (default "/home/david/.into-ledger")
  -csv string
        File path of CSV file containing new transactions.
  -d string
        Express your date format in numeric form w.r.t. Jan 02, 2006, separated by slashes (/). See: https://golang.org/pkg/time/ (default "01/02/2006")
  -debug
        Additional debug information if set.
  -ic string
        Comma separated list of columns to ignore in CSV.
  -j string
        Existing journal to learn from.
  -o string
        Journal file to write to. (default "out.ldg")
  -s int
        Number of header lines in CSV to skip

    ERROR: Please specify the input ledger journal file 

Unchanged keyboard shortcuts. Skipping overwrite to /home/david/.into-ledger/shortcuts.yaml.

This was confusing because I had created /home/david/.into-ledger/config.yaml, but for some reason into-ledger was saying it couldn't find it. The problem turned out to be that I wrote "accouts" instead of "accounts" in config.yaml. An error like "Unrecognized key 'accouts' in config.yaml" would have made it immediately obvious what the problem was.

  1. 
    $ into-ledger -a citizens -csv ~/Downloads/EXPORT.CSV
    Opening file: /home/david/.into-ledger/shortcuts.yaml for reading key mappings
    Using config: {Currency:USD Journal:/home/david/Documents/finance/y2012-2016.txt DateFormat:1/2/16 Ignore:0,2,5,6,7 Output:/home/david/citizens.out Skip:1}
    2016/12/30 23:24:13 %!(EXTRA []interface {}=[])
    2016/12/30 23:24:13 
    2016/12/30 23:24:13 Unable to parse txn for [DIRECT DEBIT 12/23/16 Checking JCPENNEY 2416     CONCORD      NH            1216 -350.00    $350.00]
    . Got: {Date:0001-01-01 00:00:00 +0000 UTC Desc:JCPENNEY 2416     CONCORD      NH            1216 To: From: Cur:-350 CurName: Key:[59 22 128 102 70 48 233 220 134 3 63 100 103 90 221 81] skipClassification:false}

main.check /home/david/.go/src/github.com/manishrjain/into-ledger/main.go:85 main.parseTransactionsFromCSV /home/david/.go/src/github.com/manishrjain/into-ledger/main.go:336 main.main /home/david/.go/src/github.com/manishrjain/into-ledger/main.go:793 runtime.main /usr/lib/go/src/runtime/proc.go:183 runtime.goexit /usr/lib/go/src/runtime/asm_amd64.s:2086

The problem here was that I had accidentally written "1/2/16" in config.yaml instead of "1/2/06" (as an aside, why does into-ledger require that the date format be specified WRT 1/2/2006? I don't know golang, is something like "%m/%d/%y" not straightforward to implement?), but the error seems to imply that the CSV file is mal-formatted. Would it be possible to instead generate an error like "Invalid dateformat for 'citizens' account in config.yaml"?

3.

$ into-ledger -a citizens -csv ~/Downloads/EXPORT_edit.CSV Opening file: /home/david/.into-ledger/shortcuts.yaml for reading key mappings Using config: {Currency:USD Journal:/home/david/Documents/finance/y2012-2016.txt DateFormat:1/2/06 Ignore:0,2,5,6,7 Output:/home/david/citizens.out Skip:1}

Unchanged keyboard shortcuts. Skipping overwrite to /home/david/.into-ledger/shortcuts.yaml. panic: provide at least two classes

goroutine 1 [running]: panic(0x57f7c0, 0xc42011ec10) /usr/lib/go/src/runtime/panic.go:500 +0x1a1 github.com/jbrukh/bayesian.NewClassifierTfIdf(0xc4201060a0, 0x1, 0xa, 0x0) /home/david/.go/src/github.com/jbrukh/bayesian/bayesian.go:158 +0x4ff main.(*parser).generateClasses(0xc42003bd88) /home/david/.go/src/github.com/manishrjain/into-ledger/main.go:185 +0x30c main.main() /home/david/.go/src/github.com/manishrjain/into-ledger/main.go:789 +0xd23


This one still confuses me. After using the stack trace to debug this crash, it seems like when learning from an existing ledger file, into-ledger won't use accounts under "Equity" or "Assets" as input to the Bayesian classifier. It just so happens that the ledger file I was using for input only uses two accounts, one of which is under "Expenses", and one which is under "Assets". It appears that the Bayesian classifier requires at least two classes, but in this case it was being passed only one (the one account in my file under "Expenses"), so the whole thing crashed. I managed to get into-ledger to learn from my ledger file by adding a bogus transaction with a bogus Expenses account, but I do not understand why I got this error to begin with. Maybe it's as simple as changing the assertion on line 183 from "> 0" to "> 1" (in which case I would suggest the failure message be changed to something like "into-ledger requires at least two unique accounts to classify"), but in the first place I don't get why into-ledger doesn't input Asset and Equity accounts into the classifier.
manishrjain commented 7 years ago
  1. That's hard to detect. YAML parsing isn't strict. If some field is present, which isn't required by the corresponding Go struct, that's just ignored. So, there's no way to detect the absence of an account from an erroneous config file. However, to make things a bit clearer, I have updated the message, highlighted it with red background, and specified which account it can't find.

  2. When parsing the CSV, it's not known which column corresponds to the date. The program tries to parse every column as a date, and if it's successful that column gets picked up. And if no date has been picked up by the time all the columns are parsed, then it throws an error. Without knowing in advance which column corresponds to the date, it's hard to specifically throw an error about invalid date format.

To make things clear, I've added a section in README about common date formats used. So, users can just copy paste the relevant format, without needed to understand Go.

  1. The assets (for bank accounts) and liabilities (for credit cards) etc. account is the one being passed to into-ledger via the -a flag. This makes sense, because the CSV files are downloaded from bank accounts or credit card statements, and then input to ledger. At any point, you're only working with one CSV file from one account.

The classifier is built to aid in categorizing the expenses; which is the hard part. To classify, it must have at least 2 classes; which makes sense. And, most valid use cases of ledger would have at least a few expense categories; without which even ledger wouldn't be very helpful. So, I'd say the thing is working as expected.

manishrjain commented 7 years ago

Haven't had any updates on this. Closing the thread; if you need any further action, feel free to reopen.