simonmichael / hledger

Robust, fast, intuitive plain text accounting tool with CLI, TUI and web interfaces.
https://hledger.org
GNU General Public License v3.0

Parser issues with ssv file date formats #1179

Closed by dwoffinden 4 years ago

dwoffinden commented 4 years ago

I managed to get an export from Revolut, but the format is quite strange and hledger doesn't seem to like it. The fields are semicolon-separated, with dates like "Dec 21, 2019". My first couple of lines:

Completed Date ; Description ; Paid Out (GBP) ; Paid In (GBP) ; Exchange Out; Exchange In; Balance (GBP); Category; Notes
Dec 21, 2019 ; To Daniel Woffinden  ; 1.00 ;  ;  ;  ; 0.00; Transfers;
Dec 19, 2019 ; To Daniel Woffinden  ; 4.00 ;  ;  ;  ; 1.00; Transfers;

I tried to name this revolut.ssv and create a revolut.ssv.rules file like so:

separator ;
skip 1
fields date, description, outgbp, ingbp, fxout, fxin, balance1, category, notes
date-format %b %d, %Y

But hledger -f revolut.ssv print gives me:

hledger: /home/daw/accounts/import/revolut.ssv:1:2:
  |
1 | Completed Date ; Description ; Paid Out (GBP) ; Paid In (GBP) ; Exchange Out; Exchange In; Balance (GBP); Category; Notes
  |  ^
unexpected 'o'

If I remove the header line from the ssv, and the skip 1 line from the rules file, I instead get:

hledger: /home/daw/accounts/import/revolut.ssv:1:2:
  |
1 | Dec 21, 2019 ; To Daniel Woffinden  ; 1.00 ;  ;  ;  ; 0.00; Transfers;
  |  ^
unexpected 'e'

So I suspect something in hledger's parser, earlier than the date parser, really doesn't like semicolon-separated files where the first field is a date in such a format. As far as I can tell the format is correct:

Prelude Data.Time.Calendar Data.Time.Format> parseTimeM True defaultTimeLocale "%b %d, %Y" "Dec 21, 2019" :: Maybe Day
Just 2019-12-21

Manually removing the commas and header wasn't enough to appease hledger; I had to actually reformat the dates to ISO in a text editor before it would accept the file :/

Some details that may be helpful to include:

https://hledger.org/csv.html#date-format, https://hledger.org/csv.html#separator

> hledger --version
hledger 1.16.1

stack install --resolver=lts hledger-lib-1.16.1 hledger-1.16.1 hledger-ui-1.16.1 hledger-web-1.16.1

Linux: Debian buster container inside ChromeOS 79

dwoffinden commented 4 years ago

OK, to get what I wanted out of this file I had to massage it pretty heavily in a text editor until I basically had a CSV with no whitespace padding. So it's entirely possible I was doing something wrong here, or ssv parsing doesn't work as I thought it did.
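For anyone stuck on an older hledger, the manual clean-up described above could be scripted. The following is only a minimal sketch, assuming the export always starts with a header row and that the first field of every other row is a date in the "Dec 21, 2019" style; the function name `normalize_export` is made up for illustration and is not part of hledger.

```python
import csv
import io
from datetime import datetime


def normalize_export(text: str) -> str:
    """Turn a semicolon-separated export into comma-separated CSV with
    ISO dates and no whitespace padding (hypothetical helper)."""
    reader = csv.reader(io.StringIO(text), delimiter=";")
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    header_seen = False
    for row in reader:
        fields = [field.strip() for field in row]
        if not any(fields):
            continue  # skip blank lines
        if header_seen:
            # Assumption: the first column holds dates like "Dec 21, 2019".
            # %b matches English month abbreviations in the default C locale.
            parsed = datetime.strptime(fields[0], "%b %d, %Y")
            fields[0] = parsed.date().isoformat()
        header_seen = True
        writer.writerow(fields)
    return out.getvalue()


if __name__ == "__main__":
    sample = (
        "Completed Date ; Description ; Paid Out (GBP) ; Paid In (GBP) ; "
        "Exchange Out; Exchange In; Balance (GBP); Category; Notes\n"
        "Dec 21, 2019 ; To Daniel Woffinden  ; 1.00 ;  ;  ;  ; 0.00; Transfers;\n"
    )
    print(normalize_export(sample), end="")
```

The result is ordinary CSV, so the rules file would no longer need a separator or date-format setting for it.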

simonmichael commented 4 years ago

Thanks for the report @dwoffinden. What's your hledger --version? separator is a command-line flag (--separator) in hledger 1.16.2, and a CSV rule in hledger 1.16.99 (latest master).

simonmichael commented 4 years ago

Also: when troubleshooting, see if adding a csv: prefix to the file path makes any difference: hledger -f csv:revolut.ssv .... This forces hledger to use only the CSV reader and to show that reader's error message (this happens automatically for the .csv file extension, but not yet for .ssv).
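To make the effect of the prefix concrete, here is a toy model of the reader selection described in this thread. It is only an illustrative sketch, not hledger's actual code (hledger is written in Haskell), and every name in it (`read_file`, `READERS`, `EXTENSIONS`, the two stub readers and their error messages) is hypothetical. It assumes at least one reader is registered.

```python
from typing import Callable, Dict, Optional, Tuple

Reader = Callable[[str], str]  # takes file text, returns a result or raises ValueError


def journal_reader(text: str) -> str:
    # Stub: pretend the journal reader always rejects this input.
    raise ValueError("journal reader: unexpected 'o'")


def csv_reader(text: str) -> str:
    # Stub: pretend the CSV reader fails while parsing the rules file.
    raise ValueError("csv reader: unexpected 's' in rules file")


# Insertion order matters: the first reader's error is the one reported.
READERS: Dict[str, Reader] = {"journal": journal_reader, "csv": csv_reader}
# Assumption: only .csv is mapped to a reader, as in the older release discussed.
EXTENSIONS: Dict[str, str] = {"csv": "csv"}


def split_prefix(path: str, readers: Dict[str, Reader]) -> Tuple[Optional[str], str]:
    """Split 'csv:file.ssv' into ('csv', 'file.ssv'); no known prefix gives (None, path)."""
    prefix, sep, rest = path.partition(":")
    if sep and prefix in readers:
        return prefix, rest
    return None, path


def read_file(path: str, text: str,
              readers: Dict[str, Reader], extensions: Dict[str, str]) -> str:
    forced, real_path = split_prefix(path, readers)
    if forced is None and "." in real_path:
        forced = extensions.get(real_path.rsplit(".", 1)[-1])
    if forced is not None:
        return readers[forced](text)  # only this reader's error can surface
    errors = []
    for reader in readers.values():  # otherwise try every reader in turn
        try:
            return reader(text)
        except ValueError as err:
            errors.append(err)
    raise errors[0]  # report the first reader's error


if __name__ == "__main__":
    for path in ("revolut.ssv", "csv:revolut.ssv"):
        try:
            read_file(path, "Completed Date ; ...", READERS, EXTENSIONS)
        except ValueError as err:
            print(f"{path}: {err}")
```

In this model, an unrecognised extension such as .ssv falls through to the try-everything branch, so the journal stub's error hides the CSV one until the prefix is added.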

simonmichael commented 4 years ago

Yes, I forgot to use your filename so didn't reproduce it the first time. The csv: prefix helps (without it, hledger tries all the readers and returns the error from the first, i.e. the journal reader):

~/src/hledger$ hledger-1.16.2 -f revolut.ssv print
hledger-1.16.2: /Users/simon/src/PLAINTEXTACCOUNTING/hledger/revolut.ssv:1:2:
  |
1 | Completed Date ; Description ; Paid Out (GBP) ; Paid In (GBP) ; Exchange Out; Exchange In; Balance (GBP); Category; Notes
  |  ^
unexpected 'o'

~/src/hledger$ hledger-1.16.2 -f csv:revolut.ssv print
hledger-1.16.2: user error (/Users/simon/src/PLAINTEXTACCOUNTING/hledger/revolut.ssv.rules:1:1:
  |
1 | separator ;
  | ^
unexpected 's'
expecting blank or comment line, conditional block, directive, end of input, field assignment, or field name list
)

simonmichael commented 4 years ago

.tsv and .ssv are recognised in master now, and the docs at https://hledger.org/csv.html#file-extension have been improved.

dwoffinden commented 4 years ago

Thanks! That makes some sense. It seems I need a hledger upgrade :)

With 1.16.2, hledger --separator ';' -f csv:revolut.ssv print seems to work as expected, including parsing all of the fields and the annoying date format.

Thanks again :)