simonmichael / hledger

Robust, fast, intuitive plain text accounting tool with CLI, TUI and web interfaces.
https://hledger.org
GNU General Public License v3.0

integrate ledger4 parser #428

Closed. simonmichael closed this issue 1 year ago

simonmichael commented 7 years ago

I have been intending to integrate the parser from @ledger/ledger4 as an additional reader, aiming to improve our support for the modern ledger file format and to facilitate h/ledger interop & more John W. hacking.

I worked on it today as part of the Ledger hackathon and have pushed basic integration to master. hledger now uses only the ledger4 parser (ledger-parse) for files whose suffix is .ledger or .l. (Just as the hledger journal parser is used for files with suffix .journal, .j or (new) .hledger). As before, files with an unrecognised suffix or no suffix are parsed by each reader in turn until one succeeds. The integration is quite basic; as yet only transaction date/description and posting account/amount are recognised. It should be relatively easy to add to this skeleton to support more of the syntax, and help is welcome.

In theory, this makes us now able to parse more ledger files than before. Actually, some ledger files are parsed better by hledger's journal parser than the ledger4 parser (eg: https://github.com/ledger/ledger4/issues/6). So I'm not sure which is the better short term strategy for supporting more ledger files: add the missing constructs to the hledger parser, as we've done in the past, or bring the ledger4 parser up to par. My feeling is that separate specialised parsers can be more useful in the long term, but that I personally should focus on the hledger parser while others develop the ledger4 parser further.

abourget commented 7 years ago

What is the reference for parsing Ledger files? I see this file in the ledger repo: https://github.com/ledger/ledger/blob/next/doc/grammar.y but there is some weirdness around the lot_date_opt identifier: it refers to date instead of lot_date, where lot_date follows.

I'm trying to implement a parser in Go. I'd like it to be able to read "standard" ledger files and build a full AST, for modifying files but also, eventually, for interpreting them and listing registers and balances.

simonmichael commented 7 years ago

I don't really know what the best reference for Ledger's format is. I usually start with the manual but test everything. By the way, have you seen https://github.com/howeyc/ledger ?

simonmichael commented 7 years ago

There's also http://plaintextaccounting.org/quickref/ which is not authoritative but has some info and some links.

simonmichael commented 7 years ago

Back to the topic at hand:

> hledger now uses only the ledger4 parser (ledger-parse) for files whose suffix is .ledger or .l. (Just as the hledger journal parser is used for files with suffix .journal, .j or (new) .hledger). As before, files with an unrecognised suffix or no suffix are parsed by each reader in turn until one succeeds.

Update: at present the ledger4 parser is never used automatically; you must prepend a ledger: prefix to the file path to activate it:

hledger -f ledger:t.ledger print --debug=1
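
For illustration, the selection rule described in this update can be sketched roughly as follows. This is a minimal Python sketch under stated assumptions, not hledger's actual Haskell code: the names READERS, SUFFIXES, AUTOMATIC, choose_readers and parse are hypothetical, and the two reader functions are trivial stand-ins. It assumes an explicit reader prefix wins, then a recognised suffix, and otherwise each automatic reader is tried in turn until one succeeds.

```python
import os


class ParseError(Exception):
    """Raised by a reader that cannot parse the given text."""


def read_journal(text):
    # Hypothetical stand-in for the hledger journal reader.
    return ("journal", text)


def read_ledger4(text):
    # Hypothetical stand-in for the ledger4 (ledger-parse) reader.
    return ("ledger", text)


# Readers selectable with an explicit "name:" prefix (assumed names).
READERS = {"journal": read_journal, "ledger": read_ledger4}
# Suffixes that select a reader automatically; per the update above,
# nothing maps to the ledger4 reader here.
SUFFIXES = {".journal": "journal", ".j": "journal", ".hledger": "journal"}
# Readers tried in turn when neither a prefix nor a known suffix applies.
AUTOMATIC = ["journal"]


def choose_readers(path):
    """Return (real_path, reader_names) for a possibly prefixed path."""
    name, sep, rest = path.partition(":")
    if sep and name in READERS:
        return rest, [name]  # explicit prefix, e.g. ledger:t.ledger
    suffix = os.path.splitext(path)[1]
    if suffix in SUFFIXES:
        return path, [SUFFIXES[suffix]]
    return path, list(AUTOMATIC)  # unrecognised or missing suffix


def parse(path, text):
    """Try each candidate reader in turn until one succeeds."""
    _, names = choose_readers(path)
    errors = []
    for name in names:
        try:
            return READERS[name](text)
        except ParseError as err:
            errors.append(str(err))
    raise ParseError("; ".join(errors))
```

With these assumptions, choose_readers("ledger:t.ledger") selects only the ledger4 stand-in, while a bare t.ledger falls through to the automatic readers.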

abourget commented 7 years ago

Is the code here https://github.com/ledger/ledger4/blob/master/ledger-parse/Ledger/Parser/Text.hs all that is needed to parse ledger files?! Or is there an implementation somewhere else?

simonmichael commented 7 years ago

Yes, or more precisely, our copy of it... but also no, because it has not been battle tested; for example, it doesn't parse the amounts and prices, nor does it "cook" or apply the usual ledger-ish parsing semantics to the raw constructs. You can see some of that happening in https://github.com/simonmichael/hledger/blob/master/hledger-lib/Hledger/Read/LedgerReader.hs.

simonmichael commented 1 year ago

This parser never advanced past the prototype stage, and was later removed from hledger. Current strategy is to make the main journal parser more capable. Closing.