Closed mondjef closed 6 years ago
Hi, this is tied to #33. As soon as the PR for this is merged into beancount, it should work exactly as you wrote.
The following code is equivalent, but has these advantages:
So in your import config file, you can write...

    import sys
    from os import path

    from beancount.ingest.importers import csv
    from beancount.ingest.importers.csv import Col
    from smart_importer.predict_postings import PredictPostings

    sys.path.insert(0, path.join(path.dirname(__file__)))


    class SimpliiImporter(csv.Importer):
        '''
        Importer for the Simplii bank.
        Note: This undecorated class can be regression-tested with
        beancount.ingest.regression.compare_sample_files
        '''

        def __init__(self):
            super().__init__(
                {Col.DATE: 'Date',
                 Col.PAYEE: 'Transaction Details',
                 Col.AMOUNT_DEBIT: 'Funds Out',
                 Col.AMOUNT_CREDIT: 'Funds In'},
                'Assets:Simplii:Chequing-9875',
                'CAD',
                [
                    'Filename: .*SIMPLII_.*\.csv',
                    'Contents:\n.*Date, Transaction Details, Funds Out, Funds In'
                ]
            )


    CONFIG = [
        PredictPostings(suggest_accounts=False, training_data='myfile.beancount')(SimpliiImporter)(),
    ]

EDIT: On a further note, as long as pull request #33 is not merged into beancount, the training data must be specified as an argument to the decorator.
perfect! Thanks @tarioch and @johannesjh
ok I got this to work somewhat....
I used @johannesjh's reply as a template, changing only the path to the training_data. Fava loads my config and identifies the file properly; however, it displays the importer as "
I just pulled the latest versions of beancount, fava and smart_importer, but I could not reproduce the issue. My personal config file looks like this. (I only renamed the true bank names, but otherwise copied the code straight out of my actual bookkeeping folder)
./2018.beancount is my main beancount file.
./import.config.py is my beancount.ingest import configuration:

    #!/usr/bin/env python3
    """Import configuration."""

    # Insert our custom importers path here.
    # (In practice you might just change your PYTHONPATH environment.)
    import sys
    from os import path
    sys.path.insert(0, path.join(path.dirname(__file__)))

    from importers import bank1
    from importers import bank2
    from beancount.ingest import extract

    # Setting this variable provides a list of importer instances.
    CONFIG = [
        bank1.SmartBank1Importer(),
        bank2.SmartBank2Importer()
    ]

    # Override the header on extracted text (if desired).
    extract.HEADER = ';; -*- mode: org; mode: beancount; coding: utf-8; -*-\n'

./importers is a python module.
./importers/bank1 and ./importers/bank2 are python modules (only with different bank names). I have defined my importers in the __init__.py files within these modules. For example, ./importers/bank2/__init__.py is structured as follows:
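
    import os

    from beancount.ingest import cache, importer
    from smart_importer.predict_postings import PredictPostings


    class Bank2Importer(importer.ImporterProtocol):
        # custom implementation...
        ...


    # Train on the main beancount file, located two directories above this module.
    @PredictPostings(
        training_data=cache.get_file(
            os.path.abspath(os.path.join(
                os.path.dirname(__file__), '../../2018.beancount'))
        ),
        account='Liabilities:Bank2:Creditcard'
    )
    class SmartBank2Importer(Bank2Importer):
        '''A smart version of the importer.'''
        pass
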
Fava identifies the importer as follows:
File | Importer | Account | |
---|---|---|---|
downloaded.csv | importers.bank2.SmartBank2Importer | Liabilities:Bank2:Creditcard | Extract |
I think I have narrowed it down to the runpy module that is being called by Fava. Using the example from your first post, I was defining my custom importer class directly in my import config file instead of in a separate module outside of it. I am in the process of separating it out like you have done in your most recent example, but I am getting errors at the moment that I am tracking down.
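To illustrate the runpy effect described above, here is a minimal sketch using only the standard library. It assumes the config file is executed with runpy.run_path (the thread only says that Fava calls runpy); the file name and the InlineImporter class are hypothetical stand-ins. A class defined directly in a file run this way reports "<run_path>" as its module, whereas a class imported from a separate module keeps its real module path.

```python
import os
import runpy
import tempfile

# Hypothetical stand-in for an import config that defines its importer inline.
CONFIG_SOURCE = """
class InlineImporter:
    pass

CONFIG = [InlineImporter()]
"""

with tempfile.TemporaryDirectory() as tmp:
    config_path = os.path.join(tmp, "example_import_config.py")
    with open(config_path, "w") as handle:
        handle.write(CONFIG_SOURCE)
    # run_path executes the file and returns its globals as a dict.
    config_globals = runpy.run_path(config_path)

importer = config_globals["CONFIG"][0]
# The class was created while __name__ was runpy's placeholder name.
module_name = type(importer).__module__
print(module_name)  # prints <run_path>
```

Moving the importer class into its own module, as in the ./importers layout above, avoids the placeholder name because the class is then created by a regular import.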
@tarioch: I think you said you are also applying the decorator as a function call right in your beancount config file, right? Have you not been experiencing @mondjef's problems with how fava identifies the importer class?
Nope, not getting this, my config looks like this:

    sys.path.insert(0, path.join(path.dirname(__file__)))

    FooImporter = PredictPostings(suggest_accounts=False)(mt940importer.Importer)

    CONFIG = [
        FooImporter(),
    ]

    extract.HEADER = ''

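For readers new to decorators, the two styles used in this thread (the @ syntax on a subclass, and a plain function call on an existing class) can be compared with a small sketch. The with_label factory below is a hypothetical stand-in written for this illustration; it is not smart_importer's PredictPostings and only mimics the "call with arguments, then apply to a class" shape.

```python
def with_label(label):
    """Hypothetical decorator factory; a stand-in, not smart_importer's API."""
    def decorate(cls):
        # Return a subclass so the class being decorated stays untouched.
        return type(cls.__name__, (cls,), {"label": label})
    return decorate


class PlainImporter:
    label = None


# Form 1: decorator syntax applied to a subclass.
@with_label("smart")
class DecoratedImporter(PlainImporter):
    pass


# Form 2: the same factory applied as an ordinary function call.
CalledImporter = with_label("smart")(PlainImporter)

CONFIG = [DecoratedImporter(), CalledImporter()]
labels = [importer.label for importer in CONFIG]
```

Both entries in CONFIG carry the label, and PlainImporter itself is unchanged, which is why the undecorated class remains available for regression tests.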
ok I am definitely getting warmer...
Here is what I have now. It works fine with bean-identify and bean-extract for the non-smart importer version; with the smart version, however, only bean-identify works, and bean-extract fails with the following errors.
In addition, I implemented the file_account method, which works, but I had trouble getting the 'file' variable and thus resorted to using the 'if file:' statement. This is probably due to my limited knowledge of Python and variable scope.

    bean-extract /beancount/office/example.import /beancount/Downloads/SIMPLII_9875_2018-04-22.csv -e /beancount/personal.beancount
    DEBUG:smart_importer.predict_postings:The Decorator was applied to a class.
    ;; -*- mode: org; mode: beancount; coding: utf-8; -*-
    ** /beancount/Downloads/SIMPLII_9875_2018-04-22.csv
    DEBUG:smart_importer.predict_postings:About to call the importer's extract function to receive entries to be imported...
    DEBUG:smart_importer.predict_postings:Trying to read the importer's file_account, to be used as default value for the decorator's account argument...
    DEBUG:smart_importer.predict_postings:Read file_account Assets:Simplii:Chequing-9875 from the importer; using it as known account in the decorator.
    DEBUG:smart_importer.machinelearning_helpers:Reading training data from _FileMemo "/beancount/personal.beancount"...
    ERROR:root:Importer importers.simplii.SmartSimpliiImporter: "Assets:Simplii".extract() raised an unexpected error:
    ERROR:root:Traceback: Traceback (most recent call last):
      File "/usr/local/lib/python3.6/site-packages/beancount/ingest/extract.py", line 176, in extract
        allow_none_for_tags_and_links=allow_none_for_tags_and_links)
      File "/usr/local/lib/python3.6/site-packages/beancount/ingest/extract.py", line 70, in extract_from_file
        new_entries = importer.extract(file, kwargs)
      File "/usr/local/lib/python3.6/site-packages/smart_importer/predict_postings.py", line 102, in wrapper
        return decorator.enhance_transactions()
      File "/usr/local/lib/python3.6/site-packages/smart_importer/predict_postings.py", line 110, in enhance_transactions
        existing_entries=self.existing_entries)
      File "/usr/local/lib/python3.6/site-packages/smart_importer/machinelearning_helpers.py", line 37, in load_training_data
        assert not errors
    AssertionError

config file

    #!/usr/bin/env python3
    """Example import configuration."""

    # Insert our custom importers path here.
    # (In practice you might just change your PYTHONPATH environment.)
    import sys
    from os import path
    from beancount.ingest import extract
    sys.path.insert(0, path.join(path.dirname(__file__)))

    from importers import simplii

    CONFIG = [
        simplii.SmartSimpliiImporter()
    ]

    # Override the header on extracted text (if desired).
    extract.HEADER = ';; -*- mode: org; mode: beancount; coding: utf-8; -*-\n'

simplii.Importer __init__.py file

    #!/usr/bin/env python3
    from beancount.ingest import extract
    from beancount.ingest.importers import csv
    from beancount.ingest import cache
    from beancount.ingest import regression
    import re
    from os import path
    from smart_importer.predict_postings import PredictPostings


    class SimpliiImporter(csv.Importer):
        '''
        Importer for the Simplii bank.
        Note: This undecorated class can be regression-tested with
        beancount.ingest.regression.compare_sample_files
        '''
        config = {csv.Col.DATE: 'Date',
                  csv.Col.PAYEE: 'Transaction Details',
                  csv.Col.AMOUNT_DEBIT: 'Funds Out',
                  csv.Col.AMOUNT_CREDIT: 'Funds In'}
        # Maps the four digits in the file name to a sub-account name.
        account_map = {'7655': 'Chequing-9875'}

        def __init__(self, account_map=account_map, base_account='Assets:Simplii'):
            # No trailing comma after the call; the original one turned the
            # statement into a throwaway one-element tuple.
            super().__init__(
                self.config,
                None,
                'CAD',
                [r'Filename: .*SIMPLII_\d{4}_.*\.csv',
                 'Contents:\n.*Date, Transaction Details, Funds Out, Funds In'],
                institution='Simplii'
            )
            self.account_map = account_map
            self.base_account = base_account

        def file_account(self, file):
            # Fall back to the base account when no file is given, the file
            # name does not match, or the digits are not in account_map.
            if file:
                # Check the match object before indexing it; re.match
                # returns None when the file name does not match.
                m = re.match(r'.+SIMPLII_(\d{4})_.*', file.name)
                if m:
                    sub_account = self.account_map.get(m.group(1))
                    if sub_account:
                        return self.base_account + ':' + sub_account
            return self.base_account


    @PredictPostings(training_data=cache.get_file('/beancount/personal.beancount'))
    class SmartSimpliiImporter(SimpliiImporter):
        '''
        A smart version of the Simplii importer.
        '''
        pass

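The file-name lookup inside file_account can be tried on its own, without beancount. The following is a minimal sketch, assuming the file name carries four account digits after "SIMPLII_"; the function name account_for_filename and the example map are hypothetical and chosen only for this illustration.

```python
import re


def account_for_filename(filename, account_map, base_account):
    """Return base_account plus a mapped sub-account, or base_account alone."""
    match = re.search(r"SIMPLII_(\d{4})_", filename)
    if match is None:
        # File name does not follow the expected pattern.
        return base_account
    sub_account = account_map.get(match.group(1))
    if sub_account is None:
        # Digits were found but are not listed in the map.
        return base_account
    return base_account + ":" + sub_account


# Hypothetical map keyed by the digits that appear in the file name.
example_map = {"9875": "Chequing-9875"}
matched = account_for_filename(
    "/beancount/Downloads/SIMPLII_9875_2018-04-22.csv", example_map, "Assets:Simplii")
unmatched = account_for_filename("statement.csv", example_map, "Assets:Simplii")
```

Keeping the lookup as a plain function makes it easy to check which key a given download actually produces before wiring it into the importer.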
I have ironed out a few other issues with my modified version of the built-in beancount csv importer. It now works without issue for the non-smart version; however, I cannot get the smart version past loading the training data. I keep getting the assertion error shown previously. Is this a bug?
Never mind...finally got it to work. There was an unused pad entry in my beancount file. The error thrown by smart importer could pass along beancount's own error message to be a bit more intuitive.
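As a rough sketch of that suggestion (not smart_importer's actual code), the bare assertion could be replaced by a check that reports each loader message. The LoadError tuple below is a stand-in that assumes errors expose a message attribute, and check_training_data and the sample message text are hypothetical.

```python
from collections import namedtuple

# Stand-in error record; assumes each error exposes a `message` attribute.
LoadError = namedtuple("LoadError", "source message entry")


def check_training_data(entries, errors):
    """Return entries, or raise an error that lists every loader message."""
    if errors:
        details = "\n".join("  " + error.message for error in errors)
        raise ValueError(
            "Training data has {} error(s):\n{}".format(len(errors), details))
    return entries


# Illustrative message only; the real wording comes from the loader.
sample_errors = [LoadError(None, "Unused Pad entry", None)]
try:
    check_training_data([], sample_errors)
except ValueError as exc:
    report = str(exc)
```

With a report like this, a problem in the ledger would show up by name instead of as a bare AssertionError.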
I am trying to get this to work with the standard csv.Importer provided by beancount, without much success. To be honest, I am fairly green with Python, let alone decorators, so I am sure it is something that I am doing or not doing...
Could somebody please point me in the right direction on what I am doing wrong? I am very interested in this, have some experience with ML, and hope to contribute to this project where I can once I get everything up and running.