Closed mondjef closed 1 year ago
Currently working with this config file....
import sys
from os import path
from beancount.ingest import extract

# Make the local modules (importers, comparator) importable from this config's directory.
sys.path.insert(0, path.dirname(__file__))

from importers import simplii
from FingerPrintDuplicatesComparator import DuplicatesComparator
from smart_importer import apply_hooks, PredictPayees, PredictPostings
from smart_importer.detector import DuplicateDetector

# Use a distinct name so the instance does not shadow the simplii module.
simplii_importer = simplii.SimpliiImporter()
apply_hooks(
    simplii_importer,
    [PredictPostings(), PredictPayees(), DuplicateDetector(comparator=DuplicatesComparator())],
)

CONFIG = [
    simplii_importer,
]
hi, existing entries can be specified as training data when calling bean-extract, see https://github.com/beancount/smart_importer#specifying-training-data ...does this help with what you want to achieve?
hi, not really.... I have read that but cannot understand where and how to feed the training data into the importer decorated by the smart import hooks...i.e. what do I need to do, where, and what parameter is used to feed it in. I have tried 'training_data' and 'existing_entries' without success in many places.
bean-extract....this is a command-line tool, no? How can this be integrated into my python smart importer? Is the workflow more like....use the bean-extract tool to read and process existing transactions, then take its output and somehow use it in the smart importer? As you can see I am a bit lost and confused with this part....
Some explanations about how the tools interact:
bean-extract is beancount's command-line tool for importing transactions from CSV or other sources. When using this tool, you can specify the --existing <BEANCOUNT_FILE> argument. The existing entries from this file are then used in the entire import process. This works out of the box; there is no need to configure or program anything regarding existing entries in your import config file. The flow is as follows:
... --existing argument.
You can alternatively use fava instead of bean-extract. fava assumes that the file you are viewing/editing already contains the existing entries. The flow is otherwise very similar to bean-extract:
... insert-entry option, see fava's import help page and some more explanation in fava #1262.
A suggestion regarding your file structure: my setup works like this, and it might also work for you.
I have the following file and folder structure:
main.beancount (this is the overall main beancount file, it includes each and every year)
2022/2022.beancount (this is the main file for year 2022, obviously... it may include further sub-files)
2023/2023.beancount (this is the main file for year 2023)
And I use smart_importer together with fava like this:
my file/folder setup is similar to yours, with the exception that my main.beancount file points to a single fiscal file. I also have an alternative main.beancount file that is exactly like yours: it includes all prior fiscal files (or selected ones) that I want to use for ML training, so that I can filter/restrict what is fed in as training data.
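For illustration only, the two layouts discussed here (a main file that includes every year, and an alternative file restricted to selected years) could be generated with a small sketch like the following. The helper names `yearly_paths` and `render_includes` are hypothetical; the only beancount feature assumed is the `include "<path>"` directive.

```python
def yearly_paths(years):
    """Map years to the <year>/<year>.beancount layout described above."""
    return [f"{year}/{year}.beancount" for year in years]


def render_includes(paths):
    """Return one beancount include directive per file path."""
    return "\n".join(f'include "{p}"' for p in paths) + "\n"


# Main file: includes every year, so the full history is visible.
main_text = render_includes(yearly_paths([2022, 2023]))

# Alternative file: only selected years, e.g. to restrict training data.
training_text = render_includes(yearly_paths([2022]))

print(main_text, end="")
```

Writing `main_text` to main.beancount and `training_text` to a separate file would reproduce the two entry points; which of them fava is pointed at then decides which entries it treats as existing.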
Ok, this is a bit more clear to me now....as I use fava and want to have everything remain in the work flow that fava uses for importing I would need a way to ignore what fava passes as 'existing entries' and replace it with what I want. My guess is I would need to do this somewhere in my importer itself?
My guess is I would need to do this somewhere in my importer itself?
maybe... but you are leaving the paths of what seems to be the default / recommended usage of fava, so this will certainly take some tinkering.
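A minimal sketch of what such tinkering might look like, under stated assumptions: the caller (fava or bean-extract) is assumed to pass its entries to the importer's `extract` method through an `existing_entries` keyword argument, and the wrapper is assumed to be applied after `apply_hooks` so that the hooks receive the substituted entries. `use_fixed_training_data` and `load_training_entries` are hypothetical helpers, not part of smart_importer; `beancount.loader.load_file` is the only library call used.

```python
def load_training_entries(path):
    """Load all entries from a beancount file (requires beancount)."""
    # Imported lazily so the wrapper below has no hard dependency on beancount.
    from beancount import loader

    entries, _errors, _options = loader.load_file(path)
    return entries


def use_fixed_training_data(importer, get_entries):
    """Patch importer.extract so it ignores the caller's existing entries.

    get_entries is a zero-argument callable returning the entries to pass
    on instead (hypothetical helper, not a smart_importer API).
    """
    wrapped_extract = importer.extract

    def extract(file, existing_entries=None):
        # Discard whatever the caller supplied and substitute the
        # entries from the dedicated training source.
        return wrapped_extract(file, existing_entries=get_entries())

    importer.extract = extract
    return importer


# Hypothetical usage in the import config, after apply_hooks(...):
# use_fixed_training_data(
#     my_importer, lambda: load_training_entries("training.beancount")
# )
```

One consequence of this design is that every hook sees the substituted entries, so a duplicate detector would also compare against the training file rather than the file open in fava. That is only safe if the training file is a superset of the current ledger.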
...as I use fava and want to have everything remain in the work flow that fava uses for importing...
Yes, keeping everything aligned with fava's intended workflow is exactly what I would recommend to you as well. Taking this philosophy one step further: Simply open your main.beancount file (which includes all other files) in fava. Problem solved, no need to tinker. :-)
Let's close this ticket. (I don't think the smart_importer project can or should change for what we've been discussing in this ticket).
PS just to make sure, I guess (I hope) you are aware of the fact that fava allows you to edit any file referenced by the main file? the file chooser dropdown is in the top left corner:
I was not aware of the file chooser in fava and I don't seem to have this in my fava instance even though I have include statements that reference other files in the beancount file that I am currently pointing fava at. Is there anything special that needs to be done to have this option show in fava? In light of this option I will have to rethink my strategy a bit so that I tinker with fava as little as possible.
edit: from what I can tell there might be a shortcoming of the docker version of fava and the environment variable that is passed in at the time of container creation to indicate the bean file. I can get the file chooser to show up if I supply multiple files in this environment variable separated by a comma and space (does not work without the space) and surrounded by quotes. However, it does not work well...it appears as though the files are just being clobbered depending on the order, so I am not sure the docker image supports this feature.
right, this was an older screenshot taken from the internet, sorry.
I tried again using a recent version of fava on my local machine... it looks like the file chooser has moved to the editor's file menu, as in this screenshot:
I have smart importers set up, decorated and working...however I recently changed my beancount file structure from a single file to multiple files by year for a number of reasons. As such, I have a 'main' beancount file that I point fava at that includes some global files along with the current year beancount file where new entries are entered. At the beginning of every year I don't want my smart importer training on no or limited data but instead want to point the smart importer to an alternative beancount file that includes all historical entries for training.
I have pored over the code and tried all sorts of things to pass in this alternative training file without luck...I can't even get it to try to load it, as any parameter I have tried thus far results in an unexpected argument error being raised, so I know I am not doing it right.
I saw issue #61 that also discusses this but still could not connect the dots.
Can someone please provide more guidance on what needs to be done, or provide a bit more info in the documentation around this?