Open yatin-rxlogix opened 4 years ago
When you create your own dataset in "/path/to/your/P/folder", executing:
    build_extraction_dataset(os.path.join(settings.BASE_DIR, 'data', 'P'),
                             os.path.join(get_python_lib(), 'talon/signature/data/train.data'))
makes build_extraction_dataset replace the contents of 'talon/signature/data/train.data' with data built from your "/path/to/your/P/folder".
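As a rough illustration of what "annotated data" means here: the emails in the P folder mark signature lines with a `#sig#` prefix (the marker mentioned later in this thread). The sketch below is not talon's implementation. `label_lines` is a hypothetical helper, and the 1 / -1 labels are an assumption chosen for the example; it only shows how such a marker separates signature lines from the rest of a message.

```python
SIGNATURE_ANNOTATION = '#sig#'  # marker mentioned in this thread


def label_lines(annotated_body):
    """Split an annotated message into (line, label) pairs.

    Lines starting with the marker get label 1 and have the marker
    stripped; every other line gets label -1. The label values are an
    assumption made for this illustration.
    """
    labeled = []
    for line in annotated_body.splitlines():
        if line.startswith(SIGNATURE_ANNOTATION):
            labeled.append((line[len(SIGNATURE_ANNOTATION):], 1))
        else:
            labeled.append((line, -1))
    return labeled
```

If signature lines in your own files are not prefixed this way, they would all come out labeled as ordinary text, which is one thing worth checking before retraining.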
Then you train the classifier with the new 'talon/signature/data/train.data':
    c.train(c.init(), os.path.join(get_python_lib(), 'talon/signature/data/train.data'),
            os.path.join(get_python_lib(), 'talon/signature/data/classifier'))
Executing this code overwrites 'talon/signature/data/classifier'.
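The two steps above can be combined into one helper. This is a minimal sketch, not part of talon: `talon_data_paths` and `retrain_signature_classifier` are hypothetical names, and it assumes `build_extraction_dataset` lives in `talon.signature.learning.dataset` and that `c` is `talon.signature.learning.classifier`. The talon imports sit inside the function so the path helper can be used on its own.

```python
import os


def talon_data_paths(site_packages):
    """Return the (train.data, classifier) paths used in this thread."""
    data_dir = os.path.join(site_packages, 'talon', 'signature', 'data')
    return (os.path.join(data_dir, 'train.data'),
            os.path.join(data_dir, 'classifier'))


def retrain_signature_classifier(p_folder, site_packages):
    """Rebuild train.data from p_folder, then retrain the classifier."""
    # Imported lazily; assumes talon is installed and exposes these
    # modules as described in the lead-in.
    from talon.signature.learning import classifier as c
    from talon.signature.learning.dataset import build_extraction_dataset

    train_data, classifier_file = talon_data_paths(site_packages)
    # Step 1: regenerate the training data from the annotated emails.
    build_extraction_dataset(p_folder, train_data)
    # Step 2: train a fresh classifier and save it next to train.data.
    c.train(c.init(), train_data, classifier_file)
    return train_data, classifier_file
```

Passing the site-packages directory explicitly (for example the value returned by `get_python_lib()`) makes it easier to see which installation of talon is being modified, which matters when several virtual environments are in play.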
When you call talon.init(), it executes:
    def init():
        register_xpath_extensions()
        if ML_ENABLED:
            signature.initialize()
signature.initialize() in turn runs:
    EXTRACTOR_FILENAME = os.path.join(DATA_DIR, 'classifier')
    EXTRACTOR_DATA = os.path.join(DATA_DIR, 'train.data')

    def initialize():
        extraction.EXTRACTOR = classifier.load(EXTRACTOR_FILENAME,
                                               EXTRACTOR_DATA)
In extraction.py, _mark_lines passes EXTRACTOR as the classifier to is_signature_line.
So, after training, the files behind EXTRACTOR_DATA and EXTRACTOR_FILENAME are already built from your raw email data annotated with #sig#. Once you call talon.init(), you are using your own trained classifier.
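To confirm that the retrained classifier is the one being loaded, it can help to check that both files were actually rewritten before calling talon.init(). The following is a rough sketch: `stale_files` and `extract_with_custom_classifier` are hypothetical helper names, and the `signature.extract(message, sender=...)` call follows the usage shown in the talon README.

```python
import os


def stale_files(paths, since):
    """Return paths that are missing or were last modified before `since`.

    `since` is a Unix timestamp, e.g. time.time() taken just before
    retraining started.
    """
    stale = []
    for path in paths:
        if not os.path.exists(path) or os.path.getmtime(path) < since:
            stale.append(path)
    return stale


def extract_with_custom_classifier(message, sender):
    """Initialize talon and run signature extraction on one message."""
    # Imported lazily; assumes talon is installed.
    import talon
    from talon import signature

    # init() loads the classifier from talon/signature/data, so it has
    # to run after retraining has finished.
    talon.init()
    return signature.extract(message, sender=sender)
```

If `stale_files` reports either path, training most likely wrote to a different talon installation than the one being imported.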
I used the following statements to train a classifier on my custom dataset, but I am not able to use this custom classifier for signature extraction. Can somebody help me figure out what I am doing wrong?