In certain cases, feature extraction requires that no tokenization or truecasing has been applied beforehand (e.g. because an external script performs its own tokenization to match its own models). In the current implementation, the tokenizer and truecaser are hardcoded to run before all other processing and extraction steps.
The best way to change this would be to define, for each feature, a dependency list of pre-processors. Each feature extractor would then receive the output of exactly the pre-processing steps it has declared.
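A minimal sketch of how such a dependency list could look, written in Python for illustration only. All names here (`PREPROCESSORS`, `FeatureExtractor`, `requires`, `run_extractors`, and the two example extractors) are hypothetical and not part of the existing code base, and the `tokenize` and `truecase` functions are naive stand-ins for the real tools.

```python
from typing import Callable, Dict, List, Tuple


def tokenize(text: str) -> str:
    """Naive stand-in for the real tokenizer: pads common punctuation with spaces."""
    for mark in ".,!?":
        text = text.replace(mark, f" {mark} ")
    return " ".join(text.split())


def truecase(text: str) -> str:
    """Naive stand-in for the real truecaser: lowercases the first character."""
    return text[:1].lower() + text[1:]


# Hypothetical registry mapping pre-processor names to callables.
PREPROCESSORS: Dict[str, Callable[[str], str]] = {
    "tokenize": tokenize,
    "truecase": truecase,
}


class FeatureExtractor:
    # Ordered names of the pre-processors this extractor depends on.
    requires: Tuple[str, ...] = ()

    def extract(self, text: str) -> float:
        raise NotImplementedError


class TokenCount(FeatureExtractor):
    """Needs tokenized, truecased input."""
    requires = ("tokenize", "truecase")

    def extract(self, text: str) -> float:
        return len(text.split())


class RawLength(FeatureExtractor):
    """Declares no dependencies, so it receives the untouched input."""
    requires = ()

    def extract(self, text: str) -> float:
        return len(text)


def run_extractors(raw: str, extractors: List[FeatureExtractor]) -> Dict[str, float]:
    # Cache each distinct pre-processing chain so it is computed only once.
    cache: Dict[Tuple[str, ...], str] = {(): raw}
    results: Dict[str, float] = {}
    for extractor in extractors:
        chain = tuple(extractor.requires)
        if chain not in cache:
            text = raw
            for name in chain:
                text = PREPROCESSORS[name](text)
            cache[chain] = text
        results[type(extractor).__name__] = extractor.extract(cache[chain])
    return results
```

With this layout, an extractor that wraps an external script with its own tokenization would simply declare an empty `requires` and be handed the raw text, while other extractors could keep receiving tokenized and truecased input. Caching by chain is one possible design choice to avoid re-running the same pre-processing for every feature.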