lspecia / quest

Pascal2 Harvest project QuEst

Allow feature extraction to bypass tokenizer and/or truecaser #6

Open lefterav opened 11 years ago

lefterav commented 11 years ago

In certain cases, extracting some features requires that no tokenization or truecasing has been applied beforehand (e.g. because the external script performs its own tokenization to match its own models). In the current implementation, the tokenizer and truecaser are hardcoded to run before all other processing and extraction.

The best way to change this would be to devise a dependency list of pre-processors for each feature. Each feature extractor would then be given the output of the pre-processing steps that it has specified.
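
A rough sketch of what such a dependency list could look like, written in Python purely for illustration. None of these names come from the QuEst codebase: `tokenize` and `truecase` are toy stand-ins for the real tools, and `FEATURES`, `preprocess` and `extract` are hypothetical.

```python
import re
from typing import Callable, Dict, Tuple

# Toy stand-ins for the real pre-processors (assumptions, not QuEst code).
def tokenize(text: str) -> str:
    # Put a single space before each punctuation mark.
    return re.sub(r"\s*([.,!?;:])", r" \1", text).strip()

def truecase(text: str) -> str:
    # Naive placeholder: lowercase only the first character.
    return text[:1].lower() + text[1:]

PREPROCESSORS: Dict[str, Callable[[str], str]] = {
    "tokenize": tokenize,
    "truecase": truecase,
}

# Each hypothetical feature declares the ordered pre-processors it needs.
# An empty tuple means "give me the raw input", i.e. bypass both steps.
FEATURES: Dict[str, Tuple[Tuple[str, ...], Callable[[str], float]]] = {
    "token_count": (("tokenize", "truecase"), lambda s: float(len(s.split()))),
    "raw_char_count": ((), lambda s: float(len(s))),
}

def preprocess(text: str, deps: Tuple[str, ...],
               cache: Dict[Tuple[str, ...], str]) -> str:
    """Apply the pre-processors in `deps` in order, reusing cached prefixes."""
    current = text
    for i, name in enumerate(deps):
        key = deps[: i + 1]
        if key not in cache:
            cache[key] = PREPROCESSORS[name](current)
        current = cache[key]
    return current

def extract(text: str) -> Dict[str, float]:
    """Run every feature on the input variant it asked for."""
    cache: Dict[Tuple[str, ...], str] = {}
    return {
        name: fn(preprocess(text, deps, cache))
        for name, (deps, fn) in FEATURES.items()
    }

if __name__ == "__main__":
    print(extract("Hello, World."))
```

With this layout, a feature that wraps an external script with its own tokenizer would simply declare an empty dependency tuple, while the cache keeps shared pre-processing from being recomputed for every feature.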

Shireen35 commented 3 years ago

Has anything been done for this?