Closed (lucianojs closed this 7 years ago)
The loading process for binary CASes does not allow you to control what is loaded.
But you can easily remove the annotations yourself. Just create a new component:
import static org.apache.uima.fit.util.JCasUtil.select;

import org.apache.uima.analysis_engine.AnalysisEngineProcessException;
import org.apache.uima.fit.component.JCasAnnotator_ImplBase;
import org.apache.uima.jcas.JCas;

import de.tudarmstadt.ukp.dkpro.core.api.ner.type.NamedEntity;

public class NamedEntityRemover
    extends JCasAnnotator_ImplBase
{
    @Override
    public void process(JCas aJCas) throws AnalysisEngineProcessException
    {
        // Remove every NamedEntity annotation from the CAS indexes
        select(aJCas, NamedEntity.class).forEach(aJCas::removeFsFromIndexes);
    }
}
The above code should work with UIMA 2.10.1, uimaFIT 2.3.0 and Java 8.
Then just add the new component to your pipeline.
If you use the latest version of DKPro Core from this GitHub repo, you may also find our basic evaluation code useful. For example, check out OpenNlpNamedEntityRecognizerTrainerTest.java.
It worked perfectly, thank you.
I need to evaluate the F-measure of my NER model, which was generated from annotations in BinaryCAS format, but it is not possible to leave out the NER annotations when loading the data for comparison. I am looking for something like the readNamedEntity parameter of Conll2002Reader.
I tried converting BinaryCAS to CoNLL 2002, but realized after the conversion that this format does not support multiple annotations on the same token.
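For context, the F-measure comparison described above can be illustrated without any UIMA or DKPro Core dependency. The following is only a minimal sketch, assuming exact-match scoring (a predicted entity counts as correct only if its begin offset, end offset, and label all match a gold entity). The `NerFMeasure` class, the `Span` type, and the `score` method are hypothetical names invented for this illustration; they are not part of DKPro Core and this is not its evaluation code. Because spans are compared by offsets rather than per token, overlapping annotations on the same token pose no problem here.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class NerFMeasure {

    /** Hypothetical value type: an entity identified by character offsets and label. */
    static final class Span {
        final int begin;
        final int end;
        final String label;

        Span(int begin, int end, String label) {
            this.begin = begin;
            this.end = end;
            this.label = label;
        }

        @Override
        public boolean equals(Object other) {
            if (!(other instanceof Span)) {
                return false;
            }
            Span s = (Span) other;
            return begin == s.begin && end == s.end && label.equals(s.label);
        }

        @Override
        public int hashCode() {
            return Objects.hash(begin, end, label);
        }
    }

    /** Returns {precision, recall, F1} under exact-match scoring. */
    static double[] score(Set<Span> gold, Set<Span> predicted) {
        // True positives: spans present in both the gold and the predicted set
        Set<Span> truePositives = new HashSet<>(gold);
        truePositives.retainAll(predicted);
        int tp = truePositives.size();

        double precision = predicted.isEmpty() ? 0.0 : (double) tp / predicted.size();
        double recall = gold.isEmpty() ? 0.0 : (double) tp / gold.size();
        double f1 = (precision + recall) == 0.0
                ? 0.0
                : 2 * precision * recall / (precision + recall);
        return new double[] { precision, recall, f1 };
    }

    public static void main(String[] args) {
        // Made-up example data: three gold entities, two predictions,
        // one of which has the right offsets but the wrong label.
        Set<Span> gold = new HashSet<>(Arrays.asList(
                new Span(0, 5, "PER"), new Span(10, 16, "LOC"), new Span(20, 27, "ORG")));
        Set<Span> predicted = new HashSet<>(Arrays.asList(
                new Span(0, 5, "PER"), new Span(10, 16, "ORG")));

        double[] result = score(gold, predicted);
        System.out.println("precision=" + result[0]
                + " recall=" + result[1] + " f1=" + result[2]);
    }
}
```

In a UIMA pipeline, the gold set would be collected from the NamedEntity annotations before they are removed, and the predicted set from the annotations the model adds afterwards; how those two sets are kept apart is left open in this sketch.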