CogComp / cogcomp-nlp

CogComp's Natural Language Processing Libraries and Demos: Modules include lemmatizer, ner, pos, prep-srl, quantifier, question type, relation-extraction, similarity, temporal normalizer, tokenizer, transliteration, verb-sense, and more.
http://nlp.cogcomp.org/

Pos: Tagger performance is insufficient on build? #743

Open chrisoutwright opened 3 years ago

chrisoutwright commented 3 years ago

The following test is run after the build for POSAnnotator:

    @Test
    public void testAnnotatorDiff() {
        POSAnnotator annotator = new POSAnnotator();

        TextAnnotation record =
                BasicTextAnnotationBuilder.createTextAnnotationFromTokens(refTokens);
        try {
            annotator.getView(record);
        } catch (AnnotatorException e) {
            fail("AnnotatorException thrown!\n" + e.getMessage());
        }

        TokenLabelView view = (TokenLabelView) record.getView(ViewNames.POS);
        if (refTags.size() != view.getNumberOfConstituents()) {
            fail("Number of tokens tagged in annotator does not match actual number of tokens!");
        }
        int correctCounter = 0;
        for (int i = 0; i < refTags.size(); i++) {
            if (view.getLabel(i).equals(refTags.get(i))) {
                correctCounter++;
            }
        }
        double result = ((double) correctCounter) / refTags.size();

        if (result < thresholdAcc) {
            fail("Tagger performance is insufficient: " + "\nProduced: " + result + "\nExpected: "
                    + thresholdAcc);
        }

    }

I get the following error:

    java.lang.AssertionError: Tagger performance is insufficient:
    Produced: 0.3541666666666667
    Expected: 0.95

I printed out all the tags and the corresponding POS labels, and testDiff() seems to fail only at these:

    NN VBD shot
    JJR RBR more

So why is testAnnotatorDiff() failing? There are a lot of NN labels coming back from getLabel(i). What does testAnnotatorDiff() do differently? I also don't understand why the result of annotator.getView(record) is not assigned to any View object.

I just want to get the pipeline working, but the required components need to build first, and that is failing in my case.
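
The per-token comparison described above can be sketched roughly as follows. This is only a minimal, self-contained illustration: the class TagDiff and its methods are hypothetical and not part of cogcomp-nlp, plain lists stand in for refTokens, refTags and the labels read from view.getLabel(i), and the data in main is made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of cogcomp-nlp.
class TagDiff {

    // Fraction of positions where the produced tag equals the reference tag,
    // computed the same way as the accuracy in testAnnotatorDiff().
    static double accuracy(List<String> refTags, List<String> producedTags) {
        if (refTags.size() != producedTags.size()) {
            throw new IllegalArgumentException("Tag lists differ in length");
        }
        int correct = 0;
        for (int i = 0; i < refTags.size(); i++) {
            if (producedTags.get(i).equals(refTags.get(i))) {
                correct++;
            }
        }
        return ((double) correct) / refTags.size();
    }

    // One tab-separated line per position where the two tag lists disagree.
    static List<String> mismatches(List<String> tokens, List<String> refTags,
            List<String> producedTags) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < refTags.size(); i++) {
            if (!producedTags.get(i).equals(refTags.get(i))) {
                out.add(i + "\t" + tokens.get(i) + "\texpected=" + refTags.get(i)
                        + "\tproduced=" + producedTags.get(i));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up data: a tagger that labels every token NN.
        List<String> tokens = Arrays.asList("The", "dog", "barked");
        List<String> refTags = Arrays.asList("DT", "NN", "VBD");
        List<String> produced = Arrays.asList("NN", "NN", "NN");

        System.out.println("accuracy = " + accuracy(refTags, produced));
        for (String line : mismatches(tokens, refTags, produced)) {
            System.out.println(line);
        }
    }
}
```

Listing the mismatching positions next to the accuracy makes it easier to see whether the tagger is wrong on a few tokens or is falling back to a single label such as NN almost everywhere.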