winkjs / wink-eng-lite-web-model

English lite language model for Web Browsers
MIT License

"banishes" detected as NOUN rather than VERB #9

Closed. bboyle closed this issue 1 year ago

bboyle commented 1 year ago

The POS tagger is detecting "banishes" as a noun instead of a verb. I'm not sure whether this is a gap in the English model, some other bug, or something I've coded wrong. Here's the test case I put together:

import winkNLP from 'wink-nlp';
import model from 'wink-eng-lite-web-model';

describe('winkNLP', () => {
  const nlp = winkNLP(model);
  const its = nlp.its;

  it('detects banishes as a verb', () => {
    const text = 'Prince Escalus banishes Romeo.';
    const doc = nlp.readDoc(text);
    const tokens = doc.tokens();
    expect(tokens.out(its.pos)).toEqual(['PROPN', 'PROPN', 'VERB', 'PROPN', 'PUNCT']);
  });
});
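When several words are mis-tagged, it can help to see exactly which token positions disagree instead of getting one failed array comparison. The following is a minimal sketch of that idea, not part of wink-nlp: `findPosMismatches` is a hypothetical helper, and it assumes the tagger is passed in as a plain function that returns an array of POS tags for a sentence.

```javascript
// Hypothetical helper (not a wink-nlp API): compares the expected POS tags
// for each sentence against what a tagger function returns and collects
// every position where the two disagree.
function findPosMismatches(tagFn, cases) {
  const mismatches = [];
  for (const { text, expected } of cases) {
    const actual = tagFn(text);
    expected.forEach((tag, index) => {
      if (actual[index] !== tag) {
        mismatches.push({ text, index, expected: tag, actual: actual[index] });
      }
    });
  }
  return mismatches;
}

// Assumed wiring, reusing the same calls as the test case above:
//   const tagFn = (text) => nlp.readDoc(text).tokens().out(its.pos);
//   const report = findPosMismatches(tagFn, [
//     { text: 'Prince Escalus banishes Romeo.',
//       expected: ['PROPN', 'PROPN', 'VERB', 'PROPN', 'PUNCT'] },
//   ]);
//   console.table(report);
```

Passing the tagger in as a function keeps the helper independent of any particular model version, so the same list of sentences can be re-run after a model update.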

sanjayaksaxena commented 1 year ago

Hello @bboyle

Thanks for highlighting the issue. The POS tagger's accuracy, like that of any other tagger, is not 100%, so there is always a likelihood of a few incorrect tags.

We will surely look into the issue, but it may take a bit longer as it involves retraining & revalidation.

We would also request you to consider using the web model instead of this one, as we will not be supporting this one in the future.

Best, Sanjaya

bboyle commented 1 year ago

Thanks Sanjaya. I am using the web model, but apologies for opening the issue on the wrong repo. Interested to hear what you may find out with retraining — I found several similar verbs that were not detected which was pretty confusing. Wink is amazing otherwise :)

sanjayaksaxena commented 1 year ago

Thank you @bboyle for your contribution; we will initiate the work soon and keep you updated here. Moving this issue to the web model's repo.

rachnachakraborty commented 1 year ago

Hello @bboyle

We have initiated work on the web model to address incorrect tagging of verbs.

Shall keep you posted.

Thanks, Rachna

sanjayaksaxena commented 1 year ago

Hello @bboyle

Have released the updated model (version 1.5.0) on NPM.

Best, Sanjaya