ericcanadas opened this issue 6 years ago
@EricC91 Yes, it is beneficial to keep sentences with only "O" labels. These are so-called negative examples. Having these selected negatives in your training set makes your model much more robust against false positives (tagging where the model should not tag).
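To make the idea concrete, here is a minimal sketch of what an all-"O" negative example looks like in BIO format, alongside a positive one. The `ANIMAL` label and the helper function are hypothetical, used only for illustration:

```python
# Two training sentences in BIO (token, tag) format.
# "ANIMAL" is a hypothetical label used only for illustration.
positive = [("The", "O"), ("husky", "B-ANIMAL"), ("ran", "O"), ("away", "O")]
# An all-"O" negative example: no token carries an entity tag.
negative = [("The", "O"), ("dog", "O"), ("is", "O"), ("brown", "O")]

def has_entities(sentence):
    """Return True if any token carries a non-'O' tag."""
    return any(tag != "O" for _, tag in sentence)

# Keeping `negative` in the training set gives the model explicit
# evidence of contexts where it should NOT tag anything.
training_set = [positive, negative]
```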
From my experience building many custom NER models, it is best to add negative examples in small batches. The ones to add in each iteration are the sentences where the current model incorrectly tags entities. After a couple of iterations over some random examples, the model learns pretty quickly. The added examples should be diverse.
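The iterative procedure above can be sketched as follows. This is an assumption about the workflow, not code from any specific library; `model_tags` is a hypothetical predicate wrapping your trained model:

```python
import random

def mine_negative_batch(model_tags, candidates, batch_size=20, seed=0):
    """Pick a small, diverse batch of false positives: sentences
    that contain no real entities but where the current model
    still tags something.

    model_tags(sentence) -> bool is a hypothetical predicate that
    returns True when the trained model predicts any entity.
    """
    false_positives = [s for s in candidates if model_tags(s)]
    random.Random(seed).shuffle(false_positives)  # promote diversity
    return false_positives[:batch_size]

# After each training iteration: label the mined batch as all-"O"
# negative examples, add it to the training set, and retrain.
```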
Hi,
More a question than an issue: is it useful to keep sentences that contain only "O" labels in the training set? Example: the sentence "The dog is brown", which contains no entities.