Closed Horsmann closed 5 years ago
IMHO the classifier needs to decide what to do.
In the OpenNLP POS recommender, we mark not-annotated words using a special label during training. When predicting, we drop all the predictions of that special label. This allows us to start training even on incompletely annotated sentences. Yes, we might get a class bias, but for the moment, we are willing to try it out. We might add a configuration option (trait) specifically for the OpenNLP POS recommender to configure whether to train only on fully annotated sentences or also on incomplete sentences, maybe even allowing the user to configure some threshold percentage of annotated tokens at which to consider a sentence sufficiently annotated.
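The special-label trick described above can be sketched roughly like this (a minimal Python illustration; the placeholder name and helper functions are invented here, not the actual recommender code):

```python
# Illustrative sketch of the special-label trick; names are invented,
# not the actual INCEpTION/OpenNLP recommender code.
UNLABELED = "__UNLABELED__"  # special label for tokens without an annotation

def to_training_labels(tokens, annotations):
    """Fill in the placeholder for unannotated tokens so that even
    incompletely annotated sentences can be used for training."""
    return [annotations.get(tok, UNLABELED) for tok in tokens]

def filter_predictions(tokens, predicted_labels):
    """Drop all predictions of the placeholder label before suggesting
    anything to the user."""
    return [(tok, lab) for tok, lab in zip(tokens, predicted_labels)
            if lab != UNLABELED]
```

The class bias mentioned above comes from the placeholder usually being the most frequent "label" in such training data.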
In the OpenNLP NER recommender, we do not do anything special - we leave the feature encoding to OpenNLP. OpenNLP NER internally uses a BIO or similar encoding to mark tokens that are (not) annotated. When predicting, it also internally reverses that encoding and provides us with (multi-token) spans and their corresponding labels (excluding any of the "out" labels generated by the BIO-like encoding).
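For illustration, decoding such a BIO tag sequence back into labeled spans looks roughly like this (a simplified sketch; OpenNLP does this internally, and this function is not its actual API):

```python
def decode_bio(tags):
    """Collapse a BIO tag sequence into (start, end, label) spans over
    token indices; plain 'O' tokens produce no span."""
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        if tag.startswith("B-"):
            if start is not None:
                spans.append((start, i, label))
            start, label = i, tag[2:]
        elif tag.startswith("I-") and start is not None and tag[2:] == label:
            continue  # span continues
        else:  # 'O' or an inconsistent I- tag closes any open span
            if start is not None:
                spans.append((start, i, label))
            start, label = None, None
    if start is not None:
        spans.append((start, len(tags), label))
    return spans

decode_bio(["B-PER", "I-PER", "O", "B-LOC"])  # [(0, 2, 'PER'), (3, 4, 'LOC')]
```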
> IMHO the classifier needs to decide what to do.

The nature of the problem decides what is more reasonable to do. You want helpful predictions as soon as possible.
NER has only a few tags, and assuming that named entities repeat themselves in a text after some time, you will get some useful predictions. The majority being "no-class" is not a problem then. In particular, "no-class" is the right class for most words in a text. Here you want to train on a "no-class" class. Otherwise every word will get a NER-tag prediction that fights the user rather than helping.
In PoS you have somewhere between 12 and ~50 tags in most cases. Every word has one of these pre-defined classes; no class is not a valid class. Furthermore, it might take some time until words are frequent enough to beat the "no-class" bias learned from the unlabeled data.
What is right/better depends on the task. I can of course hard-code something to cover POS and NER as frequent tasks, but in general no single solution fits all. It makes the most sense to give the recommender a hint whether "no-class" is actually a valid class in a task or not.
> It makes the most sense to give the recommender a hint whether "no-class" is actually a valid class in a task or not.
We should IMHO add a trait to the external recommender where the user can configure this and which gets transmitted to the TC side in some parameter section of the request. @Rentier WDYT?
That sounds reasonable. We also need to consider whether we split up the external recommender here, e.g. add a `DkProTcExternalRecommender`, since not all external recommenders make use of these meta labels. We could maybe just add a factory for that. Then we dump certain traits into the request by default.
Sounds good. Just a piece of info in the request (a boolean saying whether no-class is a valid class or not) will do; then I can react on my side.
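Such a request extension could look like this minimal sketch (the field name `noLabelIsValid` and the overall payload shape are illustrative here, not the actual external recommender protocol):

```python
import json

# Hypothetical fragment of a training/prediction request; the field name
# "noLabelIsValid" is invented for illustration, not part of the protocol.
request = {
    "metadata": {
        "layer": "NamedEntity",
        "noLabelIsValid": True,  # NER: most tokens legitimately carry no label
    },
    "documents": [],
}
payload = json.dumps(request)
```

The receiving side (TC) would then branch on this one boolean to decide whether to train a "no-class" class or to drop unannotated tokens entirely.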
@Rentier so we will rename the current external recommender to "Generic external recommender" and then introduce a "DKPro TC external recommender" in addition?
I am wondering why this flag shouldn't be a default piece of information that is served to all recommenders. As explained above, using "no-class" incorrectly might start to fight the user by producing either no predictions at all or tons of wrong predictions.
I don't know what other recommenders you have at the moment, but they should have the same problems?
@Horsmann e.g. for NER-like recommenders, we don't really need it.
Side question: what about the case where there is an annotation (e.g. a POS) but it has no label? (cf. https://github.com/inception-project/inception/issues/325)
> @Horsmann e.g. for NER-like recommenders, we don't really need it.
Well, if the neural net(?) predicts NER, this is true. I was using NER so far and just switched to POS for testing, and noted that we are actually strongly limiting the usefulness of the recommender by neglecting the meaning of no-label.
> Side question: what about the case where there is an annotation (e.g. a POS) but it has no label? (cf. inception-project/inception#325)
I am not overwriting existing user annotations. I skip cases without a value; otherwise I get null values in the backend (i.e. the user created an annotation but hasn't assigned a value yet, and when training is triggered, the reference value is null).
Could you prepare a dummy request that I could use for testing? Then I would have a look.
I created an issue for per-external-recommender configuration in INCEpTION.
What should we call that flag? I had some ideas, but I do not like them that much:

- `useFallbackLabel`
- `useDefaultLabel`
- `padIfNoLabel`
- `useDummyLabel`
- `needsDummyLabel`
We could also just specify what kind of recommender TC should use for us in a field, e.g. add a `classifierType` to the request. Tbh, I like that more.
Btw, can TC do sentence level recommendations? If yes, then I would open an issue here so I do not forget that.
@Rentier Yes, this should work in principle but will probably require more information in the request :) - at the moment it is silently assumed that the classification target is a word/token; this should be communicated in the request accordingly. I will have to set things up differently in the backend for sentence classification. We can also do document-level or multi-sentence classification...
How about `isNoLabelValidValue`? This would be `true` for NER and `false` for POS.
We already send you the granularity (e.g. token, span, sentence). Can it be said that, in general, for token-level recommendations no label is invalid?
Ah, ok, then I would just need a sample request and I can add sentence prediction :)
NER is token-level, but for most words no label is the right/valid answer. So no, depending on the task it might be valid.
NER for us is span, POS is token.
For now, I think we can use the granularity in the metadata to fix this bug. The values in `metadata->anchoringMode` are:

1. `characters`
2. `singleToken`
3. `tokens`
4. `sentences`
With your DKPro TC recommender, we support 2 (`isNoLabelValidValue = false`) and 3 (`isNoLabelValidValue = true`).
@reckart Wdyt?
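The proposed mapping from anchoring mode to the flag can be sketched like this (a hypothetical helper; the function name is invented, the mapping is taken from the discussion above):

```python
def is_no_label_valid(anchoring_mode):
    """Derive the proposed isNoLabelValidValue flag from
    metadata->anchoringMode; only the two modes discussed are handled."""
    mapping = {
        "singleToken": False,  # POS-like: every token must receive a label
        "tokens": True,        # span/NER-like: most tokens carry no label
    }
    if anchoring_mode not in mapping:
        raise ValueError(f"unsupported anchoringMode: {anchoring_mode}")
    return mapping[anchoring_mode]
```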
Sounds good. For token, no label is then invalid, and for span it is valid. I will have to refactor some code. At the moment, I think two "nouns" would be merged because the two notions of token/span are a bit intermixed.
IMHO the following makes sense:
WDYT?
Sounds good, but why is sentence multi-label? Isn't that yet again another sub-case of sentence classification? E.g. sentiment would be a normal single-label task that could be done on sentence level, with (pos|neg|neutral) as labels.
Sorry, I'm always mixing up multi-label and multi-class. Sentence classification would be multi-class, not binary.
In fact, the recommender framework in INCEpTION currently only supports single-label classifiers. If a layer has multiple features (like person, number, etc. on a morphology layer), then separate recommenders need to be configured for each of them and each feature is predicted individually. Supporting multi-label classifiers could be a future extension.
@reckart @Rentier
I tested around with POS and NER predictions and this issue of missing annotation came up again.
POS: If I just ignore not-annotated words, I have no problems. If I consider them and assign them a label for "not labeled yet", the amount of not-annotated tokens will outweigh the actual annotated data, i.e. all predictions will be the dummy label due to its extremely high frequency weight.

NER: If I do not annotate words without an annotation, each word will receive a NER tag. The notion of "no-tag" is missing in the training data.
Thus, I think we need a flag to indicate what to do with not annotated data. Depending on the task, ignoring them makes the predictions useless.
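The class-bias problem described above is easy to see with a toy count (illustrative numbers only, not measured data):

```python
from collections import Counter

# Toy illustration: with only a few tokens annotated, the dummy label
# dominates the training distribution, and a frequency-driven classifier
# will tend to predict it everywhere.
total_tokens = 100
annotated = {"NN": 5, "VB": 3}
labels = ["__UNLABELED__"] * (total_tokens - sum(annotated.values()))
for lab, n in annotated.items():
    labels += [lab] * n
counts = Counter(labels)
majority_label = counts.most_common(1)[0][0]  # '__UNLABELED__' (92 of 100)
```

With 92 of 100 tokens carrying the dummy label, any classifier that leans on class priors will suggest the dummy label for everything, which is exactly the useless behavior described for POS above.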