alekaizer closed this issue 6 years ago
See here for what might be going on: https://explosion.ai/blog/pseudo-rehearsal-catastrophic-forgetting. The next version of the docs references this more explicitly.
Try running fewer iterations, too; you're probably overfitting on those examples.
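To make the pseudo-rehearsal idea from the linked post concrete, here is a minimal sketch of the data-mixing step only. It is not spaCy code: `predict` is a hypothetical stand-in for the original, not yet updated model, and the `(start, end, label)` span format and the `rehearsal_per_new` parameter are assumptions made for illustration.

```python
import random


def build_rehearsal_data(predict, raw_texts, new_examples,
                         rehearsal_per_new=1.0, seed=0):
    """Mix new annotations with the original model's own predictions.

    predict      -- hypothetical callable: text -> list of (start, end, label)
    raw_texts    -- unlabelled texts to be annotated by the original model
    new_examples -- list of (text, spans) pairs for the new entity types
    """
    rng = random.Random(seed)
    n_rehearsal = min(len(raw_texts),
                      int(len(new_examples) * rehearsal_per_new))
    sampled = rng.sample(list(raw_texts), n_rehearsal)
    # Label the sampled texts with the ORIGINAL model, before any updates,
    # so that training also rewards keeping the old predictions.
    rehearsal = [(text, predict(text)) for text in sampled]
    mixed = list(new_examples) + rehearsal
    rng.shuffle(mixed)
    return mixed


def fake_original_model(text):
    # Toy stand-in for the pretrained model: it only "knows" Google is an ORG.
    start = text.find("Google")
    return [(start, start + len("Google"), "ORG")] if start != -1 else []


new_examples = [("what the heck", [(9, 13, "CURSE")])]  # made-up annotation
raw_texts = ["I need to go to Google tomorrow", "there is something"]
mixed = build_rehearsal_data(fake_original_model, raw_texts, new_examples,
                             rehearsal_per_new=2.0)
```

Training on `mixed` instead of `new_examples` alone means each update also sees examples where the expected answer is whatever the original model already predicted, which is what discourages it from forgetting ORG, DATE and the other existing labels.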
Thanks @honnibal, varying the number of iterations gives different outputs:
800 iterations:
Ents in 'I need to go to Facebook, and Google tomorrow':
CURSE I need to
DATE tomorrow
Ents in 'there is something on the 23/10/2018':
WHQ there is
DATE 23102018
Ents in 'Georges Bush is the 43th president of United States of America':
PERSON Georges Bush
GPE of United States of
1000 iterations:
Ents in 'I need to go to Facebook, and Google tomorrow':
CURSE I need to
DATE tomorrow
Ents in 'there is something on the 23/10/2018':
WHQ there is
DATE 23102018
Ents in 'Georges Bush is the 43th president of United States of America':
PERSON Georges
CURSE 43th
GPE of United States of
1500 iterations:
Ents in 'I need to go to Facebook, and Google tomorrow':
CURSE I need to
DATE tomorrow
Ents in 'there is something on the 23/10/2018':
WHQ there is
CURSE 23102018
Ents in 'Georges Bush is the 43th president of United States of America':
PERSON Georges Bush
CURSE 43th
GPE of United States of
2000 iterations:
Ents in 'I need to go to Facebook, and Google tomorrow':
CURSE I need to
DATE tomorrow
Ents in 'there is something on the 23/10/2018':
WHQ there is
CURSE 23102018
Ents in 'Georges Bush is the 43th president of United States of America':
PERSON Georges
PERSON Bush
CURSE 43th
Without adding the new entities, the output is correct:
Ents in 'I need to go to Facebook, and Google tomorrow':
GPE Facebook
ORG Google
DATE tomorrow
Ents in 'there is something on the 23/10/2018':
DATE 23102018
Ents in 'Georges Bush is the 43th president of United States of America':
PERSON Georges Bush
GPE United States of America
Is there a way to calculate the right number of iterations? Also, adding the new entities seems to break other things: for example, the GPE span is incomplete once the new entities are added.
Please see https://spacy.io/usage/training. Training is much improved in v2, and we've tried to give a lot more guidance on how to make good use of it.
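On the question of how many iterations to run: a common, library-agnostic approach is early stopping against a held-out set rather than fixing the count in advance. The sketch below assumes two hypothetical callables that are not part of spaCy: `train_one_iteration`, which performs one pass over the training data, and `evaluate`, which returns a score on held-out examples where higher is better.

```python
def train_with_early_stopping(train_one_iteration, evaluate,
                              max_iterations=2000, patience=5):
    """Stop once the held-out score has not improved for `patience` iterations.

    Returns (best_iteration, best_score). In real use you would also save
    the model each time a new best score is reached and reload that copy.
    """
    best_score = float("-inf")
    best_iteration = 0
    for iteration in range(1, max_iterations + 1):
        train_one_iteration()   # hypothetical: one pass over the training data
        score = evaluate()      # hypothetical: e.g. entity F-score on held-out data
        if score > best_score:
            best_score = score
            best_iteration = iteration
        elif iteration - best_iteration >= patience:
            break               # no improvement for `patience` iterations
    return best_iteration, best_score
```

For this to catch the problem described in this thread, the held-out examples should include the original entity types (ORG, DATE, GPE, PERSON) as well as the new ones, so that a drop in the old labels shows up in the score.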
I followed the entity-adding tutorial on the spaCy website. After adding custom entities, when I run nlp on test data, the predefined entities (ORG, DATE, etc.) are no longer detected, and those spans are mistaken for the new entities.
Here is the code:
and here is the output:
Process finished with exit code 0
Aren't the new entities supposed to be added to the existing ones rather than erasing them?
My Environment
Info about spaCy
Installed models: cache, en, en-1.1.0, en_glove_cc_300_1m_vectors-1.0.0
spaCy version: 1.9.0
Platform: Darwin-16.7.0-x86_64-i386-64bit
Python version: 3.5.3
Operating System: Mac OS
Python Version Used: 3.5.2