When fitting the CRFSlotFiller on an intent, dropout is set once for all entities. However, in practice we'd like a high dropout for automatically_extensible entities (since at inference time we'll try to parse unseen entity values) and a lower dropout for non automatically_extensible entities (since we've seen all of their values at training time).
Work done
Added the ability to filter the entities to which the CustomEntityMatchFactory is applied.
This entity_filter argument is passed in the args of the CustomEntityMatchFactory configuration.
Currently only one flag can be set in the filter: automatically_extensible:
if True, the feature is computed only for automatically_extensible entities
if False, the feature is computed only for non automatically_extensible entities
Updated the default language configuration accordingly.
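As an illustration, here is a minimal sketch of how two feature configurations could pair the filter with different dropouts. Only entity_filter, args and automatically_extensible come from the description above. The "factory_name" and "drop_out" keys, the dropout values, the entity names and the filter_entities helper are assumptions made for this example and may not match the actual implementation.

```python
# Hypothetical feature configurations: one per entity group, so that each
# group can get its own dropout. Key names other than "args" and
# "entity_filter" are assumed for illustration.
extensible_match = {
    "factory_name": "entity_match",  # assumed factory name
    "args": {"entity_filter": {"automatically_extensible": True}},
    "drop_out": 0.5,  # assumed value: high dropout, unseen values expected
}
closed_match = {
    "factory_name": "entity_match",  # assumed factory name
    "args": {"entity_filter": {"automatically_extensible": False}},
    "drop_out": 0.1,  # assumed value: low dropout, all values seen in training
}


def filter_entities(entities, entity_filter):
    """Return the sorted names of entities matching every flag in the filter.

    Hypothetical helper, not the library's own code.
    """
    return sorted(
        name
        for name, properties in entities.items()
        if all(properties.get(key) == value
               for key, value in entity_filter.items())
    )


# Made-up entities, used only to show the effect of the filter.
entities = {
    "city": {"automatically_extensible": True},
    "room": {"automatically_extensible": False},
}

print(filter_entities(entities, extensible_match["args"]["entity_filter"]))
print(filter_entities(entities, closed_match["args"]["entity_filter"]))
```

With these made-up entities, the first configuration applies only to city and the second only to room, so each group is matched with its own dropout.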
Checklist:
[x] My PR is ready for code review
[x] I have added some tests, if applicable, and run the whole test suite, including linting tests
[x] I have updated the documentation, if applicable