shaswat-indian closed 3 years ago
Hey @shaswat-indian
Why do you want to use this component multiple times in a single pipeline? I'm not sure I understand your use case.
After looking at your commit, I think I now understand what you are trying to achieve. You want to include composite entities in patterns of other composite entities.
Let's say you have a composite pattern C1 = @A + @B and another composite pattern C2 = @C1 + @B. If you have an utterance A + B + B, a first pass of the component would yield C1 + B and then a second pass would yield C2. Is that right?
To be honest, I don't like the fact that you are using multiple instances of the component to achieve this. Instead, this could be implemented by just reapplying the component logic as long as something has changed, i.e. a pattern has matched. You could probably get away with a simple while True loop that breaks once no change is detected. Benefits of this approach would be:
Would this be sufficient to solve your problem, or am I missing something that would actually require multiple instances of the component?
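For illustration, here is a minimal sketch of the reapply-until-nothing-changes idea. It assumes the recognized entities have already been flattened into a placeholder string such as "@A @B @B". The names PATTERNS, apply_once and apply_until_stable are hypothetical and are not part of the component's actual API.

```python
import re

# Hypothetical pattern table: composite name -> regex over entity placeholders.
PATTERNS = {
    "C1": r"@A @B",
    "C2": r"@C1 @B",
}


def apply_once(text, patterns):
    """Replace every match of each pattern with its composite placeholder."""
    changed = False
    for name, pattern in patterns.items():
        # The lookahead keeps "@B" from matching inside a longer name like "@B2".
        regex = "(?:" + pattern + r")(?!\w)"
        new_text = re.sub(regex, "@" + name, text)
        if new_text != text:
            text, changed = new_text, True
    return text, changed


def apply_until_stable(text, patterns):
    """Reapply all patterns until a full pass leaves the text unchanged."""
    while True:
        text, changed = apply_once(text, patterns)
        if not changed:
            break
    return text
```

With these patterns, apply_until_stable("@A @B @B", PATTERNS) returns "@C2". One caveat of this sketch: patterns that rewrite into each other in a cycle (say "@C1" to "@C2" and back) would never settle, so a real implementation would probably want a cap on the number of passes.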
The solution you suggested seems to work for simple use cases like the one you described. Consider a use case like this:
{
  "composite_entities": [
    {
      "name": "C1",
      "patterns": [
        "@A @B"
      ]
    },
    {
      "name": "C2",
      "patterns": [
        "(@A)? @B"
      ]
    }
  ]
}
In this case, for some input text matching the pattern "@A @B", we get both C1 as "@A @B" and C2 as " @B". I don't expect " @B" to be detected as part of C2, though, since it is already part of C1, which in my view should have higher priority.
This can be resolved with two separate files: the first instance of CompositeEntityExtractor takes the first file as input, and the second instance takes the second file.
{
  "composite_entities": [
    {
      "name": "C1",
      "patterns": [
        "@A @B"
      ]
    }
  ]
}

{
  "composite_entities": [
    {
      "name": "C2",
      "patterns": [
        "(@A)? @B"
      ]
    }
  ]
}
This may seem like a trivial example that could be resolved with some clever regexes, but as the number of entities in the patterns increases, the complexity grows quickly.
Coming to the multilevel entity hierarchy: if we can have multiple instances of the CompositeEntityExtractor, we could support a hierarchical entity structure in Rasa, which is something other NLP platforms like Dialogflow provide.
I'm gonna close this for now, as there was no new activity and I'm not sure whether this is still relevant for anyone.
Currently the CompositeEntityExtractor can be used multiple times in the Rasa configuration pipeline, but only the patterns in the last instance are saved, because the metadata for all the others gets overwritten in the default composite_entities.json file. We can overcome this by taking in composite_entities_file from the configuration (.yml) as the file name to store the metadata for each instance of the CompositeEntityExtractor component. Alternatively, we can generate a random file name each time the component is initialised.
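A rough sketch of how the file name could be chosen per instance, covering both options above. It assumes the component receives its configuration as a plain dict. The helper metadata_filename is hypothetical, and the only names taken from this issue are the composite_entities_file key and the default composite_entities.json.

```python
import os
import uuid


def metadata_filename(component_config):
    """Pick a per-instance file name for the persisted patterns."""
    source = component_config.get("composite_entities_file")
    if source:
        # Reuse the base name of the configured file, so instances that
        # point at different files persist under different names.
        return os.path.basename(source)
    # Nothing configured: fall back to a random name instead of the shared
    # default "composite_entities.json", which every instance would overwrite.
    return "composite_entities_{}.json".format(uuid.uuid4().hex)
```

Note that the first option still collides if two instances are configured with files that share a base name in different directories. The random name avoids that, but then the chosen name has to be stored in the component's metadata so it can be found again at load time.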