snipsco / snips-nlu

Snips Python library to extract meaning from text
https://snips-nlu.readthedocs.io
Apache License 2.0
3.89k stars 513 forks source link

Parse function should not return None intent when passed a list of intents #837

Closed timtutt closed 5 years ago

timtutt commented 5 years ago

Currently the when calling the nlu_engine parse function with a list of intents, it is still possible for the None intent to be returned. This behavior is not expected. Based on the documentation it seems that the scoping of the returned intent should be limited to only intents passed.

The offending block that causes this is this one: https://github.com/snipsco/snips-nlu/blob/90c57f7a636bae9d3ff53875bb48d3ed41a08b80/snips_nlu/nlu_engine/nlu_engine.py#L187-L190

Removing the res[RES_INTENT_NAME] is None condition should fix the issue, but I wanted to understand if there is a reason why this was the chosen behavior?

adrienball commented 5 years ago

@timtutt The documentation is indeed not very clear here, but this is the expected behavior. Initially, when the intents filtering API was designed, we made the assumption that the None intent was always a possible output. To be honest I think both behaviors (filtering out or not filtering out the None intent) could make sense depending on the use case. For this reason, perhaps it should be a flag in the API, but I'm afraid that it would make things more complicated.

To circumvent this, you can pass top_n=2 to the parse method and then select the first parsing result which doesn't correspond to the None intent.

YuukanOO commented 5 years ago

I think None is really needed here since it represents a failure to determine with confidence what's the input means and as such, should always be handled by the caller.

timtutt commented 5 years ago

Thanks for the response here.

Part of the reason I’m looking for this is because in some cases we’ve already done some narrowing of the potential intents and would like to give the user a response based on the best of that list without having to retrain the dataset.

I’m doing post filtering now, so it works as a workaround, but I can see both cases here for why None should or should not be returned. I think a flag would be ideal, but curious as to complications that occur because of this?

On Aug 7, 2019, at 8:55 AM, Julien LEICHER notifications@github.com wrote:

I think None is really needed here since it represents a failure to determine with confidence what's the input means and as such, should always be handled by the caller.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

adrienball commented 5 years ago

@timtutt sorry for the very late follow up. I think it would make the API slightly more complicated because there would be an additional parameter, which must be documented etc. IMO, the parse API is quite simple and efficient at the moment, and the top_n parameter is good enough, I think, to cover your use case. Of course, this can evolve in the future, e.g. if we receive many similar feedbacks like yours. Cheers