Support for multiple users

Detailed Description

Using speaker identification, we should be able to identify who is talking to Naomi. This would allow responses to be tailored to the individual. It also means that we can keep a separate database for each user and associate email addresses with users.

Context

This change would allow Naomi to address me by name, not because it assumes I am the one speaking, but because it actually knows who is speaking. It would also allow Naomi to keep per-user information, such as separate shopping lists for different members of the family, or to check the email account of the person asking. Associating users with their text and email addresses would allow Naomi to identify a user by the address a message was received from. Naomi could also use the identity of the user to select an ASR model that has been optimized for their voice.

The default implementation of this would only have one user and would recognize everyone as that one user, but would have the infrastructure required to support multiple users.
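As a rough illustration of what that infrastructure might look like, here is a minimal sketch of a user registry that starts with a single default user and resolves every speaker to it until more users are added. None of these names exist in Naomi today: `User`, `UserRegistry`, `identify_speaker`, and `find_by_email` are hypothetical, and a real version would presumably persist to a database rather than a dict.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional


@dataclass
class User:
    """Hypothetical per-user record; the field names are illustrative only."""
    name: str
    emails: List[str] = field(default_factory=list)
    shopping_list: List[str] = field(default_factory=list)


class UserRegistry:
    """Hypothetical in-memory registry with a single default user."""

    def __init__(self, default_name: str = "default") -> None:
        self.default_user = User(default_name)
        self._users: Dict[str, User] = {default_name: self.default_user}

    def add_user(self, name: str, emails: Iterable[str] = ()) -> User:
        # Store addresses lower-cased so lookups are case-insensitive.
        user = User(name, [e.lower() for e in emails])
        self._users[name] = user
        return user

    def identify_speaker(self, speaker_id: Optional[str]) -> User:
        # Any unknown (or missing) speaker is treated as the default user,
        # which gives the "everyone is one user" behaviour out of the box.
        return self._users.get(speaker_id, self.default_user)

    def find_by_email(self, address: str) -> Optional[User]:
        # Map the sender address of an incoming message back to a user.
        address = address.lower()
        for user in self._users.values():
            if address in user.emails:
                return user
        return None
```

With only the default user registered, `identify_speaker` returns the same record for every caller, so existing single-user behaviour is preserved; adding a second user is what turns on per-user data such as separate shopping lists.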

Possible Implementation

Several existing speaker identification implementations could be used for this. VOSK has one, although it runs at the same time as the ASR, so it would not help with selecting the correct ASR model. Ideally you would call the speaker identification first and then the ASR, but since some speaker identification models also do ASR, I'd like to call the speaker identification and have it call the ASR model, then return the transcript and the speaker identity together. The identity of the speaker will be passed to the speechhandler as part of the intent structure.
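The call order described above could be sketched roughly as follows. This is only an illustration under assumptions: `Recognition`, `SpeakerIdThenASR`, `CombinedSpeakerIdAndASR`, and `build_intent` are hypothetical names, the engines are plain callables standing in for real models, and the `"user"` key is a guess at how the speaker might appear in the intent structure rather than Naomi's actual format.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# Stand-ins for real engines: each takes raw audio bytes.
Identifier = Callable[[bytes], str]          # audio -> speaker name
Transcriber = Callable[[bytes], str]         # audio -> transcript
CombinedEngine = Callable[[bytes], Tuple[str, str]]  # audio -> (speaker, transcript)


@dataclass
class Recognition:
    """Speaker identity and transcript returned together."""
    speaker: str
    transcript: str


class SpeakerIdThenASR:
    """Identify the speaker first, then run the ASR model chosen for them."""

    def __init__(self, identify: Identifier,
                 asr_models: Dict[str, Transcriber],
                 default_speaker: str = "default") -> None:
        if default_speaker not in asr_models:
            raise ValueError("asr_models needs an entry for the default speaker")
        self._identify = identify
        self._asr_models = asr_models
        self._default = default_speaker

    def recognize(self, audio: bytes) -> Recognition:
        speaker = self._identify(audio)
        # Fall back to the default model when no per-user model exists.
        asr = self._asr_models.get(speaker, self._asr_models[self._default])
        return Recognition(speaker, asr(audio))


class CombinedSpeakerIdAndASR:
    """Wrap an engine that produces speaker and transcript in one pass."""

    def __init__(self, engine: CombinedEngine) -> None:
        self._engine = engine

    def recognize(self, audio: bytes) -> Recognition:
        speaker, transcript = self._engine(audio)
        return Recognition(speaker, transcript)


def build_intent(recognition: Recognition, intent_name: str) -> dict:
    # Hypothetical intent layout; the key names are assumptions.
    return {
        "intent": intent_name,
        "input": recognition.transcript,
        "user": recognition.speaker,
    }
```

Because both wrappers expose the same `recognize` method, the caller does not need to know whether the speaker was identified before the ASR ran or in the same pass; it just receives the transcript and identity together and forwards them in the intent.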
