Closed vvasco closed 1 year ago
In the analysis provided by Google, each word is given a tag, which can be `VERB`, `NOUN`, `DET`, `ADP`, `ADV`, `ADJ`, `PRON` and so on. Google also provides the lemma of the word.
In the analysis we do on top of this, we check whether there is any word tagged `NOUN`, `ADV`, `ADJ` or `ADP` and, if so, whether its lemma corresponds to one of the following words:

- `speed`: "veloce", "andatura", "velocita", "velocemente", "piano"
- `aid`: "bastone", "muro", "sedia"
- `repetition`: "ripetizione", "volta"
- `feedback`: "cosa", "come", "bene"

We defined these words based on examples of questions that came to mind. For each keyword we have, I report here examples that work and examples that don't work with the current system.

`speed`:

Working | Not working |
---|---|
Sto andando (/mi sto muovendo) troppo piano (/veloce /velocemente) | Sto andando (/mi sto muovendo) troppo lentamente |
A che velocità devo andare (/muovermi) | A che velocità devo fare l'esercizio (/il test) |
Quanto devo andare veloce | |
Che andatura devo avere (/mantenere) | |

`aid`:

Working | Not working |
---|---|
Posso (/E' consentito) usare un (/il mio) bastone (/sedia) | Posso usare il deambulatore |

`repetition`:

Working | Not working |
---|---|
Quante volte devo ripetere (/fare) | Quante volte devo ripetere l'esercizio (/il test) |
Quante ripetizioni devo fare | |

`feedback`:

Working | Not working |
---|---|
Sto facendo (/andando) bene | Sto facendo (/andando) male |
Come sto andando (/facendo) | Come sto facendo l'esercizio (/il test) |
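
To make the keyword check concrete, here is a minimal sketch of the tag and lemma lookup only. It does not call Google and does not model the dependency analysis, so it reproduces only the failures caused by a word missing from the lists. The `Token` class, the `find_category` helper and the tags and lemmas in the usage lines are hypothetical: they are written by hand for illustration and are not actual output of the Google analysis.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Keyword lists as given in this thread, keyed by category.
KEYWORDS = {
    "speed": {"veloce", "andatura", "velocita", "velocemente", "piano"},
    "aid": {"bastone", "muro", "sedia"},
    "repetition": {"ripetizione", "volta"},
    "feedback": {"cosa", "come", "bene"},
}

# Only words carrying one of these tags are looked up.
CANDIDATE_TAGS = {"NOUN", "ADV", "ADJ", "ADP"}


@dataclass
class Token:
    """Hypothetical container for one analyzed word."""
    text: str
    tag: str
    lemma: str


def find_category(tokens: Iterable[Token]) -> Optional[str]:
    """Return the first category whose keyword list contains the lemma
    of a candidate-tagged token, or None if nothing matches."""
    for token in tokens:
        if token.tag not in CANDIDATE_TAGS:
            continue
        for category, words in KEYWORDS.items():
            if token.lemma.lower() in words:
                return category
    return None


# Hand-written tags and lemmas (assumptions, not real API output).
slow = [
    Token("Sto", "VERB", "stare"),
    Token("andando", "VERB", "andare"),
    Token("troppo", "ADV", "troppo"),
    Token("piano", "ADV", "piano"),
]
walker = [
    Token("Posso", "VERB", "potere"),
    Token("usare", "VERB", "usare"),
    Token("il", "DET", "il"),
    Token("deambulatore", "NOUN", "deambulatore"),
]

print(find_category(slow))    # "piano" is listed under speed
print(find_category(walker))  # "deambulatore" is not in any list
```

In this sketch, covering a sentence such as "Posso usare il deambulatore" only requires adding the missing lemma to the `aid` set.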

For the examples I reported, we can see that the system fails in two cases:

- when the sentence includes a word that is not in our lists, such as "deambulatore" for `aid`, "lentamente" for `speed`, or "male" for `feedback`;
- when the sentence includes an additional `NOUN` (as in the examples ending with "l'esercizio" or "il test").

For now, we could make the current system more robust to these two failure cases while we look for other solutions to classify sentences directly.
@vtikha @pattacini what do you think?
I think it's a good plan 👍
This might also be an interesting project for a thesis.
Currently, speech interaction consists of (1) converting speech to text and (2) analyzing the transcripts (see #192). Point 2 is carried out by using Google Cloud services to retrieve the sentence's structure in terms of root, verbs and nouns. This structure is further analyzed to interpret the question by looking for dependencies between root, verbs and nouns. If a dependency is found, we look for specific keywords (such as "veloce", "bastone", etc.). This system currently works for a few keywords (`speed`, `aid`, `repetition`, `feedback`), but with a larger number of keywords it might be difficult to cover all the possible dependencies, and thus difficult to extend.

An alternative might be to replace the text analysis (2) with a text recognizer, which is fed Italian sentences directly and classifies them into the categories of interest. This would require creating a dataset with examples covering our desired categories and using it to train / fine-tune an existing model. Such a system would then depend on the specific use case. Google services make it possible to create a custom machine learning model that classifies text content into domain-specific categories, with a price depending on the training hours, the size of the dataset and the number of models deployed. Additional resources that might be useful:
cc @pattacini @vtikha