robotology / assistive-rehab

Assistive and Rehabilitative Robotics
https://robotology.github.io/assistive-rehab/doc/mkdocs/site
BSD 3-Clause "New" or "Revised" License

Replace text analysis with text recognition #251

Closed vvasco closed 1 year ago

vvasco commented 4 years ago

Currently, speech interaction consists of (1) converting speech to text and (2) analyzing the transcripts (see #192). Step 2 uses Google Cloud services to retrieve the sentence's structure in terms of root, verbs and nouns. This structure is further analyzed to interpret the question by looking for dependencies between root, verbs and nouns. If a dependency is found, we look for specific keywords (such as "veloce", "bastone", etc.). The system currently works for a few keywords (speed, aid, repetition, feedback), but with more keywords it might become difficult to cover all the possible dependencies, and thus difficult to extend.

An alternative might be to replace the text analysis (2) with a text recognizer that is fed Italian sentences directly and classifies them into the categories of interest. This would require creating a dataset with examples covering our desired categories and using it to train / fine-tune an existing model. Such a system would then depend on the specific use case. Google services make it possible to create a custom machine learning model that classifies text into domain-specific categories, with a price depending on the training hours, the size of the dataset and the number of models deployed. Additional resources that might be useful:

cc @pattacini @vtikha
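To make the classification idea above more concrete, here is a minimal sketch of a sentence classifier. It is not the Google custom model mentioned above: as a stand-in it uses a small bag-of-words Naive Bayes written with the standard library only. The training sentences are taken from the examples discussed in this issue, and the labels, the class name `NaiveBayesTextClassifier` and the `tokenize` helper are assumptions made for illustration.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical training set: sentences from the examples in this issue,
# labelled by hand with the four categories the current system handles.
TRAINING = [
    ("Sto andando troppo piano", "speed"),
    ("Sto andando troppo lentamente", "speed"),
    ("A che velocità devo andare", "speed"),
    ("Quanto devo andare veloce", "speed"),
    ("Posso usare un bastone", "aid"),
    ("Posso usare il deambulatore", "aid"),
    ("Quante volte devo ripetere", "repetition"),
    ("Quante ripetizioni devo fare", "repetition"),
    ("Sto facendo bene", "feedback"),
    ("Come sto andando", "feedback"),
]


def tokenize(sentence):
    """Lowercase the sentence and keep runs of letters (accents included)."""
    return re.findall(r"[^\W\d_]+", sentence.lower())


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes over word counts, with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                      # smoothing constant
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of sentences
        self.vocab = set()

    def fit(self, examples):
        for sentence, label in examples:
            tokens = tokenize(sentence)
            self.word_counts[label].update(tokens)
            self.doc_counts[label] += 1
            self.vocab.update(tokens)
        return self

    def predict(self, sentence):
        # Words never seen in training carry no information and are skipped.
        tokens = [t for t in tokenize(sentence) if t in self.vocab]
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for t in tokens:
                score += math.log((counts[t] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


classifier = NaiveBayesTextClassifier().fit(TRAINING)
print(classifier.predict("A che velocità devo fare l'esercizio"))
```

With so few examples this only illustrates the workflow (collect labelled sentences, train, classify whole sentences without dependency rules); a real dataset and a fine-tuned model would be needed to judge whether it actually generalizes.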

vvasco commented 4 years ago

In the analysis provided by Google, each word is given a tag, which can be VERB, NOUN, DET, ADP, ADV, ADJ, PRON and so on. Google also provides the lemma of each word.

In the analysis we do on top of this, we check whether any word is tagged NOUN, ADV, ADJ or ADP and, if so, whether its lemma corresponds to one of the following words:

We defined these words based on example questions that came to mind. Below I report, for each keyword we have, examples that work and examples that don't work with the current system.

speed:

| Working | Not working |
| --- | --- |
| Sto andando (/mi sto muovendo) troppo piano (/veloce /velocemente) | Sto andando (/mi sto muovendo) troppo lentamente |
| A che velocità devo andare (/muovermi) | A che velocità devo fare l'esercizio (/il test) |
| Quanto devo andare veloce | |
| Che andatura devo avere (/mantenere) | |

aid:

| Working | Not working |
| --- | --- |
| Posso (/E' consentito) usare un (/il mio) bastone (/sedia) | Posso usare il deambulatore |

repetition:

| Working | Not working |
| --- | --- |
| Quante volte devo ripetere (/fare) | Quante volte devo ripetere l'esercizio (/il test) |
| Quante ripetizioni devo fare | |

feedback:

| Working | Not working |
| --- | --- |
| Sto facendo (/andando) bene | Sto facendo (/andando) male |
| Come sto andando (/facendo) | Come sto facendo l'esercizio (/il test) |

For the examples I reported, we can see that the system fails in two cases:

For now, we could make the current system more robust to these two failure cases while we look for other solutions to classify sentences directly.
@vtikha @pattacini what do you think?
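For reference, the tag-and-lemma check described in this comment can be sketched roughly as follows. This is a simplified illustration, not the actual module: it skips the dependency analysis between root, verbs and nouns, and the tokens are annotated by hand rather than coming from the Google API. The `Token` type, the `find_category` function and the `KEYWORDS` map are hypothetical; only "veloce" and "bastone" are keywords named in this issue, the other entries are assumptions based on the working examples.

```python
from typing import Iterable, NamedTuple, Optional


class Token(NamedTuple):
    """One analyzed word: surface text, part-of-speech tag and lemma."""
    text: str
    tag: str
    lemma: str


# Tags we inspect when looking for a keyword, as described above.
CONTENT_TAGS = {"NOUN", "ADV", "ADJ", "ADP"}

# Hypothetical lemma -> category map (the real word list is not shown here).
KEYWORDS = {
    "veloce": "speed",
    "velocità": "speed",
    "bastone": "aid",
    "ripetizione": "repetition",
    "bene": "feedback",
}


def find_category(tokens: Iterable[Token]) -> Optional[str]:
    """Return the category of the first matching keyword, or None."""
    for token in tokens:
        lemma = token.lemma.lower()
        if token.tag in CONTENT_TAGS and lemma in KEYWORDS:
            return KEYWORDS[lemma]
    return None


# Hand-annotated tokens for "Posso usare il mio bastone" (tags are assumed).
sentence = [
    Token("Posso", "VERB", "potere"),
    Token("usare", "VERB", "usare"),
    Token("il", "DET", "il"),
    Token("mio", "DET", "mio"),
    Token("bastone", "NOUN", "bastone"),
]
print(find_category(sentence))
```

A lemma that is missing from the map makes the lookup return `None`, which is one way a sentence can end up unrecognized even though its structure is the same as a working one.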

pattacini commented 4 years ago

I think it's a good plan 👍

vvasco commented 4 years ago

This might also be an interesting project for a thesis.