OTTAA-Project / ottaa_project_flutter

Join us to create the first predictive augmentative communication platform for speech-impaired children!
https://ottaa-project.github.io/
GNU General Public License v3.0

Different JSON for different language and type of user #48

Closed hectoritr closed 2 years ago

hectoritr commented 2 years ago

Describe the solution you'd like

When a new user logs in, we should provide a trained JSON model for their predictions based on their preferred gender and age.

We are covering several types of genders:

And 3 age types:

So we would need to create 15 different models, one for each possible combination. @gonojuarez will fetch the latest data from the database.

@lopezjuanma96 will train and provide the different models. We can use Miguel's algorithm, but we can also use the metadata currently in the JSON file, such as the user's age and the time of day of the sentence, to TAG the sentences and improve the model.
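As a rough illustration of the time-of-day tagging mentioned above, the sketch below maps a timestamp to one of the `hora` tags the app uses (MANANA, MEDIODIA, TARDE, NOCHE). The hour boundaries and the function names `hora_tag` and `hora_tag_from_millis` are assumptions made for illustration; the real cut-offs are not stated in this thread. It also assumes the `fecha` values in the phrase JSON are epoch milliseconds, which is what they appear to be.

```python
from datetime import datetime

# Hypothetical hour boundaries (start inclusive, end exclusive).
# The app's real cut-offs are not given in this issue.
HORA_BOUNDS = [
    (5, 12, "MANANA"),
    (12, 14, "MEDIODIA"),
    (14, 20, "TARDE"),
]


def hora_tag(moment: datetime) -> str:
    """Return the time-of-day tag for a datetime; anything else is NOCHE."""
    for start, end, tag in HORA_BOUNDS:
        if start <= moment.hour < end:
            return tag
    return "NOCHE"


def hora_tag_from_millis(millis: int) -> str:
    """Tag an epoch-millisecond timestamp using the machine's local time zone."""
    return hora_tag(datetime.fromtimestamp(millis / 1000))
```

Keeping the boundaries in one table means they can be adjusted without touching the tagging logic once the real values are agreed on.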

@asimjawad will do the Flutter implementation.

Comment below when your work is done.


lopezjuanma96 commented 2 years ago

ISSUES REGARDING THE PHRASES JSON:

Look at this example of a phrase JSON:

{"frase":"Hola Buen día tengo castillo ","frecuencia":3,"complejidad":{"valor":0,"pictos componentes":[{"id":377,"esSugerencia":false},{"id":379,"horario":["MANANA"],"esSugerencia":false,"hora":["MANANA"]},{"id":49,"esSugerencia":false},{"id":945324633}]},"fecha":[1637327873525,1637327994767,1637329129227],"locale":"es","id":0}
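To make the structure of that record easier to inspect, here is a small sketch that parses it and pulls out the pictogram ids, the time-of-day tags, and any pictogram that lacks the `esSugerencia` field. The helper name `summarize` is hypothetical; the record itself is copied verbatim from the example above.

```python
import json

# The example record from the comment above, verbatim.
RAW = (
    '{"frase":"Hola Buen día tengo castillo ","frecuencia":3,'
    '"complejidad":{"valor":0,"pictos componentes":['
    '{"id":377,"esSugerencia":false},'
    '{"id":379,"horario":["MANANA"],"esSugerencia":false,"hora":["MANANA"]},'
    '{"id":49,"esSugerencia":false},{"id":945324633}]},'
    '"fecha":[1637327873525,1637327994767,1637329129227],"locale":"es","id":0}'
)


def summarize(record: dict) -> dict:
    """Collect ids, time-of-day tags, and pictograms without `esSugerencia`."""
    pictos = record["complejidad"]["pictos componentes"]
    tags = {
        tag
        for picto in pictos
        for key in ("hora", "horario")  # the same tag appears under both keys
        for tag in picto.get(key, [])
    }
    return {
        "ids": [picto["id"] for picto in pictos],
        "tags": sorted(tags),
        "missing_esSugerencia": [
            picto["id"] for picto in pictos if "esSugerencia" not in picto
        ],
    }


summary = summarize(json.loads(RAW))
print(summary)
```

In this record the second pictogram carries the same tag under both `horario` and `hora`, and the last pictogram has only an `id`.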

Some things we have to decide are:

lopezjuanma96 commented 2 years ago

Span of keys and tags the app is using at the moment:

'key': {'TAGS'}

'hora': {'MANANA', 'MEDIODIA', 'TARDE', 'NOCHE'}
'edad': {'ADULTO', 'JOVEN', 'NINO'}
'sexo': {'FEMENINO', 'MASCULINO', 'BINARIO', 'FLUIDO', 'BINARIO'}
'ubicacion': {'ESTADIO', 'PARQUE',... // we are not using it for now
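The gender and age tags above are what the per-user models would be keyed on. The sketch below enumerates every gender/age combination from the lists as posted. The key format `SEXO_EDAD` and the function name `model_keys` are assumptions for illustration only.

```python
from itertools import product

# Tag lists exactly as posted above; 'BINARIO' appears twice in 'sexo'.
EDAD = ["ADULTO", "JOVEN", "NINO"]
SEXO = ["FEMENINO", "MASCULINO", "BINARIO", "FLUIDO", "BINARIO"]


def model_keys(sexo_tags, edad_tags):
    """Return one key per distinct (sexo, edad) pair, preserving list order."""
    distinct_sexo = list(dict.fromkeys(sexo_tags))  # drops repeated tags
    distinct_edad = list(dict.fromkeys(edad_tags))
    return [f"{s}_{e}" for s, e in product(distinct_sexo, distinct_edad)]


keys = model_keys(SEXO, EDAD)
print(len(keys), keys)
```

Because `BINARIO` is listed twice, the list has only four distinct gender tags, so this yields 12 keys rather than the 15 models mentioned at the top of the issue. The fifth gender tag would need to be confirmed before generating the full set.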

lopezjuanma96 commented 2 years ago

Possible Sources for Training Miguel's Algorithm:

EDIT: maybe it's best to work with medical question datasets, which will actually include what our users would say in a medical/hospital context. In that case, the last dataset from the previous list (https://github.com/curai/medical-question-pair-dataset) might be the best one to try first, and we should also add others we can find, such as:

These came up as examples after a really quick Google search; more thorough research might give even better results.

lopezjuanma96 commented 2 years ago

Possible Sources for Training Miguel's Algorithm:

lopezjuanma96 commented 2 years ago

This might be useful: https://convokit.cornell.edu/documentation/datasets.html

lopezjuanma96 commented 2 years ago

Scientific Dataset: https://www.kaggle.com/datasets/Cornell-University/arxiv/code

The full dataset is REALLY large (1.1TB and growing), but we can download just the metadata, which has all the titles, abstracts, authors, categories, etc. With it we can select categories for each model and train with the abstracts, or download some of the papers.
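A minimal sketch of the category selection described above, assuming the metadata is distributed as one JSON object per line with a space-separated `categories` string and an `abstract` field. The function name `abstracts_for` and the chosen categories are hypothetical, and the field layout should be checked against the actual download.

```python
import json
from typing import Iterable, Iterator, Set


def abstracts_for(lines: Iterable[str], wanted: Set[str]) -> Iterator[str]:
    """Yield whitespace-normalized abstracts of papers in any wanted category."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        paper = json.loads(line)
        categories = set(paper.get("categories", "").split())
        if wanted & categories:
            # Collapse newlines and repeated spaces inside the abstract.
            yield " ".join(paper.get("abstract", "").split())


# Tiny in-memory stand-in for the metadata file (made-up records).
sample_lines = [
    json.dumps({"id": "1", "categories": "q-bio.NC cs.CL", "abstract": "  Speech\n aids. "}),
    json.dumps({"id": "2", "categories": "hep-ph", "abstract": "Quarks."}),
]
print(list(abstracts_for(sample_lines, {"cs.CL"})))
```

Since the function takes any iterable of lines, the real metadata file can be passed as an open file handle and streamed without loading it all into memory.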

hectoritr commented 2 years ago

Download the correct model on login.

When the user starts the app and selects the gender and DoB, we should store those values to be used by the prediction algorithm.

[Image: Screen Shot 2022-08-25 at 12 17 48]

The options currently are

Gender

Age (calculate the right TAG based on the DoB)

These values should be stored as profile info of the user and used in the prediction.

Then based on the gender you have to download the right JSON dataset.
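The steps above could be sketched roughly as follows: derive the age TAG from the DoB, then build the name of the dataset to download. The age cut-offs (13 and 18), the file-name pattern, and the function names are all assumptions for illustration; none of them are defined in this thread. Unknown gender tags raise an error instead of silently falling back to a default dataset.

```python
from datetime import date

# Gender tags taken from the tag list earlier in this thread.
KNOWN_SEXO = {"FEMENINO", "MASCULINO", "BINARIO", "FLUIDO"}


def edad_tag(dob: date, today: date) -> str:
    """Map a date of birth to an age TAG (hypothetical cut-offs: 13 and 18)."""
    # Subtract one year if this year's birthday has not happened yet.
    age = today.year - dob.year - ((today.month, today.day) < (dob.month, dob.day))
    if age < 13:
        return "NINO"
    if age < 18:
        return "JOVEN"
    return "ADULTO"


def dataset_name(sexo: str, edad: str, locale: str = "es") -> str:
    """Build a hypothetical dataset file name for the user's profile."""
    if sexo not in KNOWN_SEXO:
        raise ValueError(f"unknown gender tag: {sexo!r}")
    return f"frases_{locale}_{sexo}_{edad}.json".lower()


tag = edad_tag(date(2015, 6, 1), date(2022, 8, 25))
print(tag, dataset_name("FEMENINO", tag))
```

Passing `today` explicitly keeps the age calculation deterministic and easy to test.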

hectoritr commented 2 years ago

@asimjawad is this done? Downloading the right model according to the user?

asimjawad commented 2 years ago

@hectoritr we did not do this. Add the required API here and I will be on it.

asimjawad commented 2 years ago

@lopezjuanma96 add them here.

hectoritr commented 2 years ago

This was resolved on #101

asimjawad commented 2 years ago

Reopening this issue because work on some parts was not done.

hectoritr commented 2 years ago

@asimjawad here is what I found so far

This was on the default database

[Image: Screen Shot 2022-10-05 at 10 40 32]

This was on the testing database

[Image: Screen Shot 2022-10-05 at 10 41 29]

asimjawad commented 2 years ago

@hectoritr can you explain this a bit further? I will look at it in the morning.

hectoritr commented 2 years ago

I tried logging in as a female, and that changed the default database and not the testing one; I don't know which one you are using in dev. Just that. The main thing is that even though it asked me to choose the gender, it downloaded the male version.

asimjawad commented 2 years ago

@hectoritr we are using the default one.

asimjawad commented 2 years ago

And as I told you, the JSONs are only loaded when a user creates a new account and chooses their gender. At that time we upload and save the JSON according to their gender.