langcog / web-cdi

errors in Spanish WG and WS output #436

Closed vmarchman closed 1 year ago

vmarchman commented 1 year ago

Hi @HenryMehta, it seems like the scoring is not working quite right for the Spanish long forms. Some things are not being scored, and there are extra columns left over from the English.

Spanish WG: The following categories don't seem to be summing from the responses, so 0s are being populated when there are responses that should be counted.

Jugar a Ser Adulto, Imitacion de Otros Tipos de Actividades de Adultos

Also, the Late Gestures summary score should be the sum of the following 3 sub-scores: Acciones con Objetos, Jugar a Ser Adulto, Imitacion de Otros Tipos de Actividades de Adultos

The following columns should be deleted from the output for Spanish WG: First Signs, Phrases, Imitation, Labeling, Words Understood, Words Produced, First Gestures, Games Gestures, Adult Gestures, Parent Gestures, Early Gestures, Later Gestures, Total Gestures

Spanish WS: The following columns don't appear to be summing from the responses:

Scombine (maybe this is really Combinar palabras?), Formas de Verbos: Presente, Formas de Verbos: Sucedieron, Formas de Verbos: Imperativos

The following columns should be deleted from the output: Total Produced, How Children Use Words, Word Forms, Combining, Complexity

And "exemplo" ==> "ejemplo"

HenryMehta commented 1 year ago

@vmarchman I have removed (I think) the unwanted columns, in both development and production.

Moving from exemplo to ejemplo is not straightforward, and I need to think about the approach. Since we have to recalculate all scores anyway when this goes live, that might be when it is done, but I would still need to prove it works beforehand.

I'm looking at the scoring now

HenryMehta commented 1 year ago

@vmarchman Could you tell me which Spanish WG study you will be testing this on? I want to update the numbers on it, and it takes so long to run that I don't want to do more than required.

Same for WS, but working on WG at the moment

HenryMehta commented 1 year ago

@vmarchman I have removed the incorrect titles and rerun the summaries. Could you confirm whether the totals are still incorrect, please?

HenryMehta commented 1 year ago

(3) For Spanish WG, the gestures sections are still not right. It is not reading these columns, either in the csv or in the clinical report: Jugar a Ser Adulto, Imitacion de Otros Tipos de Actividades de Adultos

I am really struggling to find the solution here, mostly because I do not understand. I am looking at the data in Spanish WG for fields that relate to Jugar a Ser Adulto and Imitacion de Otros Tipos de Actividades de Adultos, and I'm lost. [Spanish_WG].csv

So let me explain how the scoring works and you might be able to guide me to the cause. We use the scoring file below to define scoring. Each score is defined by a title, category, measure and order. The title is self explanatory. The category relates to any row within Spanish_WG.csv file above in the specified category or categories. The measure is what is needed in order for the score to increase (usually yes), and the order is the order they appear in output.

Spanish_WG_scoring.json.zip

From a scoring perspective I believe we're looking at the score defined as follows:

{
        "title" : "Imitacion de Otros Tipos de Actividades de Adultos",
        "category" : "gestures_parents",
        "measure" : "yes",
        "order" : 12
    },

This will refer to these items in the csv

item_511,barrtrap,gestures,gestures_adult,no; yes,Barre o trapea.,,,,gestures_adult
item_512,llavpuer,gestures,gestures_adult,no; yes,Trata de meter la llave en la puerta.,,,,gestures_adult
item_513,pegamart,gestures,gestures_adult,no; yes,Pega con un martillo.,,,,gestures_adult
item_514,persigna,gestures,gestures_adult,no; yes,Reza y/o se persigna.,,,,gestures_adult
item_515,escrbmaq,gestures,gestures_adult,no; yes,Trata de escribir a máquina.,,,,gestures_adult
item_516,juegleye,gestures,gestures_adult,no; yes,Juega a que está leyendo.,,,,gestures_adult
item_517,fumacig,gestures,gestures_adult,no; yes,Fuma un cigarro.,,,,gestures_adult
item_518,aguaplnt,gestures,gestures_adult,no; yes,Le echa agua a las plantas.,,,,gestures_adult
item_519,tocainst,gestures,gestures_adult,no; yes,"Trata de tocar un instrumento musical (guitarra, tambor, etc.).",,,,gestures_adult
item_520,manecoch,gestures,gestures_adult,no; yes,Juega a manejar el coche.,,,,gestures_adult
item_521,lavaplat,gestures,gestures_adult,no; yes,Lava los platos.,,,,gestures_adult
item_522,sacude,gestures,gestures_adult,no; yes,Sacude.,,,,gestures_adult
item_523,escrplum,gestures,gestures_adult,no; yes,Trata de escribir con un lápiz o una pluma.,,,,gestures_adult
item_524,hacehoyo,gestures,gestures_adult,no; yes,Trata de hacer un hoyo.,,,,gestures_adult
item_525,ponelent,gestures,gestures_adult,no; yes,Se pone unos lentes.,,,,gestures_adult
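
To make the mechanism concrete, here is a minimal Python sketch of the scoring rule as described above: a score counts the items in its category (or categories) whose response equals the measure. It is only an illustration, not the web-cdi implementation. The function name `score_category`, the `responses` mapping, the dictionary keys used for the item rows, the example score definition, and the idea that several categories would be given as a list are all assumptions.

```python
# Hypothetical sketch of the scoring rule described above; not web-cdi's actual code.

def score_category(score_def, items, responses):
    """Count items in the score's category whose response equals the measure."""
    categories = score_def["category"]
    if isinstance(categories, str):
        # Assumption: a single category is a string, several would be a list.
        categories = [categories]
    return sum(
        1
        for item in items
        if item["category"] in categories
        and responses.get(item["itemID"]) == score_def["measure"]
    )


# Two rows reduced from the CSV excerpt above (the key names are assumed).
items = [
    {"itemID": "item_511", "category": "gestures_adult"},
    {"itemID": "item_512", "category": "gestures_adult"},
]

# Assumed shape of one administration's answers: itemID -> chosen option.
responses = {"item_511": "yes", "item_512": "no"}

# Hypothetical score definition with the same four fields as the scoring file.
example_score = {
    "title": "Example score",
    "category": "gestures_adult",
    "measure": "yes",
    "order": 1,
}

print(score_category(example_score, items, responses))  # prints 1
```

Under this rule the category string has to match the CSV value exactly; a definition whose category matches no row scores 0 without raising any error.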

Can you tell me what is missing?

vmarchman commented 1 year ago

Found it!! The Spanish_WG_scoring.json says "gestures_parents" and "gestures_adults", but Spanish_WG.csv leaves off the "s".

🎉
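
A mismatch like this is easy to miss by eye, so a small consistency check may help. The sketch below is a hedged illustration in Python, not part of web-cdi: the function name `unmatched_categories`, the CSV header names, and the in-memory sample data are assumptions (only the row values and the score definition come from the excerpts above). It lists every score whose category matches no row in the item CSV.

```python
import csv
import io
import json


def unmatched_categories(scoring_defs, csv_text, category_column="category"):
    """Return (title, category) pairs whose category matches no CSV row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    known = {row[category_column].strip() for row in reader}
    missing = []
    for score_def in scoring_defs:
        categories = score_def["category"]
        if isinstance(categories, str):
            # Assumption: a single category is a string, several would be a list.
            categories = [categories]
        for category in categories:
            if category not in known:
                missing.append((score_def["title"], category))
    return missing


# Header names are assumed; the row values are taken from the excerpt above.
csv_text = """itemID,item,item_type,category,choices
item_511,barrtrap,gestures,gestures_adult,no; yes
item_512,llavpuer,gestures,gestures_adult,no; yes
"""

scoring_defs = json.loads("""[
  {"title": "Imitacion de Otros Tipos de Actividades de Adultos",
   "category": "gestures_parents", "measure": "yes", "order": 12}
]""")

for title, category in unmatched_categories(scoring_defs, csv_text):
    print(f"{title}: no CSV rows in category {category!r}")
```

Running a check of this kind whenever a scoring file or item CSV changes would flag a stray plural before any scores are recalculated.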

(Email reply to Henry Mehta's comment of Fri, Apr 14, 2023 at 7:17 AM.)

HenryMehta commented 1 year ago

OMG! I'm so sorry. I just couldn't see it. I've made a fix and I'm deploying to dev now, but I will need to rerun the scoring for it to update. I'll set the scores to update later today so you can test in your evening if you feel so inclined.

vmarchman commented 1 year ago

No worries! Thank you @Henry !!!

vmarchman commented 1 year ago

Hi @Henry, I tested the Spanish WG outputs.

All gestures sections are adding correctly! Yay!

But, there are two Gestos Tardios columns, one with an accent and one without. We should delete all three columns with the accent. BUT, the other one doesn't have a "Percentile both" in the csv. So, in the clinical report, the output is "<1" which is wrong.

Everything else is correct! :-)

-vm

HenryMehta commented 1 year ago

@vmarchman We have to use the one that is in the benchmark json file. I can take it out but that will mean it needs taking out from everywhere it currently shows. Please confirm

HenryMehta commented 1 year ago

@vmarchman I am going to delete all the Spanish WG scoring data in dev. I am then going to rerun the scoring for your test study. I hope this will make clear where the issue is. At the moment there is too much data for me to look at

HenryMehta commented 1 year ago

@vmarchman OK, I think I've found why we were getting Gestos Tardios twice (once with an accent) and why we weren't getting unisex results, and I've corrected both in dev. I still have no figures for Comprension temprana Percentile-sex, but I also don't have any benchmark data for it, which is why it is giving an error.
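
Duplicate columns that differ only by an accent can be detected mechanically. The following Python sketch is one possible approach and not the fix that was applied in dev: the helper names `strip_accents` and `accent_variants` are hypothetical, and the accented spelling "Gestos Tardíos" is assumed for illustration. It groups titles by their accent-stripped, case-folded form and reports any group with more than one spelling.

```python
import unicodedata
from collections import defaultdict


def strip_accents(text):
    """Decompose characters (NFD) and drop the combining accent marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def accent_variants(titles):
    """Group titles that are identical once accents and case are ignored."""
    groups = defaultdict(set)
    for title in titles:
        groups[strip_accents(title).casefold()].add(title)
    return {key: sorted(names) for key, names in groups.items() if len(names) > 1}


# Illustrative titles; the accented spelling is an assumption.
titles = ["Gestos Tardios", "Gestos Tardíos", "Comprension temprana"]

for key, names in accent_variants(titles).items():
    print(f"{key}: {names}")
```

Comparing the titles in the scoring file against those in the benchmark file with a check like this would show which spelling each file uses before either one is changed.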