langcog / web-cdi


Update lookup tables for English long and short forms #398

Closed vmarchman closed 1 year ago

vmarchman commented 2 years ago

After the 2020 renorming, we now have new percentile tables for all of the English long forms (WG and WS), the English short forms (Level I, Level IIA, Level IIB), and CDI III.

These numbers will need to be populated into the look-up tables for Web-CDI.

@HenryMehta

HenryMehta commented 2 years ago

@vmarchman Can you provide me with the new percentile tables?

vmarchman commented 2 years ago

Hi Henry, the new tables are attached. Let me know if you have any questions!

On Wed, Aug 17, 2022 at 11:28 PM Henry Mehta wrote:

@vmarchman Can you provide me with the new percentile tables?


HenryMehta commented 2 years ago

English_shortforms_2022norms.zip English_Longforms_2022norms.zip

HenryMehta commented 2 years ago

@vmarchman We don't currently have benchmarking for short forms. Can you tell me which forms I should be adding it to? Thanks

HenryMehta commented 2 years ago

Also, within WS we seem to have lost Combining and gained Word Forms and Word Endings. I need to match these up to titles within the Scoring. Likely candidates are:

{
    "title" : "Word Endings 1",
    "category" : "word_endings_1",
    "measure" : "sometimes;often",
    "order" : 3
},
{
    "title" : "Word Forms 1 Nouns",
    "category" : "word_forms_nouns_1",
    "measure" : "produces",
    "order" : 4
},
{
    "title" : "Word Forms 1 Verbs",
    "category" : "word_forms_verbs_1",
    "measure" : "produces",
    "order" : 5
},
{
    "title" : "Word Forms 2 Nouns",
    "category" : "word_endings_nouns_2",
    "measure" : "produces",
    "order" : 6
},
{
    "title" : "Word Forms 2 Verbs",
    "category" : "word_endings_verbs_2",
    "measure" : "produces",
    "order" : 7
},

If they are a combination of any of these, then we need to amend the scoring file.
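
The matching step described above can be sketched in Python as a check that every benchmark measure name has a scoring entry with the same title. This is only a rough illustration, not Web-CDI's actual code: the function name `unmatched_benchmarks` and the benchmark names passed to it are hypothetical, and the scoring entries are a subset copied from the candidates listed above.

```python
import json

# Scoring entries copied from the candidates above (a subset, for illustration).
SCORING_JSON = """
[
    {"title": "Word Endings 1", "category": "word_endings_1",
     "measure": "sometimes;often", "order": 3},
    {"title": "Word Forms 1 Nouns", "category": "word_forms_nouns_1",
     "measure": "produces", "order": 4},
    {"title": "Word Forms 1 Verbs", "category": "word_forms_verbs_1",
     "measure": "produces", "order": 5}
]
"""


def unmatched_benchmarks(benchmark_names, scoring_entries):
    """Return the benchmark names that have no scoring entry with the same title."""
    titles = {entry["title"] for entry in scoring_entries}
    return [name for name in benchmark_names if name not in titles]


scoring = json.loads(SCORING_JSON)
# Hypothetical benchmark names, standing in for the headings in the new norm tables.
missing = unmatched_benchmarks(
    ["Word Endings 1", "Word Forms", "Word Endings"], scoring
)
print(missing)
```

Any name reported as missing would need either a benchmark renamed to an existing title or a new entry in the scoring file.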

I have no idea what WSM3L is. Also, these are not integer values, so I don't know if they'll work.

HenryMehta commented 2 years ago

@vmarchman Also, within WS we do not have scoring for Word Endings or Word Forms. We need to add these to scoring before the benchmarks can work. I will apply some scoring and we'll see if I get it right

HenryMehta commented 2 years ago

@vmarchman For WS I've used Word Endings 1 and Word Forms 1 Nouns. I did this rather than creating new categories because new categories will require the scoring to be rerun. If I have to do that, it's OK, but the rerun takes circa 12-18 hours, so I don't want to do it and then redo it because I selected the wrong categories.

HenryMehta commented 2 years ago

@vmarchman available to test

vmarchman commented 1 year ago

There are new tables for English CDI III as well:

IIIcomplex_both.csv IIIcomplex_boys.csv IIIcomplex_girls.csv IIIprod_both.csv IIIprod_boys.csv IIIprod_girls.csv IIIuselang_both.csv IIIuselang_boys.csv IIIuselang_girls.csv

HenryMehta commented 1 year ago

@vmarchman I need to understand how these refer to CDI3 scoring.

The scoring json looks like this:

[
    {
        "title" : "Total Produced",
        "category" : "word",
        "measure" : "produces",
        "order" : 1
    },
    {
        "title" : "Combining",
        "category" : "combine",
        "measure" : "sometimes;often",
        "order" : 2
    },
    {
        "title" : "Complexity",
        "category" : "complexity",
        "measure" : "complex",
        "order" : 3
    },
    {
        "title" : "How to Use Words",
        "category" : "usage",
        "measure" : "yes",
        "order" : 4
    },
    {
        "title" : "Combination Example 1",
        "category" : "combination_example1",
        "measure" : "complex",
        "order" : 10,
        "kind" : "list"
    },
    {
        "title" : "Combination Example 2",
        "category" : "combination_example2",
        "measure" : "complex",
        "order" : 11,
        "kind" : "list"
    },
    {
        "title" : "Combination Example 3",
        "category" : "combination_example3",
        "measure" : "complex",
        "order" : 12,
        "kind" : "list"
    }
]

Could you confirm that complex is Complexity, prod is Total Produced, and uselang is How to Use Words? Thanks

HenryMehta commented 1 year ago

@vmarchman Also, the ages on CDI3 look very very strange. They seem to be 1-4 months

vmarchman commented 1 year ago

Yes, I confirm the variables.

Sorry, the age codes 1 to 4 stand for 30-31, 32-33, 34-35, and 36-37 months respectively.
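
To make the confirmed mappings concrete, here is a minimal Python sketch of how the CDI III file names and age codes could be translated. The dictionary and function names are hypothetical and not taken from the Web-CDI code base. Only the prefix-to-title pairs, the both/boys/girls suffixes, and the age ranges come from this thread.

```python
from pathlib import Path

# File-name prefix -> scoring title, as confirmed in this thread.
PREFIX_TO_TITLE = {
    "IIIcomplex": "Complexity",
    "IIIprod": "Total Produced",
    "IIIuselang": "How to Use Words",
}

# Age codes 1-4 in the CSVs stand for these inclusive age ranges in months.
AGE_CODE_TO_MONTHS = {1: (30, 31), 2: (32, 33), 3: (34, 35), 4: (36, 37)}


def parse_table_name(filename):
    """Split e.g. 'IIIprod_girls.csv' into (scoring title, sex group)."""
    prefix, group = Path(filename).stem.split("_", 1)
    return PREFIX_TO_TITLE[prefix], group


def age_code_for(months):
    """Return the age code whose range contains `months`, or None if out of range."""
    for code, (low, high) in AGE_CODE_TO_MONTHS.items():
        if low <= months <= high:
            return code
    return None


print(parse_table_name("IIIprod_girls.csv"))
print(age_code_for(33))
```

Returning `None` for ages outside 30-37 months is a design choice of this sketch, so that a caller can decide how to handle children with no matching norm row.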

On Wed, Apr 19, 2023, 10:59 AM Henry Mehta wrote:

@vmarchman Also, the ages on CDI3 look very very strange. They seem to be 1-4 months


HenryMehta commented 1 year ago

@vmarchman CDI3 benchmarks are now loaded and tested. You will need to create and complete a new administration to see the data.