cbergey / childlanguageinfo


future ideas #2

Open cbergey opened 4 years ago

cbergey commented 4 years ago
sdedeo commented 4 years ago

I like these!

  1. repertoire of frames growing vs. repertoire of utterances (# frames/#utts vs. #type/tokens of utts)

This seems like a really simple thing to do that could tell us a lot: "what's the diversity of slots getting filled in?" Does growth in frames precede or follow the growth in lexicality? One could also ask about the diversity of how a single frame is filled in. For example, consider:

"hand me the NOUN"

How many different nouns fill the NOUN slot? Or even, what's the entropy of the fillers (call this the "frame-level entropy")? How does frame entropy evolve over time? Do new frames start out low-entropy and get higher over time? Etc. This could help get at the learning question.
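To make the "frame-level entropy" idea concrete, here is a minimal sketch. It assumes the fillers for one frame slot have already been extracted into a plain list. The function name `frame_entropy` and the example fillers are made up for illustration and are not taken from the repo.

```python
import math
from collections import Counter

def frame_entropy(fillers):
    """Shannon entropy (in bits) of the filler distribution for one frame slot."""
    counts = Counter(fillers)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical fillers seen in the NOUN slot of "hand me the NOUN"
# at an earlier and a later age (invented data, for illustration only).
early = ["ball", "ball", "ball", "cup"]
later = ["ball", "cup", "spoon", "book"]

for label, fillers in [("early", early), ("later", later)]:
    n_types = len(set(fillers))  # how many different nouns fill the slot
    print(label, n_types, round(frame_entropy(fillers), 3))
```

Computing this per frame and per age bin would give one entropy trajectory per frame, which is one way to ask whether new frames start out low-entropy.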

  2. tf-idf on frames where docs = ages -- which frames are most distinctive of ages?

I am the information theory ghost, and I am here to say that one should do something like partial JSD (Jensen-Shannon divergence)! But it's not a big deal.
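One possible reading of "partial JSD", sketched below, is to split the Jensen-Shannon divergence between two frame distributions into per-frame terms and rank frames by their term. This is an assumption about what is meant here. The function name `partial_jsd` and the frame counts are hypothetical.

```python
import math
from collections import Counter

def partial_jsd(counts_a, counts_b):
    """Per-frame contributions (in bits) to the JSD between two count distributions.

    The contributions sum to the full Jensen-Shannon divergence.
    """
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    contributions = {}
    for frame in set(counts_a) | set(counts_b):
        p = counts_a.get(frame, 0) / total_a
        q = counts_b.get(frame, 0) / total_b
        m = (p + q) / 2  # mixture probability of this frame
        term = 0.0
        if p > 0:
            term += 0.5 * p * math.log2(p / m)
        if q > 0:
            term += 0.5 * q * math.log2(q / m)
        contributions[frame] = term
    return contributions

# Hypothetical frame counts at two ages (invented data, for illustration only).
age_2 = Counter({"want X": 6, "more X": 4})
age_4 = Counter({"want X": 5, "hand me the X": 5})

contribs = partial_jsd(age_2, age_4)
for frame, value in sorted(contribs.items(), key=lambda kv: -kv[1]):
    print(frame, round(value, 3))
```

With more than two ages, one option would be to compare each age against the pooled counts of all other ages. Unlike tf-idf, the per-frame terms here add up to a single divergence between the two distributions.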

  3. sequences of latent states in HMM (n-grams?)

I like this a lot!
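A rough sketch of the n-gram idea, assuming a decoded latent state sequence is already available as a list of integer state labels. How the states are obtained from the HMM is left out. The function name `state_ngrams` and the example sequence are made up.

```python
from collections import Counter

def state_ngrams(states, n):
    """Count contiguous n-grams in a sequence of latent state labels."""
    return Counter(tuple(states[i:i + n]) for i in range(len(states) - n + 1))

# Hypothetical decoded state sequence (invented data, for illustration only).
states = [0, 1, 2, 0, 1, 2, 0, 1]

bigrams = state_ngrams(states, 2)
for gram, count in bigrams.most_common():
    print(gram, count)
```

Counting these separately per age bin would show which state transitions, or longer state patterns, become more or less common over time.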

  4. look at traces of recursion in Viterbi reconstruction vs. free HMM latent state generation

Hmmmmm... not sure what this means, but I like it!