Open cbergey opened 4 years ago
I like these!
This seems like a really simple thing to do that could tell us a lot: what's the diversity of slots getting filled in? Does growth in frames precede or follow growth in lexicality? One could also ask about the diversity of how a frame is filled in. For example, consider:
"hand me the NOUN"
How many different nouns fill the NOUN slot? Or even, what's the entropy of the fillers (call this the "frame-level entropy")? How does frame entropy evolve over time? Do new frames start out low-entropy and get higher? This could help get at the learning question.
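A minimal sketch of the frame-level entropy idea, assuming utterances have already been parsed into `(age, frame, filler)` triples. That data layout, the toy observations, and the name `frame_entropy` are all hypothetical and only here for illustration:

```python
import math
from collections import Counter, defaultdict

def frame_entropy(fillers):
    """Shannon entropy (in bits) of the filler distribution for one frame."""
    counts = Counter(fillers)
    total = sum(counts.values())
    # p * log2(1/p), summed over the distinct fillers
    return sum((c / total) * math.log2(total / c) for c in counts.values())

# Hypothetical toy data: (age in months, frame, slot filler)
observations = [
    (18, "hand me the NOUN", "ball"),
    (18, "hand me the NOUN", "ball"),
    (24, "hand me the NOUN", "ball"),
    (24, "hand me the NOUN", "cup"),
    (24, "hand me the NOUN", "spoon"),
    (24, "hand me the NOUN", "book"),
]

# Group the fillers of a single frame by age
fillers_by_age = defaultdict(list)
for age, frame, filler in observations:
    if frame == "hand me the NOUN":
        fillers_by_age[age].append(filler)

# Entropy of the NOUN slot at each age
entropy_by_age = {
    age: frame_entropy(fillers)
    for age, fillers in sorted(fillers_by_age.items())
}
```

In this toy data the frame has a single filler at 18 months (entropy 0 bits) and four equally frequent fillers at 24 months (entropy 2 bits), which is the "new frames start low, then get higher" pattern the question is asking about. With real data, the raw number of fillers differs by age, so some correction for sample size would probably be needed before comparing entropies.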
I am information theory ghost and am here to say that one should do something like partial-JSD! But it's not a big deal.
I like this a lot!
Hmmmmm... not sure what this means, but I like it!
[ ] repertoire of frames growing vs. repertoire of utterances (# frames / # utts vs. type/token ratio of utts)
[ ] splitting utterances by sentence into more frames (then re-parse, etc)
[ ] tf-idf on frames where docs = ages -- which frames are most distinctive of ages?
[ ] clustering over frame distributions in conversations
[ ] plain Markov model (MM)
[ ] sequences of latent states in HMM (n-grams?)
[ ] look at traces of recursion in Viterbi reconstruction vs. free HMM latent state generation
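For the tf-idf item above, here is a rough sketch that treats each age as one "document" whose "terms" are frames. It uses the plain `tf * log(N / df)` weighting with no smoothing, which is an assumption (other tf-idf variants would work too). The function names and the toy frames are hypothetical:

```python
import math
from collections import Counter

def tfidf_by_age(frames_by_age):
    """Score each frame within each age: tf * log(N / df), unsmoothed."""
    n_docs = len(frames_by_age)
    # Document frequency: in how many ages does each frame appear?
    doc_freq = Counter()
    for frames in frames_by_age.values():
        doc_freq.update(set(frames))
    scores = {}
    for age, frames in frames_by_age.items():
        counts = Counter(frames)
        total = sum(counts.values())
        scores[age] = {
            frame: (count / total) * math.log(n_docs / doc_freq[frame])
            for frame, count in counts.items()
        }
    return scores

def most_distinctive(scores, age):
    """Frame with the highest tf-idf score at the given age."""
    return max(scores[age], key=scores[age].get)

# Hypothetical toy data: frames observed at each age (in months)
frames_by_age = {
    18: ["more NOUN", "more NOUN", "hand me the NOUN"],
    24: ["hand me the NOUN", "where is the NOUN", "where is the NOUN"],
    30: ["hand me the NOUN", "I want to VERB", "where is the NOUN"],
}

scores = tfidf_by_age(frames_by_age)
```

A frame that occurs at every age (here "hand me the NOUN") gets a score of 0 everywhere, so only age-specific frames rank highly. For example, `most_distinctive(scores, 18)` picks out "more NOUN" in this toy data.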