PhonologicalCorpusTools / CorpusTools

Phonological CorpusTools
http://phonologicalcorpustools.github.io/CorpusTools/
GNU General Public License v3.0

create script for mass pairwise FL calculations #325

Open bhallen opened 9 years ago

bhallen commented 9 years ago

Afterwards: try this for KL divergence and for mutual information, and then evaluate whether it's also feasible for frequency of alternation.

bhallen commented 9 years ago

Completed: this is implemented in functional_load.py as all_pairwise_fls.

Needs to be added to GUI. (CL version is complete.)
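For anyone following along, the idea of a mass pairwise calculation can be sketched roughly as below. This is only an illustration using a raw minimal-pair count as the functional load measure; it is not the code in functional_load.py and does not reproduce the signature of all_pairwise_fls. The function names, the representation of words as tuples of segments, and the toy corpus are all assumptions made for the example.

```python
from itertools import combinations


def minimal_pair_count(words, seg_a, seg_b):
    """Count word pairs that differ only by seg_a vs. seg_b in one position.

    Hypothetical helper: `words` is an iterable of tuples of segments.
    """
    word_set = set(words)
    count = 0
    for word in word_set:
        for i, seg in enumerate(word):
            if seg == seg_a:
                # Swap this one occurrence of seg_a for seg_b and see
                # whether the result is also a word in the corpus.
                candidate = word[:i] + (seg_b,) + word[i + 1:]
                if candidate in word_set:
                    count += 1
    return count


def all_pairwise_minimal_pair_counts(words, inventory):
    """Compute the minimal-pair count for every unordered segment pair."""
    return {
        (a, b): minimal_pair_count(words, a, b)
        for a, b in combinations(sorted(inventory), 2)
    }


# Toy corpus, invented purely for illustration.
toy_words = [
    ("p", "a", "t"),
    ("b", "a", "t"),
    ("p", "a", "d"),
    ("b", "a", "d"),
    ("t", "a", "p"),
]
toy_fls = all_pairwise_minimal_pair_counts(toy_words, {"p", "b", "t", "d"})
for pair, value in toy_fls.items():
    print(pair, value)
```

Sorting the inventory before pairing just makes the dictionary keys predictable, so each unordered pair appears exactly once. A real implementation would presumably also offer the entropy-based measure and normalization options, which this sketch leaves out.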

kchall commented 9 years ago

@bhallen -- Does this include an option to get what might be called the "average" functional load of a single segment? That is, select one segment, calculate the functional loads of all pairs that include that segment, and return the average value? I've seen that used in a couple of places (Hume et al. 2013; Stevenson 2015), and it seems like a natural extension of this (or perhaps a natural subset!).
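To make the proposal concrete, here is a rough sketch of the averaging step, assuming the pairwise values have already been computed and stored in a dictionary keyed by segment pairs. The function name, the dictionary layout, and the numbers are hypothetical and invented for illustration; they are not taken from functional_load.py or from the cited studies.

```python
from statistics import mean


def average_functional_load(segment, pairwise_fl):
    """Average the functional load over every pair that includes `segment`.

    `pairwise_fl` is assumed to map (seg_a, seg_b) tuples to FL values.
    """
    values = [fl for pair, fl in pairwise_fl.items() if segment in pair]
    if not values:
        raise ValueError(f"no pairs found containing {segment!r}")
    return mean(values)


# Made-up pairwise values for a four-segment toy inventory.
toy_pairwise_fl = {
    ("b", "p"): 2,
    ("b", "d"): 0,
    ("b", "t"): 0,
    ("d", "p"): 0,
    ("d", "t"): 2,
    ("p", "t"): 0,
}
avg_b = average_functional_load("b", toy_pairwise_fl)
print(avg_b)
```

Whether pairs with zero functional load should count toward the average is a design decision; this sketch includes them.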

bhallen commented 9 years ago

@kchall That's a great idea. I just added that functionality into functional_load.py. (No changes to the CL were necessary because both relevant parameters already existed independently.)

mmcauliffe commented 9 years ago

So for implementing this in the GUI, where should it go? Should it be in the Functional load dialog, e.g. as a checkbox for calculating all pairs that then disables adding individual pairs? Or, since we're planning on doing more of these, should we put it into another menu/menu item?

And for the average functional load (these are named "relative_deltah_fl" and "relative_minpair_fl", right?), should that also be added into the existing GUI?

We might want to step back at some point and take a look at the big picture for how everything is organized. Up until now (ignoring the black sheep of acoustic similarity), the functions have operated either on words (neighborhood density, string similarity, phonotactic probability) or on pairs of segments (frequency of alternation, functional load, mutual information, predictability of distribution, and KL). These new average calculations would be seen as operating on a single segment, and the mass functions as operating on an inventory.

So in that view, it might be good to group them in some more intuitive way, like "Analyze words..." vs "Analyze segment pairs..." vs "Analyze segment inventory".

bhallen commented 9 years ago

@mmcauliffe We talked about this in our meeting today. Kathleen and I in particular both think that, from the user-experience standpoint, we're better off organizing analyses by analysis type (functional load, KL, etc.) rather than by analysis argument (corpus, segment pair, single segment). We realize this means the existing dialogs for some analyses (especially functional load) could become a little unwieldy if we just keep adding radio buttons. So we were thinking it might be best to add a new layer to any analysis type with multiple versions: e.g., when someone selects "functional load", they're first asked to choose whether they want to calculate it for an inventory, for a segment, or for a segment pair.

Does this seem viable to you, Michael?