ContextLab / quail

A python toolbox for analyzing and plotting free recall data
http://cdl-quail.readthedocs.io/en/latest/
MIT License

feature request: fingerprint analysis #23

Closed andrewheusser closed 7 years ago

andrewheusser commented 7 years ago

Integrate fingerprint analyses into pyrec. The code already exists in another repo. We can A) keep the repo separate and import the package as a dependency, or B) integrate it into pyrec and get rid of the other repo. Thoughts?

andrewheusser commented 7 years ago

@jeremymanning do you think we should keep the pyfingerprint codebase separate, or fully integrate it into pyrec?

jeremymanning commented 7 years ago

i think clustering analyses (fingerprints) should go in pyrec, not be separate

andrewheusser commented 7 years ago

currently, i have our features (and their distance metrics) hard coded into pyfingerprint. However, if we want this to be flexible, a better model might be to let the user specify their own unique features and distance metrics...I am envisioning the features as a dataframe attached to the pyro object:

features : pd.DataFrame
    Dataframe containing the features for presented words.  Each row represents the presented words for a given list and each column
    represents a position within that list. Each cell should be a dictionary of features, where the keys are the names of the features and the values are the feature values.
    The index will be a multi-index, where the first level represents the subject number and the second level represents the list number

but then there should maybe be another field that specifies the distance metrics for each feature? @jeremymanning any thoughts on how we should set this up?
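
To make the proposed layout concrete, here is a minimal sketch of such a features dataframe in pandas. The feature names (`category`, `length`), the values, and the index level names are made up for illustration; only the overall shape (multi-index rows, one dictionary per cell) comes from the description above.

```python
import pandas as pd

# Hypothetical data: one subject (0) with two lists (0 and 1), three words per list.
index = pd.MultiIndex.from_product([[0], [0, 1]], names=["Subject", "List"])

# One row per (subject, list); one column per position within the list.
# Each cell is a dictionary mapping feature name -> feature value.
rows = [
    [{"category": 1, "length": 3}, {"category": 2, "length": 5}, {"category": 1, "length": 4}],
    [{"category": 3, "length": 6}, {"category": 3, "length": 2}, {"category": 2, "length": 7}],
]
features = pd.DataFrame(rows, index=index)

# Look up the features of the third word on subject 0's second list.
cell = features.loc[(0, 1), 2]
print(cell["category"], cell["length"])
```

Storing a dictionary per cell keeps the dataframe the same shape as the presented-word data, at the cost of needing a lookup by feature name inside each cell.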

jeremymanning commented 7 years ago

i like keeping things general; this sounds like a good design to me.

in terms of distance metrics, we could default to

we can also let users pass in different distance functions of the form dist(arg1, arg2) (must return a scalar value reflecting the distance between arg1 and arg2).
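
As a rough illustration of the `dist(arg1, arg2)` contract, the two functions below each take a pair of feature values and return a single number. The function names are hypothetical and neither is claimed to be what pyrec ships with; they only show the expected shape for a scalar feature and a vector-valued feature.

```python
import math


def abs_distance(arg1, arg2):
    """Distance between two scalar feature values (e.g. word length)."""
    return abs(arg1 - arg2)


def euclidean_distance(arg1, arg2):
    """Distance between two equal-length coordinate sequences (e.g. a 2-D location)."""
    return math.dist(arg1, arg2)  # requires Python 3.8+


print(abs_distance(3, 5))
print(euclidean_distance((0, 0), (3, 4)))
```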

andrewheusser commented 7 years ago

ok that sounds good - the way im planning to implement the 'custom' functions is as an optional dictionary passed when creating the pyro object:

dist_funcs = {}
dist_funcs['category'] = lambda x, y: abs(x - y)
pyro = Pyro(pres=pres_data, rec=rec_data, features=features_data, dist_funcs=dist_funcs)

in this example, the 'category' feature will use the custom distance function, and the rest will use the default distance metric. sound good?
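
One way the fallback behaviour could work internally is sketched below. This is an assumption about the design, not the actual pyrec code: `resolve_dist_funcs` and `placeholder_default` are hypothetical names, and the 0/1 mismatch used as the default is only a stand-in, since the thread does not settle what the real default metric is.

```python
def placeholder_default(x, y):
    """Stand-in default metric (assumption): 0 if the values match, 1 otherwise."""
    return 0.0 if x == y else 1.0


def resolve_dist_funcs(feature_names, dist_funcs=None, default=placeholder_default):
    """Map every feature name to a distance function.

    Features listed in `dist_funcs` use the user-supplied function;
    all other features fall back to `default`.
    """
    dist_funcs = dist_funcs or {}
    return {name: dist_funcs.get(name, default) for name in feature_names}


# Only 'category' is overridden; 'size' falls back to the default.
custom = {"category": lambda x, y: abs(x - y)}
resolved = resolve_dist_funcs(["category", "size"], custom)

print(resolved["category"](1, 4))
print(resolved["size"]("big", "small"))
```

Resolving the full mapping once, when the object is created, means the analysis code can look up `resolved[name]` for any feature without checking whether the user supplied an override.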