ViCCo-Group / frrsa

Python package to conduct feature-reweighted representational similarity analysis.
https://www.sciencedirect.com/science/article/pii/S105381192200413X
GNU Affero General Public License v3.0

rename frrsa's parameter `distance` (and possibly add a sibling parameter) #25

Closed PhilippKaniuth closed 2 years ago

PhilippKaniuth commented 2 years ago

Currently, the name of the parameter distance is a bit of a misnomer: it can select either the squared Euclidean distance computed within each feature (i.e. a dissimilarity) or the feature-specific dot product (i.e. a similarity measure, so not a distance). What's a good hypernym, though, for "(dis-)similarity"?
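To make the distinction concrete, here is a minimal sketch of the two feature-specific measures for a single pair of condition vectors. The helper names `feature_sqeuclidean` and `feature_dot` are made up for illustration and are not frrsa functions; the sketch only shows why one result is a dissimilarity and the other a similarity.

```python
def feature_sqeuclidean(x, y):
    """Per-feature squared difference: a dissimilarity (0 means identical)."""
    return [(a - b) ** 2 for a, b in zip(x, y)]


def feature_dot(x, y):
    """Per-feature product: a similarity, not a distance."""
    return [a * b for a, b in zip(x, y)]


# Two hypothetical condition vectors with three features each.
x = [1.0, 2.0, 3.0]
y = [1.0, 0.0, 5.0]

sq = feature_sqeuclidean(x, y)
dp = feature_dot(x, y)

# Summing over features gives the global (unweighted) measures:
# the squared Euclidean distance and the dot product of x and y.
total_sq = sum(sq)
total_dot = sum(dp)
```

In the first feature the two vectors agree, so the squared difference is 0 while the product is positive. The two outputs point in opposite directions, which is why a single name "distance" fits only one of them.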

PhilippKaniuth commented 2 years ago

Further, it might be confusing since distance determines both which (dis-)similarity is used within a single feature (i.e. when applying reweighting) and which one is used globally (i.e. for the classical RSA score that is returned for quick comparison).

A solution could be to rename the distance parameter and introduce one more parameter. Proposals:

  1. single_dim_distance (denotes the (dis-)similarity measure used within a feature) and multi_dim_distance (denotes the global (dis-)similarity for classical RSA).
  2. reweighted_distance and classical_distance.

This would also allow computing, for the predicting system, different (dis-)similarity measures for reweighted and classical RSA.

hahahannes commented 2 years ago

Maybe it should be noted that distance also refers to similarity functions. Maybe there is a better name for distance/similarity?

PhilippKaniuth commented 2 years ago

Yes, what's a good hypernym though for "(dis-)similarity"? I was thinking of "measures". With regard to #32 but also #31, I think I will:

  1. Rename the parameter distance to measures.
  2. The type of measures would be a list of strings with 3 mandatory elements, where the first indicates the "reweighted_distance" (aka "single_dim_distance"); the second indicates the "multi_dim_distance" (aka "classical_distance"); the third indicates the (dis-)similarity of the target matrix (which I need for #30).
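As a rough sketch of what the proposed three-element measures list could look like on the receiving side, the snippet below unpacks and validates it. Everything here is an assumption for illustration: `parse_measures`, the `Measures` tuple, and the option strings in `ALLOWED` are hypothetical and do not reflect what frrsa actually accepts.

```python
from typing import NamedTuple, Sequence

# Hypothetical option names per role; placeholders, not frrsa's real options.
ALLOWED = {
    "reweighted": {"dot", "sqeuclidean"},
    "classical": {"dot", "sqeuclidean", "pearson"},
    "target": {"dissimilarity", "similarity"},
}


class Measures(NamedTuple):
    reweighted: str  # measure used within each feature
    classical: str   # global measure for the classical RSA score
    target: str      # kind of (dis-)similarity held by the target matrix


def parse_measures(measures: Sequence[str]) -> Measures:
    """Check that exactly three known measure names were given, in order."""
    if isinstance(measures, str) or len(measures) != 3:
        raise ValueError("measures must be a list of exactly 3 strings")
    parsed = Measures(*measures)
    for role, value in parsed._asdict().items():
        if value not in ALLOWED[role]:
            raise ValueError(f"unsupported {role} measure: {value!r}")
    return parsed


m = parse_measures(["dot", "pearson", "dissimilarity"])
```

A single positional list keeps the function signature small, at the cost of the caller having to remember the order of the three elements; the named tuple at least gives the values readable names once parsed.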

What do you think? I really don't want to overload the function with too many (new) parameters.

I am also pondering whether there should be the possibility at all to indicate different measures for single_dim and multi_dim. The fact that frrsa outputs classical scores is really just a convenience to be able to quickly compare reweighted and classical scores. However, there is a multitude of measures one could want for the classical case, and many are not (and will not be) supported by frrsa. It might well be that some user wants a measure for the multi_dim case that will never be supported by frrsa because there are better suited alternatives to compute that - in that case one would throw away the classical score frrsa outputs anyway.