Define complete subset of similarity measures

Climate-Data-Science / Climate-Similarity-Metrics

Which similarity metrics are the most helpful to understand climate


Define complete subset of similarity measures #21

Closed. pierretoussing closed this issue 4 years ago

pierretoussing commented 4 years ago

Which similarity measures should be used in order to cover all the problems we want to solve and all the properties similarity measures can have?

[x] Read the section about similarity measures in this paper and inspect if their categorization makes sense for our project.
[x] Define all the problems we want to solve with different similarity measures: different value ranges, negative values, different value distributions, inverted value ranges, ...
[x] Add value ranges to similarity measures summary
[x] Define a complete subset of similarity measures

pierretoussing commented 4 years ago

@pawelbielski I read the section in the paper linked above, and I think we should stick with our idea of defining the problems first and then selecting a subset of similarity measures that covers all of these problems. This would let us be sure we are "complete", because we want to propose a solution for each of these problems, and the modular framework will allow others to implement any other similarity measure that has these problems.

pierretoussing commented 4 years ago

Problems of different similarity measures:

Negative values (e.g. Pearson's Correlation)
Different value distributions (e.g. Mutual Information)
Different value ranges (e.g. Manhattan Distance/Euclidean Distance)
Inverted value ranges (e.g. Transfer Entropy)
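
To make the first and third problems concrete, here is a minimal sketch in plain Python. The two series and the helper names (`pearson`, `manhattan`, `euclidean`) are made up for illustration and are not taken from the repository. It only shows that Pearson's Correlation can be negative and is bounded, while the two distances are non-negative and unbounded, so their raw values are not directly comparable.

```python
import math

def pearson(x, y):
    """Pearson's correlation coefficient; values lie in [-1, 1]."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def manhattan(x, y):
    """Manhattan (L1) distance; values lie in [0, inf)."""
    return sum(abs(a - b) for a, b in zip(x, y))

def euclidean(x, y):
    """Euclidean (L2) distance; values lie in [0, inf)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

# Hypothetical toy series: series_b falls linearly while series_a rises.
series_a = [1.0, 2.0, 3.0, 4.0]
series_b = [8.0, 6.0, 4.0, 2.0]

r = pearson(series_a, series_b)     # negative: perfectly anti-correlated
d1 = manhattan(series_a, series_b)  # grows with the scale of the data
d2 = euclidean(series_a, series_b)  # same data, different magnitude than d1

print(f"pearson={r:.3f} manhattan={d1:.3f} euclidean={d2:.3f}")
```

Mutual Information and Transfer Entropy are left out of the sketch because they need an estimator choice (binning, lags) that this thread does not specify.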

pawelbielski commented 4 years ago

@pierretoussing I agree that we should start with these 4 problems, which similarity measures other than Pearson have. Then, for each of the defined problems, we give an example measure and propose one or more solutions.