Phenomics / ontolib

A modern Java library for working with (biological) ontologies.
https://ontolib.readthedocs.org
Other
9 stars 2 forks source link

HpoFrequency should return a proportion rather than a percentage #38

Open pnrobinson opened 6 years ago

pnrobinson commented 6 years ago

Currently, the function public int lowerBound() returns a percentage as an integer (e.g., 30 for 30%). This is not good because downstream code very probably would need to have the corresponding proportion (probability), i.e., 0.3. Also, we have frequency data such as 12/34 (for 12 of 34 patients in some study), and this would naturally be represented as a float or double.

The current implementation of fromPercent results in an unnecessary loss of information.

public static HpoFrequency fromPercent(int percent) {
    if (percent < 1) {
      return EXCLUDED;
    } else if (percent < 5) {
      return VERY_RARE;
    } else if (percent < 30) {
      return OCCASIONAL;
    } else if (percent < 80) {
      return FREQUENT;
    } else if (percent < 100) {
      return VERY_FREQUENT;
    } else {
      return ALWAYS_PRESENT;
    }
  }

Therefore, I think that we need to make a new class. Currently, HpoFrequency is an enum. I would suggest we introduce a class called Frequency that would have something like this

public class Frequency {
   Float freq=null;
   HpoFrequency frequencyClass;

   public float getFrequency() {
    if (freq !=null) return freq.getValue():
    else return frequencyClass.upperlimit();
  }
}

I would also suggest that the upper_limit and lower_limit functions in HpoFrequency be revised to return floats. I am in the process of writing code for an HpoDisease class that would parse the phenotype_annotation.tab file and use all of the frequency information, and this would be very useful. Please let me know if anything speaks against this solution. If not, I will implement it and make a PR.

holtgrewe commented 6 years ago

Hm, what about renaming HpoFrequency to HpoFrequencyName and then create an interface hierarchy as below?

pnrobinson commented 6 years ago

Do you mean

interface Frequency {
  float getLowerBound();
  float getUpperBound();
  float getFrequency();
}

with the above three classes implementing the interface? One difficulty is that only the enum has upper and lower bounds.