Compiler warnings for missing type arguments in ScoreDistribution and related classes

julesjacobsen commented 5 years ago

There are a lot of these warnings being thrown in the tests:

[WARNING] GitHub/phenol/phenol-io/src/main/java/org/monarchinitiative/phenol/io/scoredist/H2ScoreDistributionReader.java:[141,16] found raw type: org.monarchinitiative.phenol.ontology.scoredist.ObjectScoreDistribution
  missing type arguments for generic class org.monarchinitiative.phenol.ontology.scoredist.ObjectScoreDistribution<T>
[WARNING] GitHub/phenol/phenol-io/src/main/java/org/monarchinitiative/phenol/io/scoredist/H2ScoreDistributionReader.java:[141,12] unchecked call to ObjectScoreDistribution(T,int,int,java.util.SortedMap<java.lang.Double,java.lang.Double>) as a member of the raw type org.monarchinitiative.phenol.ontology.scoredist.ObjectScoreDistribution
[WARNING] GitHub/phenol/phenol-io/src/main/java/org/monarchinitiative/phenol/io/scoredist/H2ScoreDistributionReader.java:[141,12] unchecked conversion
  required: org.monarchinitiative.phenol.ontology.scoredist.ObjectScoreDistribution<T>

Looking at the test code, it's not clear what the type ought to be:

// This class is never used in production code.
public class H2ScoreDistributionReader<T extends Serializable> implements ScoreDistributionReader<T> {

// this ObjectScoreDistribution should be typed, but to what?
private ObjectScoreDistribution<T> objectScoreDistributionFromResultSet(ResultSet rs)
      throws SQLException {
    final int termCount = rs.getInt(1);
    final int objectId = rs.getInt(2);
    final int sampleSize = rs.getInt(3);
    final double[] scores = (double[]) rs.getObject(4);
    final double[] pValues = (double[]) rs.getObject(5);
    final TreeMap<Double, Double> scoreDist = new TreeMap<>();
    for (int i = 0; i < scores.length; ++i) {
      scoreDist.put(scores[i], pValues[i]);
    }
    // this ObjectScoreDistribution should be typed, but to what?
    return new ObjectScoreDistribution(termCount, objectId, sampleSize, scoreDist);
  }
}

public interface ScoreDistributionReader<T extends Serializable> extends Closeable {
}

[WARNING] GitHub/phenol/phenol-io/src/main/java/org/monarchinitiative/phenol/io/scoredist/TextFileScoreDistributionReader.java:[23,57] found raw type: org.monarchinitiative.phenol.io.scoredist.ScoreDistributionReader
  missing type arguments for generic class org.monarchinitiative.phenol.io.scoredist.ScoreDistributionReader<T>

// This is missing the generic type declaration from the interface it implements
public class TextFileScoreDistributionReader implements ScoreDistributionReader {

// no 
public ScoreDistribution readForTermCount(int termCount) throws PhenolException {

}

// <T> is never used in this class
public final class SimilarityScoreSampling<T> {

What are these generic types used for? Why declare them if you're not going to use them?

pnrobinson commented 5 years ago

In SimilarityScoreSampling, I cannot figure out what the generic type T is supposed to be used for. I suspect that at some point we refactored this from accepting strings to accepting TermIds, and thus T is no longer needed.

pnrobinson commented 5 years ago

This prevents us from genericising the ObjectIds:

private boolean selectObject(Integer objectId) {
    if (options.getMinObjectId() != null && objectId < options.getMinObjectId()) {
      return false;
    }
    return options.getMaxObjectId() == null || objectId <= options.getMaxObjectId();
  }

This is because the method assumes we can compare T values with "<" and "<=".
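
One possible way around this, as a minimal sketch: bound the type parameter with Comparable so the range check can use compareTo instead of the numeric operators. The class ObjectRange and its min/max fields are hypothetical stand-ins for the options object above, not part of phenol.

```java
import java.util.Objects;

/** Hypothetical range filter; a stand-in for the options-based check, not phenol code. */
final class ObjectRange<T extends Comparable<T>> {
  private final T min; // null means no lower bound
  private final T max; // null means no upper bound

  ObjectRange(T min, T max) {
    this.min = min;
    this.max = max;
  }

  /** Returns true if objectId lies within [min, max], treating null bounds as open. */
  boolean selectObject(T objectId) {
    Objects.requireNonNull(objectId, "objectId");
    if (min != null && objectId.compareTo(min) < 0) {
      return false;
    }
    return max == null || objectId.compareTo(max) <= 0;
  }
}
```

This would only help if every object ID type we care about has a natural ordering; for types that do not, the min/max options would have to be dropped or replaced with a Comparator.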

kingmanzhang commented 5 years ago

Sorry, I did not notice this ticket. I refactored the ObjectScoreDistribution class from using Integer as the key to using a generic type, because being forced to use Integer felt quite constraining in other applications. I did not really understand the H2 database serialization, and I still do not know how to use a generic type with a database query:

final int objectId = rs.getInt(2);

This is supposed to be T. I am not sure whether that is possible; if not, we may have to revert to using Integers.
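
One approach that might work, sketched under assumptions: let the caller supply a small function that knows how to read a T from a given column, so the reader itself never has to name a concrete type. ColumnReader and ObjectIdExtractor are hypothetical names, and column 2 is taken from the snippet above.

```java
import java.sql.ResultSet;
import java.sql.SQLException;

/** Hypothetical strategy for reading one column of a row as a T. */
@FunctionalInterface
interface ColumnReader<T> {
  T read(ResultSet rs, int column) throws SQLException;
}

/** Hypothetical helper; shows how the object ID could stay generic. */
final class ObjectIdExtractor<T> {
  private static final int OBJECT_ID_COLUMN = 2; // assumed from the snippet above

  private final ColumnReader<T> reader;

  ObjectIdExtractor(ColumnReader<T> reader) {
    this.reader = reader;
  }

  /** Reads the object ID from the current row using the supplied reader. */
  T objectId(ResultSet rs) throws SQLException {
    return reader.read(rs, OBJECT_ID_COLUMN);
  }
}
```

With this shape, the existing Integer behaviour would be new ObjectIdExtractor<Integer>(ResultSet::getInt), and a String key would be new ObjectIdExtractor<String>(ResultSet::getString). ResultSet.getObject(int, Class<T>) is another option where the JDBC driver supports the requested conversion.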

Additionally, this line also seems wrong (the first and second arguments seem to be swapped):

return new ObjectScoreDistribution(termCount, objectId, sampleSize, scoreDist);

The class is defined without tests, so the problems were not picked up after refactoring.
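
As a rough sketch of what a typed call with the corrected argument order could look like: the compiler warning reports the constructor as ObjectScoreDistribution(T,int,int,SortedMap<Double,Double>), so the object ID has to come first. TypedScoreDistribution below is a hypothetical stand-in with that shape; the order of the two int parameters (termCount, then sampleSize) is an assumption.

```java
import java.util.SortedMap;
import java.util.TreeMap;

/** Hypothetical stand-in mirroring the constructor shape from the compiler warning. */
final class TypedScoreDistribution<T> {
  final T objectId;
  final int termCount;   // assumed to be the first int parameter
  final int sampleSize;  // assumed to be the second int parameter
  final SortedMap<Double, Double> scoreDist;

  TypedScoreDistribution(T objectId, int termCount, int sampleSize,
                         SortedMap<Double, Double> scoreDist) {
    this.objectId = objectId;
    this.termCount = termCount;
    this.sampleSize = sampleSize;
    this.scoreDist = scoreDist;
  }
}

final class TypedScoreDistributionDemo {
  /** Builds a distribution from parallel score and p-value arrays. */
  static TypedScoreDistribution<Integer> build(int termCount, int objectId, int sampleSize,
                                               double[] scores, double[] pValues) {
    SortedMap<Double, Double> scoreDist = new TreeMap<>();
    for (int i = 0; i < scores.length; ++i) {
      scoreDist.put(scores[i], pValues[i]);
    }
    // The diamond operator avoids the raw-type warning; objectId goes first.
    return new TypedScoreDistribution<>(objectId, termCount, sampleSize, scoreDist);
  }
}
```

A unit test that asserts on the object ID and term count of the returned object would have caught the swapped arguments.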

I agree that the TextFileScoreDistributionReader also needs to use generics. I will try this later today. If I cannot correct them easily, I will revert the commits. Sorry for this.

julesjacobsen commented 5 years ago

@kingmanzhang please write some tests! These will help define the problem and prevent confusion.

kingmanzhang commented 4 years ago

Sorry for not being able to do this in time. I will give it another try, and either fix it or revert my changes. I am currently still working on the generic_patch branch.

kingmanzhang commented 4 years ago

I made a last attempt to fix the warnings, but was not able to address all of them. The 'generic_patch' branch is the furthest I could manage to get. It will take more time for someone who understands the score distribution code better to put class variables there. These classes are only used in the new Phenomiser project. An alternative solution would be to revert the two commits I made that were intended to make the score distribution class generic. If that's preferred, merge in the 'revert_score_dist_generic' branch (I recommend doing this after Peter publishes the LIRICAL paper so that you do not need to do anything to phenomizer).