owlcollab / owltools


FastOwlSim's SimGIC scoring discrepancy #187

Open ManavalanG opened 7 years ago

ManavalanG commented 7 years ago

The SimGIC score calculated by FastOwlSim differs between output files written in the FORMATTED and CSV formats. However, the SimGIC score calculated by SimpleOwlSim matches the one FastOwlSim writes in FORMATTED output.

Input owl file: simont.owl
Properties file: properties.txt
General command used: owltools simont.owl --fsim-basic -p properties.txt -o out.txt

  1. Using fsim-basic and output format as CSV: out-fsim-basic_csv.txt
  2. Using fsim-basic and output format as FORMATTED: out-fsim-basic_formatted.txt
  3. Using sim-basic and output format as CSV: out-sim-basic_csv.txt

In both CSV files above, I assume the values in the last column are SimGIC scores (the column titles don't align with those values, or the column title is missing).
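One way to check that assumption is to pull the last column from each file and diff the values pair by pair. This is a minimal sketch, not part of OWLTools: it assumes comma-delimited rows whose first two columns identify the pair being compared, and the helper names `last_column_scores` and `mismatches` and the inline sample rows are hypothetical.

```python
import csv
import io
from typing import Dict, Tuple

Pair = Tuple[str, str]


def last_column_scores(text: str, delimiter: str = ",") -> Dict[Pair, float]:
    """Map (column 1, column 2) to the last column parsed as a float.

    Assumption: the first two columns identify the compared pair.
    Rows whose last column is not numeric (e.g. a header) are skipped.
    """
    scores: Dict[Pair, float] = {}
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        if len(row) < 3:
            continue
        try:
            scores[(row[0], row[1])] = float(row[-1])
        except ValueError:
            continue
    return scores


def mismatches(a: Dict[Pair, float], b: Dict[Pair, float],
               tol: float = 1e-6) -> Dict[Pair, Tuple[float, float]]:
    """Return pairs present in both maps whose scores differ by more than tol."""
    return {k: (a[k], b[k]) for k in a.keys() & b.keys()
            if abs(a[k] - b[k]) > tol}


# Made-up sample rows standing in for two output files.
csv_a = "a,b,score\nx,y,0.25\nx,z,0.5\n"
csv_b = "a,b,score\nx,y,0.25\nx,z,0.75\n"
diff = mismatches(last_column_scores(csv_a), last_column_scores(csv_b))
print(diff)
```

For real files, the text would come from reading each output file, and the delimiter may need changing if the output is tab-separated.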

Why do the scores differ depending on the file format? Besides the presumed code refactoring, is there any difference between the scoring methodology used in FastOwlSim and SimpleOwlSim?
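For reference, SimGIC is usually defined as a Jaccard index weighted by information content (IC): the summed IC of the terms two profiles share, divided by the summed IC of all terms in either profile. The sketch below illustrates that definition only; it is not the FastOwlSim or SimpleOwlSim code, the function name `simgic` is hypothetical, and the term IDs and IC values are made up. It also assumes each term set already includes inferred ancestors.

```python
from typing import Dict, Set


def simgic(terms_a: Set[str], terms_b: Set[str], ic: Dict[str, float]) -> float:
    """IC-weighted Jaccard: sum of IC over shared terms / sum of IC over all terms.

    Assumption: terms_a and terms_b already contain inferred ancestor terms.
    """
    union = terms_a | terms_b
    denominator = sum(ic[t] for t in union)
    if denominator == 0.0:
        return 0.0  # no informative terms in either profile
    shared = terms_a & terms_b
    return sum(ic[t] for t in shared) / denominator


# Made-up terms and IC values for illustration.
ic = {"T1": 0.5, "T2": 1.0, "T3": 2.0, "T4": 2.5}
profile_a = {"T1", "T2", "T3"}
profile_b = {"T1", "T2", "T4"}
score = simgic(profile_a, profile_b, ic)
print(score)  # shared IC 1.5 / union IC 6.0 = 0.25
```

Two implementations that agree on this definition can still report different numbers if they differ in how IC is estimated or in how the result is scaled or rounded on output.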

cmungall commented 7 years ago

Hi @ManavalanG - we don't use owlsim2 any more, owlsim3 has its own package: https://github.com/monarch-initiative/owlsim-v3.

ManavalanG commented 7 years ago

Thanks @cmungall! Since owlsim3 is in active development without a release version yet, I wasn't sure it was mature enough for regular use. Is there any documentation available for owlsim3 besides the README in its GitHub repository?

cmungall commented 7 years ago

One of our goals is to have the API be as self-documenting as possible: http://owlsim3.monarchinitiative.org/api/docs/

We should also have the javadocs up soon on Maven Central.

But I think what you might like best might be something more aimed at command line usage?

ManavalanG commented 7 years ago

@cmungall - That would be great. Thanks! Yes, command line usage would be ideal, but RESTful API access would do, if I can figure it out. My primary interest is to compare a phenotypic profile to HPO-OMIM's data or Monarch's data and obtain a similarity score. I have been trying out the various similarity scoring systems offered by OwlSim3, but progress has been rather slow due to my very limited understanding of Java. Thanks again!