monarch-initiative / exomiser-phenotype-data-revised

Repo for generating Monarch data dependencies, such as those used by Exomiser

Phenodigm never terminates #2

Open matentzn opened 5 years ago

matentzn commented 5 years ago
OWLTOOLS_MEMORY=80G owltools --no-logging data/original_a/hp_hp.owl --sim-save-phenodigm-class-scores -m 2.5 -x HP,HP -a data/original_a/hp_hp_phenodigm_2_5.txt

For example here.

After 12 hours this command has still not moved on. On a hunch, I checked that data/original_a/hp_hp.owl is coherent (no unsatisfiable classes), and it is. What other reason could there be for the process to run for 12+ hours?

In the original config, 14 GB of memory was sufficient!

matentzn commented 5 years ago

@kshefchek any idea?

kshefchek commented 5 years ago

I've also noticed some very long run times with metazoa.owl and owltools - it takes ~12 hours to build the owlsim cache files whereas previously it was around 3-4. It could be a sign of a hardware issue or potentially some part of owltools is not scaling linearly.

I would let it run for a day and see if it finishes, and in the meantime we could also test it on one of our other machines.

matentzn commented 5 years ago

I will let it run again! Thanks!

matentzn commented 5 years ago

@cmungall @ShahimEssaid

Here is the pipeline: https://ci.monarchinitiative.org/view/pipelines/job/monarch-similarity-mapping/

You should use another server for this, as it negatively affects the dipper pipelines and can cause both of them to fail...

matentzn commented 5 years ago

Here is the corresponding Jenkins job: https://github.com/monarch-ebi-dev/monarch_ontology_data/blob/master/Jenkinsfile

matentzn commented 5 years ago

@ShahimEssaid from email:

I changed the pipeline to have: ROBOT_JAVA_ARGS = '-Xmx160G'
but the build log still shows: OWLTOOLS_MEMORY=80G owltools.....
It's probably set in the Makefile, but I didn't check. Can that be changed to use the ROBOT_JAVA_ARGS value instead, or some other new pipeline/docker variable if needed?
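The variable plumbing being asked for here can be sketched as follows. This is a hypothetical illustration only, not the actual Makefile logic: the function name and the 80G fallback are assumptions. The idea is to derive the `OWLTOOLS_MEMORY` value from the `-Xmx` setting in `ROBOT_JAVA_ARGS`, so a single Jenkins variable controls both tools.

```python
import re

def xmx_to_owltools_memory(robot_java_args, default="80G"):
    """Extract the -Xmx heap size (e.g. '160G') from a ROBOT_JAVA_ARGS
    string, falling back to a default when no -Xmx flag is present.
    Hypothetical helper; the real pipeline may plumb this differently."""
    match = re.search(r"-Xmx(\d+[GgMm])", robot_java_args or "")
    return match.group(1).upper() if match else default

# With the pipeline setting from this thread:
print(xmx_to_owltools_memory("-Xmx160G"))  # 160G
# With no -Xmx flag set, the default applies:
print(xmx_to_owltools_memory(None))        # 80G
```

In a Makefile this would amount to setting `OWLTOOLS_MEMORY` from the environment instead of hard-coding it.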
matentzn commented 5 years ago

Done: https://github.com/monarch-ebi-dev/monarch_ontology_data/blob/master/Makefile

ShahimEssaid commented 5 years ago

restarted the build

ShahimEssaid commented 5 years ago

This should show the docker memory usage for the running container: https://monarch-jenkins.cgrb.oregonstate.edu/job/tmp-shahim2/3/console

matentzn commented 5 years ago

Shahim, thanks so much. This was perfect and exactly what we needed. This pretty much proves that we have hit the limits of the owltools implementation of phenodigm/semantic similarity — it no longer scales sufficiently. We are already at 70 GB RAM and it is still climbing; and I know for a fact that this is not even the computationally worst part of this run. I think I need to go back down to the ontologies and reduce them to something more tractable.

@cmungall do you know whether the algorithm really needs anything but the class hierarchy? Could I run the reasoner, merge in the inferred classification, and then strip out everything apart from the classification to speed things up? Or would this alter the results of the semantic similarity algorithms we use?

kshefchek commented 5 years ago

my understanding is stripping out everything but subClassOf would be fine (for phenotypic similarity)
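As a toy illustration of that idea (this is not OWL API or owltools code; the tuple-based axiom representation is invented for the example): reducing an ontology to its classification amounts to keeping only the subclass and equivalence axioms and discarding everything else.

```python
# Toy sketch: "strip everything but the classification" modeled as
# filtering a list of axioms, each represented as a tuple whose first
# element is the axiom type. Purely illustrative.
KEEP = {"SubClassOf", "EquivalentClasses"}

def strip_to_classification(axioms):
    """Keep only the axiom types that encode the class hierarchy."""
    return [ax for ax in axioms if ax[0] in KEEP]

axioms = [
    ("SubClassOf", "HP:0000118", "HP:0000001"),
    ("EquivalentClasses", "HP:0012345", "HP:0054321"),
    ("DisjointClasses", "HP:0000001", "HP:0000118"),
    ("AnnotationAssertion", "rdfs:label", "HP:0000118", "Phenotypic abnormality"),
]

for ax in strip_to_classification(axioms):
    print(ax)
```

In practice this kind of filtering would be done with an ontology toolkit rather than by hand, but the principle — drop disjointness, annotations, and other logical axioms, retain subclass/equivalence — is the same.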

matentzn commented 5 years ago

Thanks, Kent.

It appears that when the memory consumption hit 97.05 GiB, it peaked; from there it dropped quickly to 100 MB before starting to rise again. However, from the log it seems to me that the process has not moved on yet.

I have now created a severely stripped-down version of the ontologies, consisting only of subclass and equivalence class axioms, as input for phenodigm (I updated the Makefile). @ShahimEssaid you could try to trigger another run with that config, but I would probably wait until tomorrow to interrupt your current run; I just want to know exactly how long a single phenodigm run takes to finish so we have a baseline to compare against. Let's see what we find!

matentzn commented 5 years ago

Ah wait! It finished!! Just saw it! AMAZING! So it was just that: memory needed to be 100 GB!

@ShahimEssaid can you interrupt the current run and restart it? I just want to see how fast this completes now. I would restart your memory-monitoring Jenkins job as well!

cmungall commented 5 years ago

Still seems weird; we are doing something profligate here.


matentzn commented 5 years ago

We are now using a version of the input ontology with only subclass and equivalence class axioms preserved. The memory footprint is considerably smaller (~30 GB). But yes, over time we will make sure we get to the bottom of the high memory consumption!