obophenotype / ncbitaxon

Build for NCBITaxon
BSD 3-Clause "New" or "Revised" License

Add taxon disjoints to subset #75

Closed anitacaron closed 1 year ago

anitacaron commented 1 year ago

Fixes #72

This includes (1) (in_taxon some X) DisjointWith (in_taxon some (not X)) for every taxon X
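
To make the axiom pattern above concrete, here is a minimal sketch that writes it out in OWL functional-style syntax for a small list of taxa. It is not the owltools implementation. The helper names are hypothetical, and the `RO:0002162` CURIE for `in_taxon` is an assumption that should be checked against the ontology's own prefix declarations.

```python
# Hypothetical helpers for illustration only; not code from this PR or from owltools.
IN_TAXON = "RO:0002162"  # assumed CURIE for the in_taxon object property


def taxon_disjoint_axiom(taxon: str, prop: str = IN_TAXON) -> str:
    """Return (prop some X) DisjointWith (prop some (not X)) as functional syntax."""
    in_x = f"ObjectSomeValuesFrom({prop} {taxon})"
    in_not_x = f"ObjectSomeValuesFrom({prop} ObjectComplementOf({taxon}))"
    return f"DisjointClasses({in_x} {in_not_x})"


def taxon_disjoint_axioms(taxa):
    """Build one axiom per taxon, so the output grows linearly with the input."""
    return [taxon_disjoint_axiom(t) for t in taxa]


if __name__ == "__main__":
    # Example taxon identifiers chosen for illustration.
    for axiom in taxon_disjoint_axioms(["NCBITaxon:9606", "NCBITaxon:10090"]):
        print(axiom)
```

The strings only become valid OWL once the `RO` and `NCBITaxon` prefixes are declared in the enclosing ontology document.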

~I could not test because I don't have the ncbitaxon.obo file, and the pipeline needs to download a file from a host that could not be resolved: curl: (6) Could not resolve host: ftp.ncbi.nih.gov~

matentzn commented 1 year ago

Awesome initiative! I am too young (imagine that) to know the exact nature of the disjoint subset, so I want to get the eyes of someone who has all the context on this:

Did you test with the current release files? https://github.com/obophenotype/ncbitaxon/releases/tag/v2023-02-24

anitacaron commented 1 year ago

But we still need to fix the pipeline, right?

anitacaron commented 1 year ago

I can download the file now; maybe it was only a network issue.

anitacaron commented 1 year ago

I'm running into memory issues when generating the disjoint file for the complete taxonomy.

I allocated 16G of memory to owltools.

root@4e411ec9b54f:/work# time make ncbitaxon-disjoint-over-in-taxon.owl
OWLTOOLS_MEMORY=16G owltools ncbitaxon.owl --create-taxon-disjoint-over-in-taxon --root NCBITaxon:1 --output ncbitaxon-disjoint-over-in-taxon.owl.tmp.owl
Exception in thread "main" java.lang.reflect.InvocationTargetException
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at owltools.cli.CommandRunner.runSingleIteration(CommandRunner.java:4779)
        at owltools.cli.CommandRunnerBase.run(CommandRunnerBase.java:76)
        at owltools.cli.CommandRunnerBase.run(CommandRunnerBase.java:68)
        at owltools.cli.CommandLineInterface.main(CommandLineInterface.java:12)
Caused by: java.lang.OutOfMemoryError: Java heap space
        at java.base/java.util.HashMap$KeySet.iterator(HashMap.java:913)
        at java.base/java.util.HashSet.iterator(HashSet.java:173)
        at java.base/java.util.AbstractCollection.toArray(AbstractCollection.java:140)
        at java.base/java.util.ArrayList.<init>(ArrayList.java:179)
        at org.semanticweb.owlapi.util.CollectionFactory.sortOptionally(CollectionFactory.java:131)
        at uk.ac.manchester.cs.owl.owlapi.OWLNaryClassAxiomImpl.<init>(OWLNaryClassAxiomImpl.java:56)
        at uk.ac.manchester.cs.owl.owlapi.OWLDisjointClassesAxiomImpl.<init>(OWLDisjointClassesAxiomImpl.java:42)
        at uk.ac.manchester.cs.owl.owlapi.OWLDataFactoryImpl.getOWLDisjointClassesAxiom(OWLDataFactoryImpl.java:935)
        at uk.ac.manchester.cs.owl.owlapi.OWLDataFactoryImpl.getOWLDisjointClassesAxiom(OWLDataFactoryImpl.java:962)
        at uk.ac.manchester.cs.owl.owlapi.OWLDataFactoryImpl.getOWLDisjointClassesAxiom(OWLDataFactoryImpl.java:972)
        at owltools.cli.TaxonCommandRunner.createDisjoint(TaxonCommandRunner.java:307)
        at owltools.cli.TaxonCommandRunner.createTaxonDisjointOverInTaxon(TaxonCommandRunner.java:257)
        ... 8 more
make: *** [Makefile:81: ncbitaxon-disjoint-over-in-taxon.owl] Error 1

real    74m11.658s
user    538m39.829s
sys     2m19.121s
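
As a rough aid for reading the timing figures above, the sketch below converts the `time` output to seconds and computes the ratio of CPU time to wall-clock time. The function names are hypothetical. A ratio well above 1 means several cores were busy on average. One possible reading, not confirmed in this thread, is that garbage-collector threads were working hard as the heap filled up.

```python
import re


def parse_time(value: str) -> float:
    """Convert a shell `time` value such as '74m11.658s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value.strip())
    if match is None:
        raise ValueError(f"unrecognised time value: {value!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def cpu_to_wall_ratio(real: str, user: str, system: str) -> float:
    """Return (user + sys) / real, i.e. the average number of busy cores."""
    return (parse_time(user) + parse_time(system)) / parse_time(real)


if __name__ == "__main__":
    # Values copied from the log above.
    ratio = cpu_to_wall_ratio("74m11.658s", "538m39.829s", "2m19.121s")
    print(f"average busy cores: {ratio:.2f}")
```

For the run above the ratio comes out at roughly 7.3.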

matentzn commented 1 year ago

I can imagine. Let's drop the full one for now and leave a comment on the issue.