callahantiff / PheKnowLator

PheKnowLator: Heterogeneous Biomedical Knowledge Graphs and Benchmarks Constructed Under Alternative Semantic Models
https://github.com/callahantiff/PheKnowLator/wiki
Apache License 2.0

Cannot Apply OWLAPI Formatting to Very Large KGs #51

Closed: callahantiff closed this issue 3 years ago

callahantiff commented 4 years ago

Problem: When applying OWLAPI formatting to very large KGs of ~86 million triples (i.e., the subclass + inverse relations KG with non-ontology metadata added), the OWLAPI hangs and the process never completes. See the error message from Dave Farrell below:

The current process that is in the “S” state is hung on a futex wait. At one point, there was a thread with a process ID of 3912 that was holding a resource. The thread has ended or crashed, but it did not release the hold, which is why, I believe, the current parent process 3907 is “waiting/sleeping”. In Java, methods such as lock(), park() or unpark() use futex_wait(). strace is the utility that can help track these events, but I do not know how to use it to pinpoint the actual resource being blocked. The command that I used to see what was happening with your process was:

strace -p 3907
strace: Process 3907 attached
futex(0x7f77b8f7f9d0, FUTEX_WAIT, 3912, NULL

Script: knowledge_graph.py
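One defensive option (just a sketch, not what knowledge_graph.py currently does): wrap the OWLAPI-based formatting step in a subprocess with a timeout, so a futex-wait hang fails fast instead of blocking the build indefinitely. The owltools command below is a placeholder for whatever CLI call the build actually makes:

```python
import subprocess

# Placeholder command; the real build invokes the OWLAPI via its own CLI call.
OWLAPI_CMD = ['./owltools', 'PheKnowLator_full.owl', '-o', 'PheKnowLator_formatted.owl']

def run_owlapi_step(cmd, timeout_hours=6):
    """Run an OWLAPI formatting command, raising instead of hanging forever."""
    try:
        subprocess.run(cmd, check=True, timeout=timeout_hours * 3600)
    except subprocess.TimeoutExpired:
        # Likely the futex-wait hang described above; surface it so the
        # build can fall back to skipping OWLAPI formatting.
        raise RuntimeError('OWLAPI formatting timed out; see issue #51')
```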

Current Solution: Do not add non-ontology metadata to the KG, leaving the rest of the KG build workflow intact. All ontology and non-ontology metadata (i.e., labels, definitions, and synonyms) are written out to a .txt file, so the information is still available to users, just not as part of the KG.
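For anyone curious, the workaround looks roughly like the sketch below: pull the metadata triples out with rdflib and write them to a tab-delimited .txt file. The three properties shown (rdfs:label, the IAO definition property, and oboInOwl exact synonyms) are assumptions; the actual build may externalize more:

```python
from rdflib import Graph, URIRef
from rdflib.namespace import RDFS

# Assumed metadata properties; the real build may handle additional ones.
DEFINITION = URIRef('http://purl.obolibrary.org/obo/IAO_0000115')
SYNONYM = URIRef('http://www.geneontology.org/formats/oboInOwl#hasExactSynonym')

def write_node_metadata(kg_file, out_file='node_metadata.txt'):
    """Dump labels, definitions, and synonyms to a tab-delimited .txt file."""
    graph = Graph()
    graph.parse(kg_file)
    with open(out_file, 'w') as out:
        out.write('node\tmetadata_type\tvalue\n')
        for prop, name in [(RDFS.label, 'label'),
                           (DEFINITION, 'definition'),
                           (SYNONYM, 'synonym')]:
            for node, _, value in graph.triples((None, prop, None)):
                out.write(f'{node}\t{name}\t{value}\n')
```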

callahantiff commented 4 years ago

Update: In addition to seeing this error on Tantor, I have replicated it on a GCP instance with more than 2x as much memory.

ignaciot commented 4 years ago

Wanna try this on fiji? I can help with the job setup.

-i


callahantiff commented 4 years ago

Wanna try this on fiji? I can help with the job setup.

Hey @ignaciot! We certainly could, although I think running a Docker container on Fiji is not super straightforward. The Docker container is running fine on the GCP instance I built. This issue is more related to some weirdness with the OWLAPI and the fact that I think we might be trying to run a graph bigger than has ever been tested on the tool.

We have a good workaround for now, so I think we are OK. @bill-baumgartner and I discussed leaving the issue open for now as a reminder 😄

ignaciot commented 4 years ago

Ahh gotcha. Yes, we might have to try it using Singularity in that case, I’ll leave it alone then!

-i
