bbglab / intogen-plus

a framework for automatic and comprehensive knowledge extraction based on mutational data from sequenced tumor samples from patients.
https://www.intogen.org/search

IntOGen Plus | Oncodrive3D implementation and testing #10

Open FedericaBrando opened 10 months ago

FedericaBrando commented 10 months ago

Testing - HARTWIG_TCGA

The pipeline gets stuck, as if resources were shared rather than fully allocated to a single process.

Oncodrive3D is running on several nodes. One of its steps uses multiprocessing to split a for loop across separate processes, one per allocated CPU (14 cores in this case).
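
For readers unfamiliar with this pattern, here is a minimal sketch of splitting a loop over structures across one process per core. It is not the Oncodrive3D code: `split_into_chunks`, `process_chunk` and `run_parallel` are hypothetical names, and the per-structure work is a placeholder.

```python
import multiprocessing as mp


def split_into_chunks(items, n_chunks):
    """Split items into n_chunks contiguous parts whose sizes differ by at most one."""
    base, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def process_chunk(args):
    """Hypothetical worker: stands in for the per-structure clustering loop."""
    worker_id, structures = args
    # Placeholder work only; the real tool would run its clustering step here.
    return worker_id, [len(name) for name in structures]


def run_parallel(structures, n_cores):
    """Run one worker process per chunk and return results in chunk order."""
    chunks = split_into_chunks(structures, n_cores)
    with mp.Pool(processes=n_cores) as pool:
        return pool.map(process_chunk, list(enumerate(chunks, start=1)))
```

Calling `run_parallel(structures, n_cores=14)` from under an `if __name__ == "__main__":` guard would mirror the 14-process setup shown in the logs below. Each worker receives a fixed share of the structures up front, so a single stalled or lost worker is enough to keep the whole step from finishing.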

However, some processes have been stuck in this loop for 2 days, and in some instances the logs look very weird:

Node bbgn019

2023-10-27 15:07:18,541 - INFO    | oncodrive3d - ######################################################################
2023-10-27 15:07:18,541 - INFO    | oncodrive3d - #                                                                    #
2023-10-27 15:07:18,542 - INFO    | oncodrive3d - #                      Welcome to Oncodrive3D!                       #
2023-10-27 15:07:18,542 - INFO    | oncodrive3d - #                                                                    #
2023-10-27 15:07:18,543 - INFO    | oncodrive3d - #                      Initializing analysis...                      #
2023-10-27 15:07:18,543 - INFO    | oncodrive3d - #                        Version: 2023.08.23                         #
2023-10-27 15:07:18,543 - INFO    | oncodrive3d - #          Author: Biomedical Genomics Lab - IRB Barcelona           #
2023-10-27 15:07:18,544 - INFO    | oncodrive3d - #            Support: stefano.pellegrini@irbbarcelona.org            #
2023-10-27 15:07:18,544 - INFO    | oncodrive3d - #                                                                    #
2023-10-27 15:07:18,545 - INFO    | oncodrive3d - ######################################################################
2023-10-27 15:07:18,545 - INFO    | oncodrive3d - 
2023-10-27 15:07:18,546 - INFO    | oncodrive3d - Input MAF: TCGA_WXS_HNSC.in.tsv.gz
2023-10-27 15:07:18,546 - INFO    | oncodrive3d - Input mut profile: TCGA_WXS_HNSC.sig.json
2023-10-27 15:07:18,547 - INFO    | oncodrive3d - Build directory: /workspace/projects/intogen_plus/fixdatasets-20230223/intogen-plus-dev-o3d/datasets/oncodrive3d
2023-10-27 15:07:18,547 - INFO    | oncodrive3d - Output directory: .
2023-10-27 15:07:18,548 - DEBUG   | oncodrive3d - Path to CMAPs: /workspace/projects/intogen_plus/fixdatasets-20230223/intogen-plus-dev-o3d/datasets/oncodrive3d/prob_cmaps
2023-10-27 15:07:18,548 - DEBUG   | oncodrive3d - Path to DNA sequences: /workspace/projects/intogen_plus/fixdatasets-20230223/intogen-plus-dev-o3d/datasets/oncodrive3d/seq_for_mut_prob.csv
2023-10-27 15:07:18,549 - DEBUG   | oncodrive3d - Path to PAE: /workspace/projects/intogen_plus/fixdatasets-20230223/intogen-plus-dev-o3d/datasets/oncodrive3d/pae
2023-10-27 15:07:18,549 - DEBUG   | oncodrive3d - Path to pLDDT scores: /workspace/projects/intogen_plus/fixdatasets-20230223/intogen-plus-dev-o3d/datasets/oncodrive3d/confidence.csv
2023-10-27 15:07:18,550 - INFO    | oncodrive3d - CPU cores: 14
2023-10-27 15:07:18,550 - INFO    | oncodrive3d - Iterations: 10000
2023-10-27 15:07:18,551 - INFO    | oncodrive3d - Significant level: 0.01
2023-10-27 15:07:18,551 - INFO    | oncodrive3d - Probability threshold for CMAPs: 0.5
2023-10-27 15:07:18,552 - INFO    | oncodrive3d - Disable fragments: False
2023-10-27 15:07:18,552 - INFO    | oncodrive3d - Output only processed genes: True
2023-10-27 15:07:18,553 - INFO    | oncodrive3d - Cohort: TCGA_WXS_HNSC
2023-10-27 15:07:18,553 - INFO    | oncodrive3d - Cancer type: HNSC
2023-10-27 15:07:18,554 - INFO    | oncodrive3d - Verbose: True
2023-10-27 15:07:18,554 - INFO    | oncodrive3d - Seed: 123
2023-10-27 15:07:18,555 - INFO    | oncodrive3d - Log path: ./log
2023-10-27 15:07:18,555 - INFO    | oncodrive3d - 
2023-10-27 15:07:18,556 - DEBUG   | oncodrive3d.utils.utils - Reading input MAF...
2023-10-27 15:07:18,846 - DEBUG   | oncodrive3d.utils.utils - Processing [82100] total mutations...
2023-10-27 15:07:18,917 - DEBUG   | oncodrive3d.utils.utils - Processing [54212] missense mutations...
2023-10-27 15:07:24,944 - DEBUG   | oncodrive3d - Detected [4063] genes without enough mutations: Skipping...
2023-10-27 15:07:36,076 - DEBUG   | oncodrive3d - Detected [13] genes without IDs mapping: Skipping...
2023-10-27 15:07:36,079 - INFO    | oncodrive3d - Computing missense mut probabilities...
2023-10-27 15:08:45,148 - INFO    | oncodrive3d - Performing 3D-clustering on [10561] proteins...
2023-10-27 15:08:45,351 - DEBUG   | oncodrive3d.utils.clustering - Starting [14] processes...
2023-10-27 15:09:20,552 - DEBUG   | oncodrive3d.utils.clustering - Process [1] starting...
[...]
2023-10-27 15:13:04,903 - DEBUG   | oncodrive3d.utils.clustering - Process [7] starting...
[...]
2023-10-27 15:21:18,970 - DEBUG   | oncodrive3d.utils.clustering - Process [8] completed [141/736] structures...
2023-10-27 15:21:21,080 - DEBUG   | oncodrive3d.utils.clustering - Process [3] completed [261/736] structures...

It has been stuck there for two days. The same happens with another process:

2023-10-27 17:44:53,693 - INFO    | oncodrive3d - ######################################################################
2023-10-27 17:44:53,693 - INFO    | oncodrive3d - #                                                                    #
2023-10-27 17:44:53,694 - INFO    | oncodrive3d - #                      Welcome to Oncodrive3D!                       #
2023-10-27 17:44:53,694 - INFO    | oncodrive3d - #                                                                    #
2023-10-27 17:44:53,695 - INFO    | oncodrive3d - #                      Initializing analysis...                      #
2023-10-27 17:44:53,695 - INFO    | oncodrive3d - #                        Version: 2023.08.23                         #
2023-10-27 17:44:53,696 - INFO    | oncodrive3d - #          Author: Biomedical Genomics Lab - IRB Barcelona           #
2023-10-27 17:44:53,696 - INFO    | oncodrive3d - #            Support: stefano.pellegrini@irbbarcelona.org            #
2023-10-27 17:44:53,697 - INFO    | oncodrive3d - #                                                                    #
2023-10-27 17:44:53,697 - INFO    | oncodrive3d - ######################################################################
2023-10-27 17:44:53,697 - INFO    | oncodrive3d - 
2023-10-27 17:44:53,698 - INFO    | oncodrive3d - Input MAF: TCGA_WXS_CCRCC.in.tsv.gz
2023-10-27 17:44:53,699 - INFO    | oncodrive3d - Input mut profile: TCGA_WXS_CCRCC.sig.json
[...]
2023-10-27 17:47:52,672 - INFO    | oncodrive3d - Computing missense mut probabilities...
2023-10-27 18:13:06,252 - INFO    | oncodrive3d - Performing 3D-clustering on [3660] proteins...
2023-10-27 18:13:12,476 - DEBUG   | oncodrive3d.utils.clustering - Starting [14] processes...
[...]
2023-10-27 22:42:57,957 - DEBUG   | oncodrive3d.utils.clustering - Process [5] completed [111/255] structures...
NodeName=bbgn019 Arch=x86_64 CoresPerSocket=14
   CPUAlloc=48 CPUErr=0 CPUTot=56 CPULoad=35.18
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   GresDrain=N/A
   NodeAddr=bbgn019 NodeHostName=bbgn019 Version=16.05
   OS=Linux RealMemory=512000 AllocMem=360448 FreeMem=429695 Sockets=2 Boards=1
   State=MIXED ThreadsPerCore=2 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   BootTime=2023-06-09T13:23:10 SlurmdStartTime=2016-01-01T01:05:30
   CapWatts=n/a
   CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
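
To check the "shared resources" hypothesis against the node report above, the `Key=Value` fields can be parsed and turned into allocation fractions. This is a small sketch with hypothetical helper names (`parse_scontrol_node`, `allocation_summary`); it assumes the report is the text printed by `scontrol show node` and that `RealMemory` and `AllocMem` use the same unit (Slurm reports both in megabytes).

```python
import re


def parse_scontrol_node(text):
    """Parse `Key=Value` tokens from a Slurm node report into a dict of strings."""
    return dict(re.findall(r"(\w+)=(\S+)", text))


def allocation_summary(fields):
    """Return the fractions of CPUs and memory that Slurm has allocated on the node."""
    cpu_fraction = int(fields["CPUAlloc"]) / int(fields["CPUTot"])
    mem_fraction = int(fields["AllocMem"]) / int(fields["RealMemory"])
    return cpu_fraction, mem_fraction


# Values copied from the bbgn019 report above.
report = """NodeName=bbgn019 Arch=x86_64 CoresPerSocket=14
   CPUAlloc=48 CPUErr=0 CPUTot=56 CPULoad=35.18
   OS=Linux RealMemory=512000 AllocMem=360448 FreeMem=429695 Sockets=2 Boards=1"""

cpu_fraction, mem_fraction = allocation_summary(parse_scontrol_node(report))
print(f"CPUs allocated: {cpu_fraction:.1%}, memory allocated: {mem_fraction:.1%}")
```

With these numbers, 48 of 56 CPUs and 360448 of 512000 memory units are allocated, so the node is busy but not oversubscribed.
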
FedericaBrando commented 10 months ago

Pinging @migrau.

Ferriol told me you had a similar issue with the deepUMI pipeline. How did you solve it?

FedericaBrando commented 10 months ago

Here is the loop that uses multiprocessing: https://github.com/bbglab/clustering_3d/blob/b70857d1f88b215f097d6bf0351a8ce4f8ac5191/scripts/utils/clustering.py#L290C3-L306C9

FedericaBrando commented 10 months ago

Never mind: I restarted the pipeline and the processes failed because the memory limit was exceeded. I increased the memory for the Oncodrive3D process to 32 and it did not happen again.
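
One possible reading of the original hang, not confirmed in this thread, is that a worker was killed for exceeding its memory limit while the parent kept waiting for its result. The sketch below shows how `concurrent.futures.ProcessPoolExecutor` reports an abruptly terminated worker as `BrokenProcessPool` instead of waiting forever. The functions are hypothetical, `os._exit` only simulates an external kill, and the `fork` start method assumes a Linux node.

```python
import multiprocessing as mp
import os
from concurrent.futures import ProcessPoolExecutor
from concurrent.futures.process import BrokenProcessPool


def square(x):
    """Well-behaved worker used for the success path."""
    return x * x


def die_abruptly(x):
    """Simulate a worker killed from outside: exit without returning or raising."""
    os._exit(1)


def run_checked(func, items, max_workers=2):
    """Return (True, results) on success, or (False, None) if a worker died abruptly."""
    # Assumption: "fork" is available, as on the Linux node reported above.
    ctx = mp.get_context("fork")
    try:
        with ProcessPoolExecutor(max_workers=max_workers, mp_context=ctx) as pool:
            return True, list(pool.map(func, items))
    except BrokenProcessPool:
        # The pool noticed that a worker process vanished; fail fast.
        return False, None
```

With this pattern a lost worker surfaces as an error that the pipeline can log and retry with more memory, rather than as a job that sits idle for days.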