FCP-INDI / cpac_run_logs

Repository for information about C-PAC runs.

C-PAC v1.8.1-dev ― Debug CUBIC #25

Closed shnizzedy closed 2 years ago

shnizzedy commented 2 years ago

Description

Testing https://github.com/FCP-INDI/C-PAC/commit/60c798c on CUBIC for https://github.com/FCP-INDI/C-PAC/issues/1556

I think I lost the working directory and outputs-so-far when I killed the job :sob:

Version

v1.8.1-dev @https://github.com/FCP-INDI/C-PAC/commit/60c798c

Container

fix_multiple_custom_regressors

System

SGE on CentOS Linux release 7.7.1908 (Core)

Data Size

375

Results

Hanging at

         [LegacyMultiProc] Running 4 tasks, and 4442 jobs ready. Free memory (GB): 44.67/52.00, Free processors: 0/4.
                     Currently running:
                       * _func_generate_ref_863
                       * _func_generate_ref_862
                       * _func_generate_ref_861
                       * _func_generate_ref_860

Run Command:

qsub -cwd -v DSLOCKFILE=/cbica/projects/RBC/CPAC-1.8.1-TESTING/bootstrapped/hbn_exemplars-60c798c/c-pac-1.8.1-dev/analysis/.SGE_datalad_lock -N c-pac_sub-NDARZP630WYL -e /cbica/projects/RBC/CPAC-1.8.1-TESTING/bootstrapped/hbn_exemplars-60c798c/c-pac-1.8.1-dev/analysis/logs -o /cbica/projects/RBC/CPAC-1.8.1-TESTING/bootstrapped/hbn_exemplars-60c798c/c-pac-1.8.1-dev/analysis/logs   /cbica/projects/RBC/CPAC-1.8.1-TESTING/bootstrapped/hbn_exemplars-60c798c/c-pac-1.8.1-dev/analysis/code/participant_job.sh   ria+file:///cbica/projects/RBC/CPAC-1.8.1-TESTING/bootstrapped/hbn_exemplars-60c798c/c-pac-1.8.1-dev/input_ria#d5bcf6a5-cf00-42dc-b564-a6d58fb511b6 /cbica/projects/RBC/CPAC-1.8.1-TESTING/bootstrapped/hbn_exemplars-60c798c/c-pac-1.8.1-dev/output_ria/d5b/cf6a5-cf00-42dc-b564-a6d58fb511b6 sub-NDARZP630WYL

with participant_job.sh and c-pac_zip.sh

Pipeline Config

fx-options

Data Config

BIDS :file_folder::

sub-NDARZP630WYL
└── ses-HBNsiteRU
    ├── anat
    │   ├── sub-NDARZP630WYL_ses-HBNsiteRU_acq-HCP_T1w.json
    │   ├── sub-NDARZP630WYL_ses-HBNsiteRU_acq-HCP_T1w.nii.gz -> ../../../.git/annex/objects/Wv/wX/MD5E-s21430689--b5d2112dbbd0db76239b90201895475c.nii.gz/MD5E-s21430689--b5d2112dbbd0db76239b90201895475c.nii.gz
    │   ├── sub-NDARZP630WYL_ses-HBNsiteRU_acq-HCP_T2w.json
    │   └── sub-NDARZP630WYL_ses-HBNsiteRU_acq-HCP_T2w.nii.gz -> ../../../.git/annex/objects/2q/gm/MD5E-s19053776--a242e3badcbcbbfb3c3e080073b59e8a.nii.gz/MD5E-s19053776--a242e3badcbcbbfb3c3e080073b59e8a.nii.gz
    ├── dwi
    │   ├── sub-NDARZP630WYL_ses-HBNsiteRU_acq-64dir_dwi.bval
    │   ├── sub-NDARZP630WYL_ses-HBNsiteRU_acq-64dir_dwi.bvec
    │   ├── sub-NDARZP630WYL_ses-HBNsiteRU_acq-64dir_dwi.json
    │   └── sub-NDARZP630WYL_ses-HBNsiteRU_acq-64dir_dwi.nii.gz -> ../../../.git/annex/objects/qj/G6/MD5E-s96893104--6637b1701259adaa9783622ac50306a1.nii.gz/MD5E-s96893104--6637b1701259adaa9783622ac50306a1.nii.gz
    ├── fmap
    │   ├── sub-NDARZP630WYL_ses-HBNsiteRU_acq-dwi_dir-AP_epi.json
    │   ├── sub-NDARZP630WYL_ses-HBNsiteRU_acq-dwi_dir-AP_epi.nii.gz -> ../../../.git/annex/objects/X8/7g/MD5E-s899924--f6612d2b2603aa07ba1a3375f0c15342.nii.gz/MD5E-s899924--f6612d2b2603aa07ba1a3375f0c15342.nii.gz
    │   ├── sub-NDARZP630WYL_ses-HBNsiteRU_acq-dwi_dir-PA_epi.json
    │   ├── sub-NDARZP630WYL_ses-HBNsiteRU_acq-dwi_dir-PA_epi.nii.gz -> ../../../.git/annex/objects/Kv/zx/MD5E-s916683--730556c353b27cc4035e4a39fb0906d4.nii.gz/MD5E-s916683--730556c353b27cc4035e4a39fb0906d4.nii.gz
    │   ├── sub-NDARZP630WYL_ses-HBNsiteRU_acq-fMRI_dir-AP_epi.json
    │   ├── sub-NDARZP630WYL_ses-HBNsiteRU_acq-fMRI_dir-AP_epi.nii.gz -> ../../../.git/annex/objects/G8/xf/MD5E-s560908--40ad7c862ad7dde9cb5646d785bc6842.nii.gz/MD5E-s560908--40ad7c862ad7dde9cb5646d785bc6842.nii.gz
    │   ├── sub-NDARZP630WYL_ses-HBNsiteRU_acq-fMRI_dir-PA_epi.json
    │   └── sub-NDARZP630WYL_ses-HBNsiteRU_acq-fMRI_dir-PA_epi.nii.gz -> ../../../.git/annex/objects/Xw/Q4/MD5E-s558790--47872c6fe9e243f305e3f6eeb88960a9.nii.gz/MD5E-s558790--47872c6fe9e243f305e3f6eeb88960a9.nii.gz
    └── func
        ├── sub-NDARZP630WYL_ses-HBNsiteRU_task-movieDM_bold.json
        ├── sub-NDARZP630WYL_ses-HBNsiteRU_task-movieDM_bold.nii.gz -> ../../../.git/annex/objects/9M/k1/MD5E-s346061016--e80a5c2d6fe30d6c03d96f96c2709cdb.nii.gz/MD5E-s346061016--e80a5c2d6fe30d6c03d96f96c2709cdb.nii.gz
        ├── sub-NDARZP630WYL_ses-HBNsiteRU_task-movieDM_events.tsv
        ├── sub-NDARZP630WYL_ses-HBNsiteRU_task-movieTP_bold.json
        ├── sub-NDARZP630WYL_ses-HBNsiteRU_task-movieTP_bold.nii.gz -> ../../../.git/annex/objects/wq/Fw/MD5E-s118101657--fa19e55b22c4f16d292d932a2fddcb6d.nii.gz/MD5E-s118101657--fa19e55b22c4f16d292d932a2fddcb6d.nii.gz
        ├── sub-NDARZP630WYL_ses-HBNsiteRU_task-movieTP_events.tsv
        ├── sub-NDARZP630WYL_ses-HBNsiteRU_task-peer_run-1_bold.json
        ├── sub-NDARZP630WYL_ses-HBNsiteRU_task-peer_run-1_bold.nii.gz -> ../../../.git/annex/objects/xg/6Z/MD5E-s62458379--47ad936d7acca3d3f76594f01e6dc3b3.nii.gz/MD5E-s62458379--47ad936d7acca3d3f76594f01e6dc3b3.nii.gz
        ├── sub-NDARZP630WYL_ses-HBNsiteRU_task-peer_run-2_bold.json
        ├── sub-NDARZP630WYL_ses-HBNsiteRU_task-peer_run-2_bold.nii.gz -> ../../../.git/annex/objects/m1/Ff/MD5E-s62476666--a03ebad152c9141c004b94ef9e28fab7.nii.gz/MD5E-s62476666--a03ebad152c9141c004b94ef9e28fab7.nii.gz
        ├── sub-NDARZP630WYL_ses-HBNsiteRU_task-rest_run-1_bold.json
        ├── sub-NDARZP630WYL_ses-HBNsiteRU_task-rest_run-1_bold.nii.gz -> ../../../.git/annex/objects/kj/MM/MD5E-s173104730--f814a25c082ca01c2cce7faf421bfacc.nii.gz/MD5E-s173104730--f814a25c082ca01c2cce7faf421bfacc.nii.gz
        ├── sub-NDARZP630WYL_ses-HBNsiteRU_task-rest_run-2_bold.json
        └── sub-NDARZP630WYL_ses-HBNsiteRU_task-rest_run-2_bold.nii.gz -> ../../../.git/annex/objects/Wk/q3/MD5E-s174457236--7a743b573fb5225303256732c60e642d.nii.gz/MD5E-s174457236--7a743b573fb5225303256732c60e642d.nii.gz

Default Pipeline Diff

No response

Screenshots of brain extraction and registration wireframe overlays from QC pages

No response

Node timing information

No response

Extracted time series 1D and nuisance regressors 1D correlations against previous version or some benchmark

No response

shnizzedy commented 2 years ago

hangs at

210920-22:51:42,238 nipype.workflow INFO:
         [MultiProc] Running 0 tasks, and 6 jobs ready. Free memory (GB): 51.21/52.00, Free processors: 4/4.

when running multicore with https://github.com/FCP-INDI/C-PAC/commit/07c94072ae0476363c2f1aec3338c8244f20ac5f

full pypeline.log and callback.log and output tree

shnizzedy commented 2 years ago

Completes without error when run linearly.

shnizzedy commented 2 years ago

:thinking: but in the linear callback log the func_generate_ref nodes use less than 1 GB each:

{"id": "cpac_sub-NDARZP630WYL_ses-HBNsiteRU.func_generate_ref_86", "hash": "210cb86e25058d9eccb18a775e745d19"}
{"id": "cpac_sub-NDARZP630WYL_ses-HBNsiteRU.func_generate_ref_99", "hash": "210cb86e25058d9eccb18a775e745d19"}
{"id": "cpac_sub-NDARZP630WYL_ses-HBNsiteRU.func_generate_ref_86", "hash": "98d5e625ea0de731fd9a4de553267364", "start": "2021-09-16T22:38:53.749713", "finish": "2021-09-16T22:56:33.528740", "runtime_threads": 2, "runtime_memory_gb": 0.5995941162109375, "estimated_memory_gb": 1.655685372748095, "num_threads": 1}
{"id": "cpac_sub-NDARZP630WYL_ses-HBNsiteRU.func_generate_ref_99", "hash": "646ab427197b7fd0215bf4fb1b981f80", "start": "2021-09-16T23:13:18.576898", "finish": "2021-09-16T23:24:09.192114", "runtime_threads": 2, "runtime_memory_gb": 0.60399627734375, "estimated_memory_gb": 1.655685372748095, "num_threads": 1}
{"id": "cpac_sub-NDARZP630WYL_ses-HBNsiteRU.func_generate_ref_86", "hash": "9a174a808b0782ecd3c29760c1fc3500", "start": "2021-09-16T23:27:20.342901", "finish": "2021-09-16T23:38:55.963222", "runtime_threads": 2, "runtime_memory_gb": 0.6040496826171875, "estimated_memory_gb": 1.655685372748095, "num_threads": 1}
{"id": "cpac_sub-NDARZP630WYL_ses-HBNsiteRU.func_generate_ref_99", "hash": "8df0d50dd4a924b5072e9cfd3368e57c", "start": "2021-09-16T23:55:55.470750", "finish": "2021-09-17T00:07:18.506759", "runtime_threads": 2, "runtime_memory_gb": 0.6076202392578125, "estimated_memory_gb": 1.655685372748095, "num_threads": 1}
{"id": "cpac_sub-NDARZP630WYL_ses-HBNsiteRU.func_generate_ref_86", "hash": "a8854e3e011480105c734d088649781a", "start": "2021-09-17T00:09:58.722721", "finish": "2021-09-17T00:13:58.843175", "runtime_threads": 2, "runtime_memory_gb": 0.4946441650390625, "estimated_memory_gb": 1.655685372748095, "num_threads": 1}
{"id": "cpac_sub-NDARZP630WYL_ses-HBNsiteRU.func_generate_ref_99", "hash": "1754cf9c5e3787881c1c2ce1b92beafc", "start": "2021-09-17T00:26:05.631584", "finish": "2021-09-17T00:30:03.338570", "runtime_threads": 1, "runtime_memory_gb": 0.4183235166015625, "estimated_memory_gb": 1.655685372748095, "num_threads": 1}
{"id": "cpac_sub-NDARZP630WYL_ses-HBNsiteRU.func_generate_ref_86", "hash": "3c341a0e09b0ed5ed0fca675923345a1", "start": "2021-09-17T00:31:23.645846", "finish": "2021-09-17T00:35:24.410435", "runtime_threads": 1, "runtime_memory_gb": 0.4190940859375, "estimated_memory_gb": 1.655685372748095, "num_threads": 1}
{"id": "cpac_sub-NDARZP630WYL_ses-HBNsiteRU.func_generate_ref_99", "hash": "edc002f3a596eccdcbca51859b290995", "start": "2021-09-17T00:47:23.394382", "finish": "2021-09-17T00:51:33.763542", "runtime_threads": 2, "runtime_memory_gb": 0.4202690126953125, "estimated_memory_gb": 1.655685372748095, "num_threads": 1}
{"id": "cpac_sub-NDARZP630WYL_ses-HBNsiteRU.func_generate_ref_86", "hash": "7c96373b087e9175dc7d71fdc960c530", "start": "2021-09-17T00:53:10.180471", "finish": "2021-09-17T01:01:03.985834", "runtime_threads": 2, "runtime_memory_gb": 0.51187515234375, "estimated_memory_gb": 1.655685372748095, "num_threads": 1}
{"id": "cpac_sub-NDARZP630WYL_ses-HBNsiteRU.func_generate_ref_99", "hash": "75c347de614ccce55fe60e1c8fb05aae", "start": "2021-09-17T01:15:26.067850", "finish": "2021-09-17T01:23:13.806732", "runtime_threads": 1, "runtime_memory_gb": 0.514621734375, "estimated_memory_gb": 1.655685372748095, "num_threads": 1}
{"id": "cpac_sub-NDARZP630WYL_ses-HBNsiteRU.func_generate_ref_86", "hash": "b50687c8ab68bf5f80356e4908620a90", "start": "2021-09-17T01:26:37.195736", "finish": "2021-09-17T01:49:34.595845", "runtime_threads": 2, "runtime_memory_gb": 0.90906524609375, "estimated_memory_gb": 1.655685372748095, "num_threads": 1}
{"id": "cpac_sub-NDARZP630WYL_ses-HBNsiteRU.func_generate_ref_99", "hash": "b829b40337ebbed9b5ca3403e6d86722", "start": "2021-09-17T02:15:45.294970", "finish": "2021-09-17T02:38:25.855353", "runtime_threads": 2, "runtime_memory_gb": 0.918277740234375, "estimated_memory_gb": 1.655685372748095, "num_threads": 1}

shnizzedy commented 2 years ago

From the report.rst files with the highest reported peak memory usage:

nuisance_regressors_Regressor-1_149/_scan_movieDM/Functional_2mm_flirt/_report/report.rst

Runtime info

  • cmdline : flirt -in /outputs/working/cpac_sub-NDARZP630WYL_ses-HBNsiteRU/func_slice_timing_correction_91/_scan_movieDM/slice_timing/MD5E-s346061016--e80a5c2d6fe30d6c03d96f96c2709cdb_calc_tshift.nii.gz -ref /outputs/working/cpac_sub-NDARZP630WYL_ses-HBNsiteRU/nuisance_regressors_Regressor-1_149/Anatomical_2mm_flirt/MD5E-s21430689--b5d2112dbbd0db76239b90201895475c_resample_corrected_masked_xform_calc_flirt.nii.gz -out MD5E-s346061016--e80a5c2d6fe30d6c03d96f96c2709cdb_calc_tshift_flirt.nii.gz -omat MD5E-s346061016--e80a5c2d6fe30d6c03d96f96c2709cdb_calc_tshift_flirt.mat -applyxfm -init /outputs/working/cpac_sub-NDARZP630WYL_ses-HBNsiteRU/func_to_anat_bbreg_124/_scan_movieDM/bbreg_func_to_anat/uni_masked_flirt.mat
  • cpu_percent : 121.7
  • duration : 311.323391
  • hostname : 2119fmn023
  • mem_peak_gb : 7.532283783203125
  • prev_wd : /cbica/comp_space/RBC/job-9798091-sub-NDARZP630WYL/ds
  • working_dir : /outputs/working/cpac_sub-NDARZP630WYL_ses-HBNsiteRU/nuisance_regressors_Regressor-1_149/_scan_movieDM/Functional_2mm_flirt

filtering_bold_and_regressors_Regressor-1_165/_scan_movieDM/frequency_filter/_report/report.rst

Runtime info

  • cpu_percent : 163.4
  • duration : 65.330561
  • hostname : 2119fmn023
  • mem_peak_gb : 5.737709044921875
  • prev_wd : /cbica/comp_space/RBC/job-9798091-sub-NDARZP630WYL/ds
  • working_dir : /outputs/working/cpac_sub-NDARZP630WYL_ses-HBNsiteRU/filtering_bold_and_regressors_Regressor-1_165/_scan_movieDM/frequency_filter

ANTS_T1_to_template_50/cpac_sub-NDARZP630WYL_ses-HBNsiteRU/anat_mni_ants_register/calc_ants_warp/_report/report.rst

Runtime info

  • cpu_percent : 238.4
  • duration : 9019.92792
  • hostname : 2119fmn023
  • mem_peak_gb : 5.181995391601562
  • prev_wd : /cbica/comp_space/RBC/job-9798091-sub-NDARZP630WYL/ds
  • working_dir : /outputs/working/cpac_sub-NDARZP630WYL_ses-HBNsiteRU/ANTS_T1_to_template_50/cpac_sub-NDARZP630WYL_ses-HBNsiteRU/anat_mni_ants_register/calc_ants_warp
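
A ranking like the one above can be produced by scanning every report.rst under the working directory for its `mem_peak_gb` line. This is a rough sketch, assuming each report contains a line of the form `mem_peak_gb : <number>` as in the excerpts; the helper names `peak_memory` and `rank_reports` are hypothetical.

```python
import re
from pathlib import Path

# Matches e.g. "mem_peak_gb : 7.532283783203125" regardless of the bullet style.
MEM_PEAK = re.compile(r"mem_peak_gb\s*:\s*([0-9]*\.?[0-9]+)")


def peak_memory(report_text):
    """Return the reported peak memory in GB, or None if the report has none."""
    match = MEM_PEAK.search(report_text)
    return float(match.group(1)) if match else None


def rank_reports(working_dir, top=3):
    """Return the `top` (peak GB, path) pairs, highest peak first."""
    peaks = []
    for report in Path(working_dir).rglob("report.rst"):
        peak = peak_memory(report.read_text(errors="replace"))
        if peak is not None:
            peaks.append((peak, str(report)))
    return sorted(peaks, reverse=True)[:top]


# Example on a line copied from the first report above.
flirt_peak = peak_memory("  • mem_peak_gb : 7.532283783203125")
print(flirt_peak)
```

Calling `rank_reports("/outputs/working")` would then list the three nodes with the largest reported peaks, assuming the reports are still on disk.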

shnizzedy commented 2 years ago

hangs at

210920-22:51:42,238 nipype.workflow INFO:
         [MultiProc] Running 0 tasks, and 6 jobs ready. Free memory (GB): 51.21/52.00, Free processors: 4/4.

when running multicore with FCP-INDI/C-PAC@07c9407

full pypeline.log and callback.log and output tree

I'm guessing the 0.79 GB in use (52.00 GB total minus 51.21 GB free) is the main process overhead, and that each of the 6 pending jobs is estimated to need more than the 51.21 GB that is free?
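
To make that guess concrete, here is a simplified model of resource-gated scheduling. It is not nipype's actual implementation; it only assumes that a ready job starts when its memory estimate and thread count both fit in what is free. The job names and estimates in `hypothetical_jobs` are invented for illustration, since the real estimates for the 6 pending jobs are not in the log excerpt.

```python
import re

STATUS = re.compile(
    r"Running (\d+) tasks, and (\d+) jobs ready\. "
    r"Free memory \(GB\): ([\d.]+)/([\d.]+), Free processors: (\d+)/(\d+)"
)


def parse_status(line):
    """Extract the counters from a MultiProc status line."""
    running, ready, free_gb, total_gb, free_cpu, total_cpu = STATUS.search(line).groups()
    return {
        "running": int(running),
        "ready": int(ready),
        "free_gb": float(free_gb),
        "total_gb": float(total_gb),
        "free_cpu": int(free_cpu),
        "total_cpu": int(total_cpu),
    }


def runnable(jobs, free_gb, free_cpu):
    """Names of jobs whose (estimated GB, threads) fit in the free resources."""
    return [
        name
        for name, (estimated_gb, threads) in jobs.items()
        if estimated_gb <= free_gb and threads <= free_cpu
    ]


status = parse_status(
    "[MultiProc] Running 0 tasks, and 6 jobs ready. "
    "Free memory (GB): 51.21/52.00, Free processors: 4/4."
)
in_use_gb = round(status["total_gb"] - status["free_gb"], 2)

# Hypothetical estimates: if every ready job looked like "too_big", none could start.
hypothetical_jobs = {"too_big": (60.0, 1), "fits": (1.66, 1)}
startable = runnable(hypothetical_jobs, status["free_gb"], status["free_cpu"])
print(in_use_gb, startable)
```

Under this model, a state with 0 running tasks, free processors, and jobs still ready is what you would see if every ready job's estimate exceeded the free memory, which is consistent with the guess but does not confirm it.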