sgiavasis opened this issue 4 years ago
Hey @sgiavasis and @shnizzedy, I've been experiencing this:
[MultiProc] Running 3 tasks, and 4 jobs ready. Free memory (GB): 3.00/16.00, Free processors: 5/8.
Currently running:
* cpac_sub-2842950_ses-1.gen_motion_stats_106.cal_DVARS
* cpac_sub-2842950_ses-1.ANTS_T1_to_template_symmetric_64.anat_mni_ants_register_symmetric.calc_ants_warp
* cpac_sub-2842950_ses-1.gen_motion_stats_106.cal_DVARS
I believe the function is dying from running out of memory, and it even seems to be running twice? Not sure if this is a Nipype misbehavior.
Do you have any clues about it?
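For what it's worth, here is a rough back-of-the-envelope sketch (not C-PAC's actual code) for estimating how much memory that DVARS node could need, assuming the implementation loads the full 4D series as float64 and allocates a temporal-difference array of nearly the same size on top of it. The file path is a hypothetical local copy of the functional image named in the log below.

# Rough memory estimate for the DVARS node, under the assumption above
# (full float64 4D series plus a near-full-size np.diff intermediate).
import nibabel as nib
import numpy as np

# hypothetical local copy of the functional image from the log below
func_path = "sub-2842950_ses-1_task-breathhold_acq-tr1400ms_run-1_bold.nii.gz"
img = nib.load(func_path)

n_vox = int(np.prod(img.shape[:3]))
n_vols = img.shape[3]
gb = 1024 ** 3
bytes_per_value = 8  # float64

series_gb = n_vox * n_vols * bytes_per_value / gb
diff_gb = n_vox * (n_vols - 1) * bytes_per_value / gb
print(f"4D series as float64:  {series_gb:.2f} GB")
print(f"np.diff intermediate:  {diff_gb:.2f} GB")
print(f"rough peak, this node: {series_gb + diff_gb:.2f} GB")

If that rough peak is close to the 3 GB of free memory MultiProc reports above while three jobs are running, memory pressure alone could explain the node dying.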
This is my command line:
docker run -it -v `pwd`/output:/output fcpindi/c-pac s3://fcp-indi/data/Projects/CORR/RawDataBIDS/NKI_TRT /output participant --participant_label sub-2842950 --save_working_dir /output --n_cpus 8 --mem_gb 16 --pipeline_override 'pipeline_setup:
  output_directory:
    generate_quality_control_images: false'
Skipping bids-validator for S3 datasets...
#### Running C-PAC for sub-2842950
Number of participants to run in parallel: 1
Input directory: s3://fcp-indi/data/Projects/CORR/RawDataBIDS/NKI_TRT
Output directory: /output/output
Working directory: /output
Log directory: /output/log
Remove working directory: False
Available memory: 16.0 (GB)
Available threads: 8
Number of threads for ANTs: 1
Parsing s3://fcp-indi/data/Projects/CORR/RawDataBIDS/NKI_TRT..
Connecting to AWS: fcp-indi anonymously...
gathering files from S3 bucket (s3.Bucket(name='fcp-indi')) for data/Projects/CORR/RawDataBIDS/NKI_TRT
Did not receive any parameters for sub-2842950/ses-1/func/sub-2842950_ses-1_task-breathhold_acq-tr1400ms_run-1_bold.nii.gz, is this a problem?
sub-2842950 ses-2 is missing an anat
Starting participant level processing
Run called with config file /output/cpac_pipeline_config_2021-03-31T17-49-17Z.yml
210331-17:49:20,23 nipype.workflow INFO:
C-PAC version: 1.8.0
Setting maximum number of cores per participant to 8
Setting number of participants at once to 1
Setting OMP_NUM_THREADS to 1
Setting MKL_NUM_THREADS to 1
Setting ANTS/ITK thread usage to 1
Maximum potential number of cores that might be used during this run: 8
I wonder if the problem is a NumPy compilation issue, or even a bug on their side, given the segfault (a standalone reproduction sketch follows the traceback):
Fatal Python error: Segmentation fault
Thread 0x00007ff107f8d700 (most recent call first):
File "/usr/local/miniconda/lib/python3.7/threading.py", line 300 in wait
File "/usr/local/miniconda/lib/python3.7/threading.py", line 552 in wait
File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/utils/profiler.py", line 107 in run
File "/usr/local/miniconda/lib/python3.7/threading.py", line 926 in _bootstrap_inner
File "/usr/local/miniconda/lib/python3.7/threading.py", line 890 in _bootstrap
Current thread 0x00007ff1228a3740 (most recent call first):
File "/usr/local/miniconda/lib/python3.7/site-packages/numpy/lib/function_base.py", line 1273 in diff
File "/code/CPAC/generate_motion_statistics/generate_motion_statistics.py", line 562 in calculate_DVARS
File "/code/CPAC/utils/interfaces/function.py", line 152 in _run_interface
File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/interfaces/base/core.py", line 419 in run
File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/nodes.py", line 741 in _run_command
File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/nodes.py", line 635 in _run_interface
File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/nodes.py", line 516 in run
File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/plugins/multiproc.py", line 67 in run_node
File "/usr/local/miniconda/lib/python3.7/concurrent/futures/process.py", line 239 in _process_worker
File "/usr/local/miniconda/lib/python3.7/multiprocessing/process.py", line 99 in run
@sgiavasis have you seen this lately?
Not sure if anyone else has seen this or can reproduce it, but in some of my pipeline runs, specifically on AWS via the Docker container (both latest and nightly), the run hangs indefinitely at DVARS:
This is reliably happening for me with these two pipelines:
It does not happen in this one: