hsieh42 / exit_process

Exit process and project boards for tracking issues.

Legacy anonymization #41

Open hsieh42 opened 5 years ago

hsieh42 commented 5 years ago

This post demonstrates how to run the old/legacy anonymization procedure. The process will likely break because the executable was built with MATLAB on CentOS 6, while cbica-cluster now runs only CentOS 7. Even if the executable still works, it is inefficient and will take a long time for a large number of studies or DICOM files.

Suppose you have a batch of data in $input_dir that you want to anonymize and make available only to $allowed_user in a designated directory $output_dir. You could do the following:

input_dir="/cbica/comp_space/CBIG/Data_phiSensitive/HUP_cohort_Dx/Yifan/"
output_dir="/cbica/projects/CBIG/Anonymized_to_Distribute/Prospr/HUP_cohort_Dx"
script="/cbica/comp_space/CBIG/Data_phiSensitive/Scripts/anonymizationExe_R44.sh"
allowed_user="prospr"

mkdir -pv "$output_dir"

qsub -o "$output_dir" -l h_vmem=8G "$script" "$input_dir" "$output_dir" "$allowed_user"
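Since the job may sit in the queue for a while before failing, it can help to check the paths before submitting. The following is a minimal sketch, not part of the original procedure: `preflight` is a hypothetical helper name, and it only checks that the input directory exists and that the script is readable.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check (an assumption, not part of the legacy scripts).
# Returns 0 when the submission inputs look sane, 1 otherwise.
preflight() {
    local input_dir="$1" script="$2"
    if [ ! -d "$input_dir" ]; then
        echo "input directory not found: $input_dir" >&2
        return 1
    fi
    if [ ! -r "$script" ]; then
        echo "script not readable: $script" >&2
        return 1
    fi
    return 0
}
```

One possible use is to gate the submission on it, i.e. run `preflight "$input_dir" "$script"` first and call qsub only when it succeeds.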

In anonymizationExe_R44.sh, the DICOM studies are anonymized serially, one sub-directory at a time. After the anonymization, another script, /cbica/comp_space/CBIG/Scripts/setPermission.sh, is called to grant read permission on $output_dir to a specific user or project user. setPermission.sh has to be executed on a compute node through SGE submission; it will not take effect if executed on an interactive node such as crete or agora. See Mark's notes on how to use the Anonymized_to_Distribute directory and on where to execute setPermission.sh for more information.
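To make the serial behaviour concrete, here is a rough sketch of what a loop over sub-directories looks like. It is not the content of anonymizationExe_R44.sh: `anonymize_study` and `anonymize_all` are hypothetical names, and the stand-in function only creates the output directory and prints a message instead of touching any DICOM files. The permission step is left out.

```shell
#!/usr/bin/env bash
# Illustrative only: this is NOT the content of anonymizationExe_R44.sh.
# anonymize_study is a hypothetical stand-in for the real executable.
anonymize_study() {
    local study_in="$1" study_out="$2"
    mkdir -p "$study_out"
    echo "anonymized $(basename "$study_in")"
}

# Process every immediate sub-directory of input_dir, one after another.
anonymize_all() {
    local input_dir="$1" output_dir="$2" study
    for study in "$input_dir"/*/; do
        [ -d "$study" ] || continue   # skip when the glob matches nothing
        anonymize_study "$study" "$output_dir/$(basename "$study")"
    done
}
```

Because each study waits for the previous one to finish, the run time grows with the number of studies, which is why a large batch can take a long time.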