Thanks for sharing your code @loj !
Let's ask @AlexandreHutton if he would share his code for reorganizing these downloads into a BIDS compliant form. That would make an initial set of helper tools complete. Thanks in advance!
Working on it. The scripts were written for an intermediate format, which I hadn't realized; I'm fixing them to work directly on the downloaded format. They should be up by Monday.
@AlexandreHutton Wonderful, thanks much!
PR #9 created. It adds the scripts as a submodule; I had some difficulties adding the files directly, and this seemed like the easiest solution. Alternatives are fine with me. Note that some of the contents come from another package under Apache 2.0, so it may be worth keeping those files separate regardless.
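(For anyone checking out that branch: submodules are not fetched by default, so after cloning you'd need something like the following. This is just standard git usage, not a step the PR itself enforces.)

```
# fetch the submodule contents after checking out the PR branch
git submodule update --init
```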
I think we can close this now. Thx much!
I tested out the `ukb_create_participant_ds` and `ukb_update_participant_ds` scripts created by @mih, using condor to download a 1000-subject subset. To start, I created a CSV file with a list of the subjects and modalities that I wanted:
```
0001234,20227_2_0,20249_2_0,20252_2_0
0001235,20227_2_0,20249_2_0,20252_2_0
0001236,20227_2_0,20249_2_0,20252_2_0
0001237,20227_2_0,20249_2_0,20252_2_0
0001238,20227_2_0,20249_2_0,20252_2_0
0001239,20227_2_0,20249_2_0,20252_2_0
...
```
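(For reproducibility: a file like that can be generated from a plain list of subject IDs with something along these lines. The `subject_ids.txt` input and the fixed three-modality list are assumptions on my part, not part of the original setup.)

```
#!/bin/sh
# Sketch: build the subset CSV from a plain list of subject IDs,
# assuming a hypothetical subject_ids.txt with one ID per line and
# the same three modalities for every subject.
while read -r subject_id; do
    printf '%s,20227_2_0,20249_2_0,20252_2_0\n' "${subject_id}"
done < subject_ids.txt > subset_rfrmi_tfrmi_t1.csv
```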
Then, I used the following to call the scripts and submit jobs to condor:
To create the single-participant datasets:
```
./ukb_create_submit_gen.sh | condor_submit
```
`ukb_create_submit_gen.sh`:
```
#!/bin/sh
logs_dir=~/logs/ukb/create

# create the logs dir if it doesn't exist
[ ! -d "$logs_dir" ] && mkdir -p "$logs_dir"

# print the .submit header
printf "# The environment
universe = vanilla
getenv = True
request_cpus = 1
request_memory = 1G

# Execution
initial_dir = /data/project/rehab_biobank/1000_subset/
executable = /data/project/rehab_biobank/1000_subset/ukb_create_participant_ds
\n"

# create a job for each subject
for line in $(cat subset_rfrmi_tfrmi_t1.csv); do
    subject_id=${line%%,*} && line=${line#${subject_id},}
    modalities=$(echo ${line} | sed 's/,/ /g')
    printf "arguments = ${subject_id} ${subject_id} ${modalities}\n"
    printf "log = ${logs_dir}/sub-${subject_id}_\$(Cluster).\$(Process).log\n"
    printf "output = ${logs_dir}/sub-${subject_id}_\$(Cluster).\$(Process).out\n"
    printf "error = ${logs_dir}/sub-${subject_id}_\$(Cluster).\$(Process).err\n"
    printf "Queue\n\n"
done
```
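For one subject, the submit description that gets piped into `condor_submit` ends up looking roughly like this (reconstructed by hand from the script above, with `~` standing for the expanded home directory, so treat it as illustrative):

```
# The environment
universe = vanilla
getenv = True
request_cpus = 1
request_memory = 1G

# Execution
initial_dir = /data/project/rehab_biobank/1000_subset/
executable = /data/project/rehab_biobank/1000_subset/ukb_create_participant_ds

arguments = 0001234 0001234 20227_2_0 20249_2_0 20252_2_0
log = ~/logs/ukb/create/sub-0001234_$(Cluster).$(Process).log
output = ~/logs/ukb/create/sub-0001234_$(Cluster).$(Process).out
error = ~/logs/ukb/create/sub-0001234_$(Cluster).$(Process).err
Queue
```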
To download the data:
```
./ukb_update_submit_gen.sh | condor_submit
```
`ukb_update_submit_gen.sh`:
```
#!/bin/sh
logs_dir=~/logs/ukb/update

# create the logs dir if it doesn't exist
[ ! -d "$logs_dir" ] && mkdir -p "$logs_dir"

# print the .submit header
printf "# The environment
universe = vanilla
getenv = True
request_cpus = 1
request_memory = 1G

# Execution
initial_dir = /data/project/rehab_biobank/1000_subset/
executable = /data/project/rehab_biobank/1000_subset/ukb_update_participant_ds
\n"

# create a job for each subject
for line in $(cat subset_rfrmi_tfrmi_t1.csv); do
    subject_id=${line%%,*} && line=${line#${subject_id},}
    printf "arguments = ${subject_id} ../.ukbkey\n"
    printf "log = ${logs_dir}/sub-${subject_id}_\$(Cluster).\$(Process).log\n"
    printf "output = ${logs_dir}/sub-${subject_id}_\$(Cluster).\$(Process).out\n"
    printf "error = ${logs_dir}/sub-${subject_id}_\$(Cluster).\$(Process).err\n"
    printf "Queue\n\n"
done
```
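Once the queue drains, a quick way to flag subjects whose download may have failed is to look for non-empty error logs. This is just a sanity check I'd suggest, not part of the scripts above:

```
#!/bin/sh
# Sketch: list subjects with non-empty condor error logs after the run,
# assuming the log layout produced by ukb_update_submit_gen.sh above.
for err in ~/logs/ukb/update/sub-*.err; do
    # -s is true when the file exists and is non-empty
    [ -s "$err" ] && echo "possible failure: $err"
done
```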