hsieh42 / exit_process

Exit process and project boards for tracking issues.

Tomo texture robustness and realism code sharing (analyses too) #14

Closed hsieh42 closed 5 years ago

hsieh42 commented 5 years ago

Organize the texture submission code and Python notebooks so that @eac-upenn can pick up the statistics part and @gastouna can pick up the texture pipeline submission using SGE.

hsieh42 commented 5 years ago

Imaging data

Location: /cbica/home/hsiehm/Projects/Breasts/Data/tomo_texture_*. Each directory name ends with the data acquisition date, which should correspond to when Ray acquired the data and to how Ray originally named them. The imaging data were organized primarily by Ray; I only renamed the directories to remove spaces (Linux handles spaces in paths poorly), so the imaging data should be almost identical to Ray's original source. However, the image files all have somewhat ambiguous names (e.g., 75_mAs_Trial_1.dcm carries no kVp information), so I created softlinks to the DICOM files to make them easy to distinguish. For example,

[hsiehm@rhodes Grid_IN]$ pwd
/cbica/home/hsiehm/Projects/Breasts/Data/tomo_texture_20171022/renamed/Processed/Grid_IN
[hsiehm@rhodes Grid_IN]$ ls -lhrt|head
total 0
lrwxrwxrwx. 1 hsiehm sbiauser 109 Nov 27  2017 29_kVp_75_mAs_1.dcm -> /cbica/home/hsiehm/Projects/Breasts/Data/tomo_texture_20171022/29_kVp/MG/Processed/Grid_IN/75_mAs_Trial_1.dcm
lrwxrwxrwx. 1 hsiehm sbiauser 109 Nov 27  2017 29_kVp_75_mAs_2.dcm -> /cbica/home/hsiehm/Projects/Breasts/Data/tomo_texture_20171022/29_kVp/MG/Processed/Grid_IN/75_mAs_Trial_2.dcm
lrwxrwxrwx. 1 hsiehm sbiauser 109 Nov 27  2017 30_kVp_65_mAs_1.dcm -> /cbica/home/hsiehm/Projects/Breasts/Data/tomo_texture_20171022/30_kVp/MG/Processed/Grid_IN/65_mAs_Trial_1.dcm
lrwxrwxrwx. 1 hsiehm sbiauser 109 Nov 27  2017 30_kVp_65_mAs_2.dcm -> /cbica/home/hsiehm/Projects/Breasts/Data/tomo_texture_20171022/30_kVp/MG/Processed/Grid_IN/65_mAs_Trial_2.dcm

An example command to create the softlinks is in /cbica/home/hsiehm/Projects/Breasts/Data/tomo_texture_20180328/Readme.txt:

# Run from inside tomo_texture_${date}.
for dcm in $(find "${PWD}" -type f -name "*dcm"); do
  # The field indices (-f9, -f10, -f11) depend on the full path of the files; the idea is
  # to extract kVp, mAs, and modality information from the given directory structure.
  kv=$(echo "$dcm" | cut -d/ -f9)
  mod=$(echo "$dcm" | cut -d/ -f10)
  mas=$(echo "$dcm" | cut -d/ -f11)
  echo "$mod, $kv, $mas"
  mkdir -pv "renamed/${mod}"
  ln -sv "$dcm" "renamed/${mod}/${kv}_${mas}.dcm"
done
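The renaming shown in the tomo_texture_20171022 listing above (29_kVp/.../75_mAs_Trial_1.dcm becoming 29_kVp_75_mAs_1.dcm) can also be expressed without fixed field indices. The following is only a sketch, assuming the kVp directory is the first component of the path relative to the dataset root; `renamed_name` is a hypothetical helper and is not part of the Readme.

```shell
# Hypothetical helper (not from Readme.txt): build the softlink name from a
# path relative to the dataset root, e.g.
#   29_kVp/MG/Processed/Grid_IN/75_mAs_Trial_1.dcm -> 29_kVp_75_mAs_1.dcm
# Assumes the first path component is the kVp directory.
renamed_name() {
  local rel="$1"
  local kv="${rel%%/*}"      # first path component, e.g. 29_kVp
  local base="${rel##*/}"    # file name, e.g. 75_mAs_Trial_1.dcm
  # Drop the "Trial_" part of the file name and prepend the kVp label.
  printf '%s_%s\n' "$kv" "$(printf '%s' "$base" | sed 's/_Trial_/_/')"
}
```

Because the name is derived from the relative path, the helper does not need to change when the dataset is moved to a different absolute location.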

The directory structure might not be 100% consistent because the data arrived at different times and the multi-modal acquisition is complex. However, you can always find a "renamed" directory in each tomo_texture_${date} directory that contains the final files for processing. I also always create a list of files for each modality/acquisition/experiment in /cbica/home/hsiehm/Projects/Breasts/Lists/tomo_texture. You will probably find the list names easier to understand:

[hsiehm@rhodes tomo_texture]$ pwd
/cbica/home/hsiehm/Projects/Breasts/Lists/tomo_texture
[hsiehm@rhodes tomo_texture]$ ls
CVIEW_Batches@                         input_MG_Processed_img_20180728.list
DBT_Case_Control_acc_id.txt            input_MG_Raw_gridIN.list@
DBT_Case_Control_id.list*              input_MG_Raw_gridIN_silver.list
DBT_Case_Control_img.list              input_MG_Raw_img_20180728.list
DBT_Control_acc_id.txt                 input_Processed_Projections_img_20170914.list
DBT_Controls_id.list                   input_Processed_Projections_img_20180404.list
input_cview_img_20170731.list          input_Raw_Projections_img_20170914.list
input_C-View_img_20180404.list         input_Raw_Projections_img_20180404.list
input_img_20170725.list                input_Reconstruction_img_20170725.list@
input_MG_Processed_gridIN.list@        input_Reconstruction_img_20180404.list
[hsiehm@rhodes tomo_texture]$ head input_MG_Processed_gridIN_silver.list
/cbica/home/hsiehm/Projects/Breasts/Data/tomo_texture_20171022/renamed/Processed/Grid_IN/25_kVp_180_mAs_1.dcm
/cbica/home/hsiehm/Projects/Breasts/Data/tomo_texture_20171022/renamed/Processed/Grid_IN/25_kVp_180_mAs_2.dcm
/cbica/home/hsiehm/Projects/Breasts/Data/tomo_texture_20171022/renamed/Processed/Grid_IN/26_kVp_120_mAs_1.dcm
/cbica/home/hsiehm/Projects/Breasts/Data/tomo_texture_20171022/renamed/Processed/Grid_IN/26_kVp_120_mAs_2.dcm
/cbica/home/hsiehm/Projects/Breasts/Data/tomo_texture_20171022/renamed/Processed/Grid_IN/27_kVp_100_mAs_1.dcm
/cbica/home/hsiehm/Projects/Breasts/Data/tomo_texture_20171022/renamed/Processed/Grid_IN/27_kVp_100_mAs_2.dcm
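A list like the one above could be regenerated from a renamed directory with something like the sketch below. The exact command originally used to create the lists is not recorded here, so `make_img_list` is a hypothetical helper and the output list name in the usage comment is illustrative.

```shell
# Hypothetical helper (an assumption, not the original command): write the
# full paths of all DICOM files under a directory to a list file, one per line.
make_img_list() {
  local src_dir="$1" out_list="$2"
  # The renamed files are softlinks, so match links (-type l) as well as
  # regular files (-type f). Resolving src_dir with pwd yields full paths.
  find "$(cd "$src_dir" && pwd)" \( -type l -o -type f \) -name '*.dcm' \
    | sort > "$out_list"
}

# Usage (the list name my_input.list is illustrative):
# make_img_list /cbica/home/hsiehm/Projects/Breasts/Data/tomo_texture_20171022/renamed/Processed/Grid_IN my_input.list
```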

The data and the file lists are then ready for texture processing with the lattice texture pipeline.

Lattice Texture Pipeline

The scripts that submit the lattice texture pipeline to the cbica-cluster using SGE are in /cbica/home/hsiehm/Projects/Breasts/Scripts/tomo_texture:

[hsiehm@rhodes tomo_texture]$ ls -1rt  ## sorted by modification time, oldest script on top
sendTexturePipeline.sh  # First submit script; runs the original 29 features for HAM phantoms.
sendTexturePipeline_Rachel.sh  # Runs a modified version of computeTextureFeature that works with the Rachel phantom.
sendTexturePipeline_Rachel_withMask.sh  # The code modified for Rachel failed for some processed images; this revision addresses the issue.
sendTexturePipeline_mammo_Rachel_new_features.sh  # Submits the new 344 features for Rachel mammograms.
sendTexturePipeline_tomo_new_features.sh  # Submits the new features for acquisitions of the HAM phantom.
TextureAnalysis_builds/  # Special MATLAB (computeTextureFeature) builds.
TextureAnalysis-0.1r168-SVN/  # A special branch of TextureAnalysis-0.1 at revision 168 that addresses the unit conversion and the phantom left-right flip.

The locations of the executables (both computeTextureFeature and the itk features) are hard-coded inside the scripts for reproducibility.

Job Submission

The script prints its usage in a help message when executed without arguments:

[hsiehm@rhodes tomo_texture]$ ~/Projects/Breasts/Scripts/tomo_texture/sendTexturePipeline_tomo_new_features.sh

Input your image file list that contains full paths to images with a desired
output directory. The script will submit texture pipeline for each image to
CBICA cluster.

Recommendation: submit this script to the cluster and let the cluster handle all
the work.

        qsub -t - -o odir sendTexturePipeline_tomo_new_features.sh idList odir winsz slidingdist numbin offset radius_edge radius_lbp

========================================================================================
 Usage: sendTexturePipeline_tomo_new_features.sh

 Options:   fileList=$1
            odir=$2
            winsz=$3
            slidingdist=$4

 Example:

 sendTexturePipeline_tomo_new_features.sh /path/to/img.list /path/to/odir/ 6.3 6.3
========================================================================================

For example, to submit the texture pipeline with new features for tomo reconstructed HAM phantom images:

outdir=/cbica/home/hsiehm/Projects/Breasts/Pipelines/tomo_texture/20180404_Reconstruction_new_features/w6.3d6.3/
list=/cbica/home/hsiehm/Projects/Breasts/Lists/tomo_texture/input_Reconstruction_img_20180404.list
t=$(wc -l < "$list")  # number of images in the list = number of array tasks
mkdir -pv "${outdir}/logs"
qsub -o "${outdir}/logs" -t "1-$t" \
  ~/Projects/Breasts/Scripts/tomo_texture/sendTexturePipeline_tomo_new_features.sh \
  "$list" "$outdir" 6.3 6.3
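With `-t 1-$t`, SGE starts one array task per line of the list and exposes the task number to each task in the `SGE_TASK_ID` environment variable. How the submit script actually maps a task to an image is not shown here; the sketch below is one common way to do it, and `task_image` is a hypothetical helper rather than code taken from sendTexturePipeline_tomo_new_features.sh.

```shell
# Hypothetical helper (an assumption about how an array task could pick its
# image; the real submit script may do this differently).
# Prints line number task_id of the file list. The task id defaults to
# SGE_TASK_ID, which SGE sets for each task of an array job.
task_image() {
  local list="$1"
  local task_id="${2:-${SGE_TASK_ID:-1}}"
  sed -n "${task_id}p" "$list"
}
```

Calling `task_image "$list" 3` by hand prints the third image path, which can help when rerunning a single failed task outside the cluster.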

The submission generates the following in /cbica/home/hsiehm/Projects/Breasts/Pipelines/tomo_texture/20180404_Reconstruction_new_features/w6.3d6.3/:

[hsiehm@rhodes w6.3d6.3]$ ls
27_kVp_113_mAs_1/  28_kVp_60_mAs_1/   30_kVp_36_mAs_1/   31_kVp_9_mAs_1/    33_kVp_31_mAs_1/
27_kVp_113_mAs_2/  28_kVp_60_mAs_2/   30_kVp_36_mAs_2/   31_kVp_9_mAs_2/    33_kVp_31_mAs_2/
27_kVp_18_mAs_1/   28_kVp_84_mAs_1/   30_kVp_51_mAs_1/   32_kVp_138_mAs_1/  33_kVp_45_mAs_1/
27_kVp_18_mAs_2/   28_kVp_84_mAs_2/   30_kVp_51_mAs_2/   32_kVp_138_mAs_2/  33_kVp_45_mAs_2/
27_kVp_30_mAs_1/   29_kVp_120_mAs_1/  30_kVp_75_mAs_1/   32_kVp_18_mAs_1/   33_kVp_61_mAs_1/
27_kVp_30_mAs_2/   29_kVp_120_mAs_2/  30_kVp_75_mAs_2/   32_kVp_18_mAs_2/   33_kVp_61_mAs_2/
27_kVp_39_mAs_1/   29_kVp_15_mAs_1/   30_kVp_99_mAs_1/   32_kVp_24_mAs_1/   33_kVp_90_mAs_1/
27_kVp_39_mAs_2/   29_kVp_15_mAs_2/   30_kVp_99_mAs_2/   32_kVp_24_mAs_2/   33_kVp_90_mAs_2/
27_kVp_54_mAs_1/   29_kVp_24_mAs_1/   31_kVp_120_mAs_1/  32_kVp_33_mAs_1/   34_kVp_100_mAs_1/
27_kVp_54_mAs_2/   29_kVp_24_mAs_2/   31_kVp_120_mAs_2/  32_kVp_33_mAs_2/   34_kVp_100_mAs_2/
27_kVp_78_mAs_1/   29_kVp_30_mAs_1/   31_kVp_15_mAs_1/   32_kVp_48_mAs_1/   34_kVp_138_mAs_1/
27_kVp_78_mAs_2/   29_kVp_30_mAs_2/   31_kVp_15_mAs_2/   32_kVp_48_mAs_2/   34_kVp_138_mAs_2/
28_kVp_120_mAs_1/  29_kVp_45_mAs_1/   31_kVp_21_mAs_1/   32_kVp_70_mAs_1/   34_kVp_24_mAs_1/
28_kVp_120_mAs_2/  29_kVp_45_mAs_2/   31_kVp_21_mAs_2/   32_kVp_70_mAs_2/   34_kVp_24_mAs_2/
28_kVp_15_mAs_1/   29_kVp_66_mAs_1/   31_kVp_30_mAs_1/   32_kVp_99_mAs_1/   34_kVp_36_mAs_1/
28_kVp_15_mAs_2/   29_kVp_66_mAs_2/   31_kVp_30_mAs_2/   32_kVp_99_mAs_2/   34_kVp_36_mAs_2/
28_kVp_21_mAs_1/   29_kVp_93_mAs_1/   31_kVp_42_mAs_1/   33_kVp_121_mAs_1/  34_kVp_54_mAs_1/
28_kVp_21_mAs_2/   29_kVp_93_mAs_2/   31_kVp_42_mAs_2/   33_kVp_121_mAs_2/  34_kVp_54_mAs_2/
28_kVp_30_mAs_1/   30_kVp_18_mAs_1/   31_kVp_60_mAs_1/   33_kVp_15_mAs_1/   34_kVp_75_mAs_1/
28_kVp_30_mAs_2/   30_kVp_18_mAs_2/   31_kVp_60_mAs_2/   33_kVp_15_mAs_2/   34_kVp_75_mAs_2/
28_kVp_42_mAs_1/   30_kVp_24_mAs_1/   31_kVp_84_mAs_1/   33_kVp_21_mAs_1/   logs/
28_kVp_42_mAs_2/   30_kVp_24_mAs_2/   31_kVp_84_mAs_2/   33_kVp_21_mAs_2/   sheet/

You can find the output CSV files for the feature mean, median, and stdev in sheet/. Feature images and masks are in each subdirectory. logs/ contains the processing log files saved by SGE, which are for debugging purposes.
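To confirm that every image in the list produced an output directory, a check along the following lines may help. It is a sketch that assumes each output directory is named after the image file without its .dcm extension, as the listing above suggests; `missing_outputs` is a hypothetical helper and not part of the pipeline scripts.

```shell
# Hypothetical check (not part of the pipeline scripts): print every image in
# the file list that has no matching output directory.
# Assumes the output directory name is the image file name without ".dcm".
missing_outputs() {
  local list="$1" outdir="$2" img name
  while IFS= read -r img; do
    [ -n "$img" ] || continue            # skip blank lines
    name=$(basename "$img" .dcm)
    [ -d "$outdir/$name" ] || printf '%s\n' "$img"
  done < "$list"
}
```

Any paths it prints can then be inspected against the corresponding SGE log files in logs/.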

hsieh42 commented 5 years ago

Analyses

Most statistical analyses and visualizations were performed in IPython on my personal desktop. Files and figures are in the directory C:\Users\hsiehm\Documents\SBIA\Breasts\tomo_texture_phantom, with a mirror copy on the H drive at H:\SBIA\Breasts\tomo_texture_phantom. The two most recent, selected analysis notebooks and presentation slides are also saved on the cbica-cluster in /cbica/home/hsiehm/Projects/Breasts/Pipelines/tomo_texture/Analyses, along with all the pipeline results for the phantom (/cbica/home/hsiehm/Projects/Breasts/Pipelines/tomo_texture/) and the results and DICOM header files for the human study (/cbica/home/hsiehm/Projects/Breasts/Pipelines/tomo_texture/Human_Studies).

The two selected notebooks and corresponding slides are: