This code and analysis pipeline was designed and developed by Deena M.A. Gendoo for the following publication:
Whole Genomes Define Concordance of Matched Primary, Xenograft, and Organoid Models of Pancreas Cancer
Please cite: Gendoo DMA et al. Whole genomes define concordance of matched primary, xenograft, and organoid models of pancreas cancer. PLoS Comput Biol. 2019 Jan 10;15(1):e1006596. doi: 10.1371/journal.pcbi.1006596. PMID: 30629588; PMCID: PMC6328084.
URL to manuscript: https://pubmed.ncbi.nlm.nih.gov/30629588/
Questions or Comments: Please email deena.gendoo1984@gmail.com or d.gendoo@bham.ac.uk
This repository hosts code to analyze whole-genome sequencing (WGS) data for the following data types:
The analysis is conducted on Pancreatic Ductal Adenocarcinoma (PDAC), across three cohorts:
NB: The trios are a subset of the 10 primary-PDX pairs.
We describe how to reproduce the statistical analysis as reported in the manuscript. To do this, please proceed to:
We developed and tested our analysis pipeline using R running on Mac OS X platforms.
To mimic our software environment the following R packages should be installed. All these packages are available on CRAN or Bioconductor.
R version 3.3.1 (2016-06-21)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.10.3 (Yosemite)
locale:
[1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8
attached base packages:
[1] grid stats4 parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] GenVisR_1.0.4 plotrix_3.6-4 vcfR_1.4.0
[4] scales_0.4.1 VennDiagram_1.6.17 futile.logger_1.4.3
[7] VariantAnnotation_1.18.7 Rsamtools_1.24.0 Biostrings_2.40.2
[10] XVector_0.12.1 SummarizedExperiment_1.2.3 Biobase_2.32.0
[13] GenomicRanges_1.24.3 GenomeInfoDb_1.8.7 IRanges_2.6.1
[16] S4Vectors_0.10.3 copynumber_1.12.0 BiocGenerics_0.18.0
[19] gplots_3.0.1 RCircos_1.2.0 pheatmap_1.0.8
[22] reshape2_1.4.2
loaded via a namespace (and not attached):
[1] Rcpp_0.12.9 ape_4.0 lattice_0.20-34 gtools_3.5.0
[5] rprojroot_1.2 assertthat_0.1 digest_0.6.11 plyr_1.8.4
[9] backports_1.0.5 futile.options_1.0.0 evaluate_0.10 RSQLite_1.1-2
[13] ggplot2_2.2.1 zlibbioc_1.18.0 GenomicFeatures_1.24.5 lazyeval_0.2.0
[17] gdata_2.17.0 vegan_2.4-2 Matrix_1.2-7.1 rmarkdown_1.6
[21] pinfsc50_1.1.0 BiocParallel_1.6.6 stringr_1.2.0 RCurl_1.95-4.8
[25] biomaRt_2.28.0 munsell_0.4.3 rtracklayer_1.32.2 mgcv_1.8-16
[29] htmltools_0.3.5 gridExtra_2.2.1 tibble_1.2 XML_3.98-1.5
[33] permute_0.9-4 viridisLite_0.1.3 GenomicAlignments_1.8.4 MASS_7.3-45
[37] bitops_1.0-6 nlme_3.1-128 gtable_0.2.0 DBI_0.5-1
[41] magrittr_1.5 KernSmooth_2.23-15 stringi_1.1.2 viridis_0.3.4
[45] lambda.r_1.1.9 RColorBrewer_1.1-2 tools_3.3.1 FField_0.1.0
[49] BSgenome_1.40.1 AnnotationDbi_1.34.4 colorspace_1.3-2 cluster_2.0.5
[53] caTools_1.17.1 memoise_1.0.0 knitr_1.15.1
Once the packages are installed, please download this github repository.
This repository contains three folders:
Each folder is sub-divided into the following folders:
PrimaryPairs: for analysis conducted on 10 primary-PDX pairs
MetastasisPairs: for analysis conducted on 6 metastasis-PDX pairs
Trios: for analysis conducted on 5 primary-PDX-PDO trios (unless otherwise indicated)
To run the analysis, in each subfolder, please first run the R Script that starts with 'STEP1'. Once you run this script, all other scripts in that folder can be run individually, in no particular order.
For subfolders that don't have a 'STEP1', this also means that any of the scripts can be run stand-alone in that folder.
Please consult the attached Supplementary data file in the manuscript, which contains whole-genome sequencing (WGS) data for the following data types:
The format of the Supplementary Data file follows the same format as the code repository. This Data files are split three folders. Below is description of these folders and the types of files contained within them.
Each folder is sub-divided into the following folders:
PrimaryPairs: for analysis conducted on 10 primary-PDX pairs
MetastasisPairs: for analysis conducted on 6 metastasis-PDX pairs
Trios: for analysis conducted on 5 primary-PDX-PDO trios (unless otherwise indicated)
Below is a listing of the files found in each folder and subfolder, for every data type:
PCSI_0169_Pa_P.snp.maf.txt.tsv PCSI_0590_Pa_X.snp.maf.txt.tsv PCSI_0624_Pa_P.snp.maf.txt.tsv
PCSI_0169_Pa_P_526.final.vcf PCSI_0590_Pa_X_526.final.vcf PCSI_0624_Pa_P_526.final.vcf
PCSI_0169_Pa_X.snp.maf.txt.tsv PCSI_0592_Pa_P.snp.maf.txt.tsv PCSI_0624_Pa_X.snp.maf.txt.tsv
PCSI_0169_Pa_X_526.final.vcf PCSI_0592_Pa_P_526.final.vcf PCSI_0624_Pa_X_526.final.vcf
PCSI_0355_Pa_P.snp.maf.txt.tsv PCSI_0592_Pa_X.snp.maf.txt.tsv PCSI_0633_Pa_P.snp.maf.txt.tsv
PCSI_0355_Pa_P_526.final.vcf PCSI_0592_Pa_X_526.final.vcf PCSI_0633_Pa_P_526.final.vcf
PCSI_0355_Pa_X.snp.maf.txt.tsv PCSI_0602_Pa_P.snp.maf.txt.tsv PCSI_0633_Pa_X.snp.maf.txt.tsv
PCSI_0355_Pa_X_526.final.vcf PCSI_0602_Pa_P_526.final.vcf PCSI_0633_Pa_X_526.final.vcf
PCSI_0589_Pa_P.snp.maf.txt.tsv PCSI_0602_Pa_X.snp.maf.txt.tsv PCSI_0642_Pa_P.snp.maf.txt.tsv
PCSI_0589_Pa_P_526.final.vcf PCSI_0602_Pa_X_526.final.vcf PCSI_0642_Pa_P_526.final.vcf
PCSI_0589_Pa_X.snp.maf.txt.tsv PCSI_0611_Pa_P.snp.maf.txt.tsv PCSI_0642_Pa_X.snp.maf.txt.tsv
PCSI_0589_Pa_X_526.final.vcf PCSI_0611_Pa_P_526.final.vcf PCSI_0642_Pa_X_526.final.vcf
PCSI_0590_Pa_P.snp.maf.txt.tsv PCSI_0611_Pa_X.snp.maf.txt.tsv
PCSI_0590_Pa_P_526.final.vcf PCSI_0611_Pa_X_526.final.vcf
PCSI_0489_Lv_M.snp.maf.txt.tsv PCSI_0585_Lv_M.snp.maf.txt.tsv PCSI_0605_Lv_M.snp.maf.txt.tsv
PCSI_0489_Lv_M_526.final.vcf PCSI_0585_Lv_M_526.final.vcf PCSI_0605_Lv_M_526.final.vcf
PCSI_0489_Lv_X.snp.maf.txt.tsv PCSI_0585_Lv_X.snp.maf.txt.tsv PCSI_0605_Lv_X.snp.maf.txt.tsv
PCSI_0489_Lv_X_526.final.vcf PCSI_0585_Lv_X_526.final.vcf PCSI_0605_Lv_X_526.final.vcf
PCSI_0491_Lv_M.snp.maf.txt.tsv PCSI_0604_Lv_M.snp.maf.txt.tsv PCSI_0606_Lv_M.snp.maf.txt.tsv
PCSI_0491_Lv_M_526.final.vcf PCSI_0604_Lv_M_526.final.vcf PCSI_0606_Lv_M_526.final.vcf
PCSI_0491_Lv_X.snp.maf.txt.tsv PCSI_0604_Lv_X.snp.maf.txt.tsv PCSI_0606_Lv_X.snp.maf.txt.tsv
PCSI_0491_Lv_X_526.final.vcf PCSI_0604_Lv_X_526.final.vcf PCSI_0606_Lv_X_526.final.vcf
PCSI_0590_Pa_O.snp.maf.txt.tsv PCSI_0592_Pa_X.snp.maf.txt.tsv PCSI_0624_Pa_P.snp.maf.txt.tsv
PCSI_0590_Pa_O_526.final.vcf PCSI_0592_Pa_X_526.final.vcf PCSI_0624_Pa_P_526.final.vcf
PCSI_0590_Pa_P.snp.maf.txt.tsv PCSI_0602_Pa_O.snp.maf.txt.tsv PCSI_0624_Pa_X.snp.maf.txt.tsv
PCSI_0590_Pa_P_526.final.vcf PCSI_0602_Pa_O_526.final.vcf PCSI_0624_Pa_X_526.final.vcf
PCSI_0590_Pa_X.snp.maf.txt.tsv PCSI_0602_Pa_P.snp.maf.txt.tsv PCSI_0642_Pa_O.snp.maf.txt.tsv
PCSI_0590_Pa_X_526.final.vcf PCSI_0602_Pa_P_526.final.vcf PCSI_0642_Pa_O_526.final.vcf
PCSI_0592_Pa_O.snp.maf.txt.tsv PCSI_0602_Pa_X.snp.maf.txt.tsv PCSI_0642_Pa_P.snp.maf.txt.tsv
PCSI_0592_Pa_O_526.final.vcf PCSI_0602_Pa_X_526.final.vcf PCSI_0642_Pa_P_526.final.vcf
PCSI_0592_Pa_P.snp.maf.txt.tsv PCSI_0624_Pa_O.snp.maf.txt.tsv PCSI_0642_Pa_X.snp.maf.txt.tsv
PCSI_0592_Pa_P_526.final.vcf PCSI_0624_Pa_O_526.final.vcf PCSI_0642_Pa_X_526.final.vcf
PCSI_0169_Pa_P_526.annotatedSV.tsv PCSI_0590_Pa_X_526.annotatedSV.tsv PCSI_0624_Pa_P_526.annotatedSV.tsv
PCSI_0169_Pa_X_526.annotatedSV.tsv PCSI_0592_Pa_P_526.annotatedSV.tsv PCSI_0624_Pa_X_526.annotatedSV.tsv
PCSI_0355_Pa_P_526.annotatedSV.tsv PCSI_0592_Pa_X_526.annotatedSV.tsv PCSI_0633_Pa_P_526.annotatedSV.tsv
PCSI_0355_Pa_X_526.annotatedSV.tsv PCSI_0602_Pa_P_526.annotatedSV.tsv PCSI_0633_Pa_X_526.annotatedSV.tsv
PCSI_0589_Pa_P_526.annotatedSV.tsv PCSI_0602_Pa_X_526.annotatedSV.tsv PCSI_0642_Pa_P_526.annotatedSV.tsv
PCSI_0589_Pa_X_526.annotatedSV.tsv PCSI_0611_Pa_P_526.annotatedSV.tsv PCSI_0642_Pa_X_526.annotatedSV.tsv
PCSI_0590_Pa_P_526.annotatedSV.tsv PCSI_0611_Pa_X_526.annotatedSV.tsv
PCSI_0489_Lv_M_526.annotatedSV.tsv PCSI_0585_Lv_M_526.annotatedSV.tsv PCSI_0605_Lv_M_526.annotatedSV.tsv
PCSI_0489_Lv_X_526.annotatedSV.tsv PCSI_0585_Lv_X_526.annotatedSV.tsv PCSI_0605_Lv_X_526.annotatedSV.tsv
PCSI_0491_Lv_M_526.annotatedSV.tsv PCSI_0604_Lv_M_526.annotatedSV.tsv PCSI_0606_Lv_M_526.annotatedSV.tsv
PCSI_0491_Lv_X_526.annotatedSV.tsv PCSI_0604_Lv_X_526.annotatedSV.tsv PCSI_0606_Lv_X_526.annotatedSV.tsv
PCSI_0590_Pa_O_526.annotatedSV.tsv PCSI_0592_Pa_X_526.annotatedSV.tsv PCSI_0624_Pa_P_526.annotatedSV.tsv
PCSI_0590_Pa_P_526.annotatedSV.tsv PCSI_0602_Pa_O_526.annotatedSV.tsv PCSI_0624_Pa_X_526.annotatedSV.tsv
PCSI_0590_Pa_X_526.annotatedSV.tsv PCSI_0602_Pa_P_526.annotatedSV.tsv PCSI_0642_Pa_O_526.annotatedSV.tsv
PCSI_0592_Pa_O_526.annotatedSV.tsv PCSI_0602_Pa_X_526.annotatedSV.tsv PCSI_0642_Pa_P_526.annotatedSV.tsv
PCSI_0592_Pa_P_526.annotatedSV.tsv PCSI_0624_Pa_O_526.annotatedSV.tsv PCSI_0642_Pa_X_526.annotatedSV.tsv
parameters_PCSI_0169_Pa_P_526.txt parameters_PCSI_0642_Pa_P_526.txt segments_PCSI_0611_CN_Final.bed
parameters_PCSI_0169_Pa_X_526.txt parameters_PCSI_0642_Pa_X_526.txt segments_PCSI_0611_Pa_P_526.txt
parameters_PCSI_0355_Pa_P_526.txt segments_PCSI_0169_CN_Final.bed segments_PCSI_0611_Pa_X_526.txt
parameters_PCSI_0355_Pa_X_526.txt segments_PCSI_0169_Pa_P_526.txt segments_PCSI_0624_CN_Final.bed
parameters_PCSI_0589_Pa_P_526.txt segments_PCSI_0169_Pa_X_526.txt segments_PCSI_0624_Pa_P_526.txt
parameters_PCSI_0589_Pa_X_526.txt segments_PCSI_0355_CN_Final.bed segments_PCSI_0624_Pa_X_526.txt
parameters_PCSI_0590_Pa_P_526.txt segments_PCSI_0355_Pa_P_526.txt segments_PCSI_0633_CN_Final.bed
parameters_PCSI_0590_Pa_X_526.txt segments_PCSI_0355_Pa_X_526.txt segments_PCSI_0633_Pa_P_526.txt
parameters_PCSI_0611_Pa_P_526.txt segments_PCSI_0589_CN_Final.bed segments_PCSI_0633_Pa_X_526.txt
parameters_PCSI_0611_Pa_X_526.txt segments_PCSI_0589_Pa_P_526.txt segments_PCSI_0642_CN_Final.bed
parameters_PCSI_0624_Pa_P_526.txt segments_PCSI_0589_Pa_X_526.txt segments_PCSI_0642_Pa_P_526.txt
parameters_PCSI_0624_Pa_X_526.txt segments_PCSI_0590_CN_Final.bed segments_PCSI_0642_Pa_X_526.txt
parameters_PCSI_0633_Pa_P_526.txt segments_PCSI_0590_Pa_P_526.txt
parameters_PCSI_0633_Pa_X_526.txt segments_PCSI_0590_Pa_X_526.txt
parameters_PCSI_0489_Lv_M_526.txt parameters_PCSI_0606_Lv_M_526.txt segments_PCSI_0585_Lv_X_526.txt
parameters_PCSI_0489_Lv_X_526.txt parameters_PCSI_0606_Lv_X_526.txt segments_PCSI_0604_CN_Final.bed
parameters_PCSI_0491_Lv_M_526.txt segments_PCSI_0489_CN_Final.bed segments_PCSI_0604_Lv_M_526.txt
parameters_PCSI_0491_Lv_X_526.txt segments_PCSI_0489_Lv_M_526.txt segments_PCSI_0604_Lv_X_526.txt
parameters_PCSI_0585_Lv_M_526.txt segments_PCSI_0489_Lv_X_526.txt segments_PCSI_0605_CN_Final.bed
parameters_PCSI_0585_Lv_X_526.txt segments_PCSI_0491_CN_Final.bed segments_PCSI_0605_Lv_M_526.txt
parameters_PCSI_0604_Lv_M_526.txt segments_PCSI_0491_Lv_M_526.txt segments_PCSI_0605_Lv_X_526.txt
parameters_PCSI_0604_Lv_X_526.txt segments_PCSI_0491_Lv_X_526.txt segments_PCSI_0606_CN_Final.bed
parameters_PCSI_0605_Lv_M_526.txt segments_PCSI_0585_CN_Final.bed segments_PCSI_0606_Lv_M_526.txt
parameters_PCSI_0605_Lv_X_526.txt segments_PCSI_0585_Lv_M_526.txt segments_PCSI_0606_Lv_X_526.txt
parameters_PCSI_0590_Pa_O_526.txt parameters_PCSI_0642_Pa_X_526.txt segments_PCSI_0624_Pa_X_526.txt
parameters_PCSI_0590_Pa_P_526.txt segments_PCSI_0590_Pa_O_526.txt segments_PCSI_0624_PvsO_CN_Final.bed
parameters_PCSI_0590_Pa_X_526.txt segments_PCSI_0590_Pa_P_526.txt segments_PCSI_0624_XvsO_CN_Final.bed
parameters_PCSI_0624_Pa_O_526.txt segments_PCSI_0590_Pa_X_526.txt segments_PCSI_0642_Pa_O_526.txt
parameters_PCSI_0624_Pa_P_526.txt segments_PCSI_0590_PvsO_CN_Final.bed segments_PCSI_0642_Pa_P_526.txt
parameters_PCSI_0624_Pa_X_526.txt segments_PCSI_0590_XvsO_CN_Final.bed segments_PCSI_0642_Pa_X_526.txt
parameters_PCSI_0642_Pa_O_526.txt segments_PCSI_0624_Pa_O_526.txt segments_PCSI_0642_PvsO_CN_Final.bed
parameters_PCSI_0642_Pa_P_526.txt segments_PCSI_0624_Pa_P_526.txt segments_PCSI_0642_XvsO_CN_Final.bed