@tingchiafelix Please install this specific version of ASCAT and re-run:
# This is a forked version of ASCAT
remotes::install_github("ShixiangWang/ascat@v3-for-gcap-v1", subdir = "ASCAT")
# An ASCAT version with a loose SAM flag, which is sometimes useful
# remotes::install_github("ShixiangWang/ascat@v3-f1", subdir = "ASCAT")
# See https://github.com/ShixiangWang/gcap/issues/27
Also, it is not recommended to have the symbol `~` in the sample name, e.g. `116655~072-R~AK7A15E12~WES.bwa.final.bam`.
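As a minimal sketch of the renaming advice above, the following base R helper flags file names containing `~` and proposes a cleaned name. The function names `has_tilde` and `sanitize_sample_name` are hypothetical, and replacing unsafe characters with `-` is an assumption (it happens to match the hyphenated file name used in the debug run later in this thread); gcap itself does not provide these helpers.

```r
# Hypothetical helpers, not part of gcap or ASCAT.

# TRUE for every name that contains a literal "~".
has_tilde <- function(x) {
  grepl("~", x, fixed = TRUE)
}

# Replace every character that is not a letter, digit, dot,
# underscore, or hyphen with "-" (assumed replacement character).
sanitize_sample_name <- function(x) {
  gsub("[^A-Za-z0-9._-]", "-", x)
}

bam <- "116655~072-R~AK7A15E12~WES.bwa.final.bam"
if (has_tilde(bam)) {
  clean <- sanitize_sample_name(bam)
  message("Suggested file name: ", clean)
  # To actually rename on disk (the .bai index would need the same treatment):
  # file.rename(bam, clean)
}
```

The rename call is left commented out so the sketch has no side effects; the BAM index would have to be renamed or regenerated alongside the BAM file.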
Hi Shixiang,
Thank you for providing more details. However, I still got the error. Would you please take a look and let me know your suggestion?
[1] Reading Tumor LogR data...
[1] Reading Tumor BAF data...
[1] Reading Germline LogR data...
[1] Reading Germline BAF data...
[1] Registering SNP locations...
[1] Splitting genome in distinct chunks...
Could you share your BAM data privately? This looks like an error in the ASCAT package. Also, could you provide the complete log, not just the last part?
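One way to collect a complete log for a report like this is to redirect both the output stream and the message stream of the R session to a file while the workflow runs. The sketch below uses base R `sink()`; the wrapper name `capture_full_log` and the log file path are hypothetical, and output written directly by external programs may not pass through R's sinks.

```r
# Hypothetical wrapper: evaluate `expr` while writing R's standard output
# and message stream to `log_file`.
capture_full_log <- function(expr, log_file) {
  con <- file(log_file, open = "wt")
  sink(con)                    # redirect standard output
  sink(con, type = "message")  # redirect the message stream
  on.exit({
    sink(type = "message")     # restore the message stream
    sink()                     # restore standard output
    close(con)
  }, add = TRUE)
  force(expr)                  # the expression is evaluated here, inside the sinks
  invisible(log_file)
}

# Usage with a stand-in for the real gcap.workflow(...) call:
log_file <- file.path(tempdir(), "gcap_run.log")
capture_full_log({
  print("output line")
  message("message line")
}, log_file)
```

Running the script non-interactively and redirecting both streams at the shell level is an alternative that also captures output from child processes.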
Hi,
Please see the log messages and bam/bai files I used (these files were downloaded/processed from public resources).
Loading required package: ASCAT
Loading required package: RColorBrewer
Loading required package: splines
Loading required package: readr
Loading required package: GenomicRanges
Loading required package: stats4
Loading required package: BiocGenerics
Attaching package: ‘BiocGenerics’
The following objects are masked from ‘package:stats’:
IQR, mad, sd, var, xtabs
The following objects are masked from ‘package:base’:
anyDuplicated, aperm, append, as.data.frame, basename, cbind,
colnames, dirname, do.call, duplicated, eval, evalq, Filter, Find,
get, grep, grepl, intersect, is.unsorted, lapply, Map, mapply,
match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
Position, rank, rbind, Reduce, rownames, sapply, setdiff, sort,
table, tapply, union, unique, unsplit, which.max, which.min
Loading required package: S4Vectors
Attaching package: ‘S4Vectors’
The following object is masked from ‘package:utils’:
findMatches
The following objects are masked from ‘package:base’:
expand.grid, I, unname
Loading required package: IRanges
Loading required package: GenomeInfoDb
Loading required package: parallel
Loading required package: doParallel
Loading required package: foreach
Loading required package: iterators
Loading required package: sigminer
sigminer version 2.3.0
Citation: Wang, S., Wu, CY., He, MM. et al. Machine learning-based extrachromosomal DNA identification in large-scale cohorts reveals its clinical implications in cancer. Nat Commun 15, 1515 (2024). https://doi.org/10.1038/s41467-024-45479-6
Thanks for sharing (https://www.dropbox.com/scl/fo/9pf9mkereww28q2yda0md/AAK61fn5wJmihoPxa8AW3s4?rlkey=ix2pyr6nyijr9suk04wxpv3ne&e=1&st=nyvgvewx&dl=0). I will download the files and check tomorrow.
Thank you Shixiang for working on this. Please let me know if you need further information from me.
Best, TC
@tingchiafelix Hi, I just got the result, and I cannot reproduce the error. I used the same test environment as for debugging the issue https://github.com/ShixiangWang/gcap/issues/41 (for the workflow, see https://github.com/ShixiangWang/gcap/tree/master/test-workflow/debug ).
Please make sure that your R version is > 4.1, that your ASCAT version is `ShixiangWang/ascat@51fd695` (check with `devtools::session_info()`), and that your annotation data is complete (https://github.com/ShixiangWang/gcap/blob/master/test-workflow/debug/2-prepare.sh).
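The version checks above can also be scripted. The following is a rough sketch, assuming ASCAT was installed from GitHub with `remotes`/`devtools`, which records the commit in the `RemoteSha` field of the installed package's DESCRIPTION. The function name `check_gcap_env` and its arguments are hypothetical; it does not check the annotation data.

```r
# Hypothetical helper: report whether the R version and the installed
# package commit match what is recommended in this thread.
check_gcap_env <- function(pkg = "ASCAT", sha_prefix = "51fd695", min_r = "4.1") {
  # packageDescription() returns NA (with a warning) if pkg is not installed.
  desc <- suppressWarnings(utils::packageDescription(pkg))
  installed <- is.list(desc)
  # RemoteSha is only present for packages installed from a remote such as GitHub.
  sha <- if (installed && !is.null(desc$RemoteSha)) desc$RemoteSha else NA_character_
  list(
    r_version  = as.character(getRversion()),
    r_ok       = getRversion() > min_r,
    installed  = installed,
    remote_sha = sha,
    sha_ok     = !is.na(sha) && startsWith(sha, sha_prefix)
  )
}

str(check_gcap_env())
```

If `sha_ok` is `FALSE`, reinstalling ASCAT with the `remotes::install_github()` call given earlier in the thread is the suggested remedy; `devtools::session_info()` remains the more complete report to paste into an issue.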
─ Session info ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
setting value
version R version 4.2.2 (2022-10-31)
os CentOS Linux 7 (Core)
system x86_64, linux-gnu
ui RStudio
language (EN)
collate en_US.UTF-8
ctype en_US.UTF-8
tz Asia/Shanghai
date 2024-05-14
rstudio 2022.12.0+353 Elsbeth Geranium (server)
pandoc NA
─ Packages ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
package * version date (UTC) lib source
ASCAT * 3.0.0 2023-03-31 [1] Github (ShixiangWang/ascat@51fd695)
Biobase * 2.58.0 2022-11-01 [2] Bioconductor
BiocGenerics * 0.44.0 2022-11-01 [2] Bioconductor
BiocManager 1.30.19 2022-10-25 [2] CRAN (R 4.2.2)
bit 4.0.5 2022-11-15 [2] CRAN (R 4.2.2)
bit64 4.0.5 2020-08-30 [2] CRAN (R 4.2.2)
bitops 1.0-7 2021-04-24 [2] CRAN (R 4.2.2)
cachem 1.0.6 2021-08-19 [2] CRAN (R 4.2.2)
Cairo 1.6-0 2022-07-05 [1] CRAN (R 4.2.2)
callr 3.7.3 2022-11-02 [2] CRAN (R 4.2.2)
cli 3.6.0 2023-01-09 [2] CRAN (R 4.2.2)
cluster 2.1.4 2022-08-22 [2] CRAN (R 4.2.2)
codetools 0.2-19 2023-02-01 [2] CRAN (R 4.2.2)
colorspace 2.1-0 2023-01-23 [2] CRAN (R 4.2.2)
crayon 1.5.2 2022-09-29 [2] CRAN (R 4.2.2)
data.table 1.14.8 2023-02-17 [2] CRAN (R 4.2.2)
devtools 2.4.5 2022-10-11 [2] CRAN (R 4.2.2)
digest 0.6.31 2022-12-11 [2] CRAN (R 4.2.2)
doParallel * 1.0.17 2022-02-07 [1] CRAN (R 4.2.2)
dplyr 1.1.0 2023-01-29 [2] CRAN (R 4.2.2)
ellipsis 0.3.2 2021-04-29 [2] CRAN (R 4.2.2)
fansi 1.0.4 2023-01-22 [2] CRAN (R 4.2.2)
fastmap 1.1.0 2021-01-25 [2] CRAN (R 4.2.2)
foreach * 1.5.2 2022-02-02 [2] CRAN (R 4.2.2)
fs 1.6.1 2023-02-06 [2] CRAN (R 4.2.2)
furrr 0.3.1 2022-08-15 [2] CRAN (R 4.2.2)
future 1.31.0 2023-02-01 [2] CRAN (R 4.2.2)
gcap * 1.1.4 2024-04-22 [1] Github (ShixiangWang/gcap@1f1dbd2)
generics 0.1.3 2022-07-05 [2] CRAN (R 4.2.2)
GenomeInfoDb * 1.34.9 2023-02-02 [2] Bioconductor
GenomeInfoDbData 1.2.9 2023-02-24 [2] Bioconductor
GenomicRanges * 1.50.2 2022-12-16 [2] Bioconductor
GetoptLong 1.0.5 2020-12-15 [1] CRAN (R 4.2.2)
ggplot2 3.4.1 2023-02-10 [2] CRAN (R 4.2.2)
GlobalOptions 0.1.2 2020-06-10 [1] CRAN (R 4.2.2)
globals 0.16.2 2022-11-21 [2] CRAN (R 4.2.2)
glue 1.6.2 2022-02-24 [2] CRAN (R 4.2.2)
gridBase 0.4-7 2014-02-24 [1] CRAN (R 4.2.2)
gtable 0.3.1 2022-09-01 [2] CRAN (R 4.2.2)
hms 1.1.2 2022-08-19 [2] CRAN (R 4.2.2)
htmltools 0.5.4 2022-12-07 [2] CRAN (R 4.2.2)
htmlwidgets 1.6.1 2023-01-07 [2] CRAN (R 4.2.2)
httpuv 1.6.9 2023-02-14 [2] CRAN (R 4.2.2)
IRanges * 2.32.0 2022-11-01 [2] Bioconductor
iterators * 1.0.14 2022-02-05 [2] CRAN (R 4.2.2)
jsonlite 1.8.4 2022-12-06 [2] CRAN (R 4.2.2)
later 1.3.0 2021-08-18 [2] CRAN (R 4.2.2)
lattice 0.20-45 2021-09-22 [2] CRAN (R 4.2.2)
lgr 0.4.4 2022-09-05 [1] CRAN (R 4.2.2)
lifecycle 1.0.3 2022-10-07 [2] CRAN (R 4.2.2)
listenv 0.9.0 2022-12-16 [2] CRAN (R 4.2.2)
magrittr 2.0.3 2022-03-30 [2] CRAN (R 4.2.2)
Matrix 1.6-5 2024-01-11 [1] CRAN (R 4.2.2)
memoise 2.0.1 2021-11-26 [2] CRAN (R 4.2.2)
mime 0.12 2021-09-28 [2] CRAN (R 4.2.2)
miniUI 0.1.1.1 2018-05-18 [2] CRAN (R 4.2.2)
munsell 0.5.0 2018-06-12 [2] CRAN (R 4.2.2)
NMF 0.26 2023-03-20 [1] CRAN (R 4.2.2)
parallelly 1.34.0 2023-01-13 [2] CRAN (R 4.2.2)
pillar 1.8.1 2022-08-19 [2] CRAN (R 4.2.2)
pkgbuild 1.4.0 2022-11-27 [2] CRAN (R 4.2.2)
pkgconfig 2.0.3 2019-09-22 [2] CRAN (R 4.2.2)
pkgload 1.3.2 2022-11-16 [2] CRAN (R 4.2.2)
plyr 1.8.8 2022-11-11 [2] CRAN (R 4.2.2)
prettyunits 1.1.1 2020-01-24 [2] CRAN (R 4.2.2)
processx 3.8.0 2022-10-26 [2] CRAN (R 4.2.2)
profvis 0.3.7 2020-11-02 [2] CRAN (R 4.2.2)
promises 1.2.0.1 2021-02-11 [2] CRAN (R 4.2.2)
ps 1.7.2 2022-10-26 [2] CRAN (R 4.2.2)
purrr 1.0.1 2023-01-10 [2] CRAN (R 4.2.2)
quadprog 1.5-8 2019-11-20 [1] CRAN (R 4.2.2)
R6 2.5.1 2021-08-19 [2] CRAN (R 4.2.2)
rappdirs 0.3.3 2021-01-31 [2] CRAN (R 4.2.2)
RColorBrewer * 1.1-3 2022-04-03 [2] CRAN (R 4.2.2)
Rcpp 1.0.10 2023-01-22 [2] CRAN (R 4.2.2)
RCurl 1.98-1.10 2023-01-27 [2] CRAN (R 4.2.2)
readr * 2.1.4 2023-02-10 [2] CRAN (R 4.2.2)
registry 0.5-1 2019-03-05 [1] CRAN (R 4.2.2)
remotes 2.4.2 2021-11-30 [2] CRAN (R 4.2.2)
reshape2 1.4.4 2020-04-09 [2] CRAN (R 4.2.2)
rjson 0.2.21 2022-01-09 [1] CRAN (R 4.2.2)
rlang 1.0.6 2022-09-24 [2] CRAN (R 4.2.2)
rngtools 1.5.2 2021-09-20 [1] CRAN (R 4.2.2)
rstudioapi 0.14 2022-08-22 [2] CRAN (R 4.2.2)
S4Vectors * 0.36.1 2022-12-05 [2] Bioconductor
scales 1.2.1 2022-08-20 [2] CRAN (R 4.2.2)
sessioninfo 1.2.2 2021-12-06 [2] CRAN (R 4.2.2)
shiny 1.7.4 2022-12-15 [2] CRAN (R 4.2.2)
sigminer * 2.1.9 2022-11-09 [1] CRAN (R 4.2.2)
stringi 1.7.12 2023-01-11 [2] CRAN (R 4.2.2)
stringr 1.5.0 2022-12-02 [2] CRAN (R 4.2.2)
tibble 3.1.8 2022-07-22 [2] CRAN (R 4.2.2)
tidyr 1.3.0 2023-01-24 [2] CRAN (R 4.2.2)
tidyselect 1.2.0 2022-10-10 [2] CRAN (R 4.2.2)
tzdb 0.3.0 2022-03-28 [2] CRAN (R 4.2.2)
urlchecker 1.0.1 2021-11-30 [2] CRAN (R 4.2.2)
usethis 2.1.6 2022-05-25 [2] CRAN (R 4.2.2)
utf8 1.2.3 2023-01-31 [2] CRAN (R 4.2.2)
uuid 1.1-0 2022-04-19 [2] CRAN (R 4.2.2)
vctrs 0.5.2 2023-01-23 [2] CRAN (R 4.2.2)
vroom 1.6.1 2023-01-22 [2] CRAN (R 4.2.2)
withr 2.5.0 2022-03-03 [2] CRAN (R 4.2.2)
xgboost 1.5.2.1 2022-02-21 [1] CRAN (R 4.2.2)
xtable 1.8-4 2019-04-21 [2] CRAN (R 4.2.2)
XVector 0.38.0 2022-11-01 [2] Bioconductor
zlibbioc 1.44.0 2022-11-01 [2] Bioconductor
[1] /data3/wsx/R/x86_64-pc-linux-gnu-library/4.2
[2] /opt/R/4.2.2/lib/R/library
> library(gcap)
Loading required package: ASCAT
Loading required package: RColorBrewer
Loading required package: splines
Loading required package: readr
Loading required package: GenomicRanges
Loading required package: stats4
Loading required package: BiocGenerics
Attaching package: ‘BiocGenerics’
The following objects are masked from ‘package:stats’:
IQR, mad, sd, var, xtabs
The following objects are masked from ‘package:base’:
anyDuplicated, aperm, append, as.data.frame, basename, cbind, colnames, dirname, do.call, duplicated, eval, evalq,
Filter, Find, get, grep, grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget, order, paste, pmax, pmax.int,
pmin, pmin.int, Position, rank, rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply, union, unique, unsplit,
which.max, which.min
Loading required package: S4Vectors
Attaching package: ‘S4Vectors’
The following objects are masked from ‘package:base’:
expand.grid, I, unname
Loading required package: IRanges
Loading required package: GenomeInfoDb
Loading required package: parallel
Loading required package: doParallel
Loading required package: foreach
Loading required package: iterators
Loading required package: sigminer
sigminer version 2.1.9
- Star me at https://github.com/ShixiangWang/sigminer
- Run hello() to see usage and citation.
gcap version 1.1.4
- Project URL at https://github.com/ShixiangWang/gcap
Citation:
Wang, S., Wu, CY., He, MM. et al. Machine learning-based extrachromosomal DNA identification in
large-scale cohorts reveals its clinical implications in cancer. Nat Commun 15, 1515 (2024). https://doi.org/10.1038/s41467-024-45479-6
> # hg38 ----------------
> gcap.workflow(
+ tumourseqfile = "~/gcap_debug/116655-072-R-AK7A15E12-WES.bwa.final.bam",
+ normalseqfile = "~/gcap_debug/116655_germline-WES.bwa.final.bam",
+ tumourname = "Test_T",
+ normalname = "Test_N",
+ jobname = "S116655",
+ outdir = "~/gcap_debug/gcap_result",
+ allelecounter_exe = "~/miniconda3/envs/cancerit/bin/alleleCounter",
+ g1000allelesprefix = file.path(
+ "~/share/gcap_reference/1000G_loci_hg38/",
+ "1kg.phase3.v5a_GRCh38nounref_allele_index_chr"
+ ),
+ g1000lociprefix = file.path("~/share/gcap_reference/1000G_loci_hg38/",
+ "1kg.phase3.v5a_GRCh38nounref_loci_chrstring_chr"
+ ),
+ GCcontentfile = "~/share/gcap_reference/GC_correction_hg38.txt",
+ replictimingfile = "~/share/gcap_reference/RT_correction_hg38.txt",
+ skip_finished_ASCAT = TRUE,
+ skip_ascat_call = FALSE,
+ result_file_prefix = "S116655",
+ genome_build = "hg38",
+ model = "XGB11"
+ )
<gcap> 2024-05-14 09:47:17 info [gcap.workflow]: =====================
<gcap> 2024-05-14 09:47:17 info [gcap.workflow]: GCAP WORKFLOW
<gcap> 2024-05-14 09:47:17 info [gcap.workflow]: =====================
<gcap> 2024-05-14 09:47:17 info [gcap.workflow]:
<gcap> 2024-05-14 09:47:17 info [gcap.workflow]: =====================
<gcap> 2024-05-14 09:47:17 info [gcap.workflow]: Step 1: Run ASCAT 3.0
<gcap> 2024-05-14 09:47:17 info [gcap.workflow]: =====================
<gcap> 2024-05-14 09:47:17 info [gcap.runASCAT]: > Run ASCAT on WES data <
<gcap> 2024-05-14 09:47:17 info [gcap.runASCAT]:
<gcap> 2024-05-14 09:47:17 info [gcap.runASCAT]: Configs:
<gcap> 2024-05-14 09:47:17 info [gcap.runASCAT]: result path set to /data3/wsx/gcap_debug/gcap_result
<gcap> 2024-05-14 09:47:17 info [gcap.runASCAT]: allelecounter_exe set to ~/miniconda3/envs/cancerit/bin/alleleCounter
<gcap> 2024-05-14 09:47:17 info [gcap.runASCAT]: g1000allelesprefix set to ~/share/gcap_reference/1000G_loci_hg38//1kg.phase3.v5a_GRCh38nounref_allele_index_chr
<gcap> 2024-05-14 09:47:17 info [gcap.runASCAT]: g1000lociprefix set to ~/share/gcap_reference/1000G_loci_hg38//1kg.phase3.v5a_GRCh38nounref_loci_chrstring_chr
<gcap> 2024-05-14 09:47:17 info [gcap.runASCAT]: GCcontentfile set to ~/share/gcap_reference/GC_correction_hg38.txt
<gcap> 2024-05-14 09:47:17 info [gcap.runASCAT]: replictimingfile set to ~/share/gcap_reference/RT_correction_hg38.txt
<gcap> 2024-05-14 09:47:17 info [gcap.runASCAT]: nthreads set to 22
<gcap> 2024-05-14 09:47:17 info [gcap.runASCAT]: minCounts set to 10
<gcap> 2024-05-14 09:47:17 info [gcap.runASCAT]: BED_file set to NA
<gcap> 2024-05-14 09:47:17 info [gcap.runASCAT]: probloci_file set to NA
<gcap> 2024-05-14 09:47:17 info [gcap.runASCAT]: chrom_names set to <1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22>
<gcap> 2024-05-14 09:47:17 info [gcap.runASCAT]: gender set to <XX>
<gcap> 2024-05-14 09:47:17 info [gcap.runASCAT]: min_base_qual set to 20
<gcap> 2024-05-14 09:47:17 info [gcap.runASCAT]: min_map_qual set to 35
<gcap> 2024-05-14 09:47:17 info [gcap.runASCAT]: penalty set to 70
<gcap> 2024-05-14 09:47:17 info [gcap.runASCAT]: skip_finished_ASCAT set to TRUE
<gcap> 2024-05-14 09:47:17 info [gcap.runASCAT]: 1 jobs detected
<gcap> 2024-05-14 09:47:17 info [gcap.runASCAT]: No ASCAT job to skip.
<gcap> 2024-05-14 09:47:17 info [FUN]: start submitting job S116655
<gcap> 2024-05-14 09:47:17 info [FUN]: tumor data file: /data3/wsx/gcap_debug/116655-072-R-AK7A15E12-WES.bwa.final.bam
<gcap> 2024-05-14 09:47:17 info [FUN]: normal data file: /data3/wsx/gcap_debug/116655_germline-WES.bwa.final.bam
<gcap> 2024-05-14 09:47:17 info [FUN]: tumor sample name: Test_T
<gcap> 2024-05-14 09:47:17 info [FUN]: normal sample name: Test_N
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655-072-R-AK7A15E12-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655-072-R-AK7A15E12-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655-072-R-AK7A15E12-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655-072-R-AK7A15E12-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655-072-R-AK7A15E12-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655-072-R-AK7A15E12-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655-072-R-AK7A15E12-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655-072-R-AK7A15E12-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655-072-R-AK7A15E12-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655-072-R-AK7A15E12-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655-072-R-AK7A15E12-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655-072-R-AK7A15E12-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655-072-R-AK7A15E12-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655-072-R-AK7A15E12-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655-072-R-AK7A15E12-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655-072-R-AK7A15E12-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655-072-R-AK7A15E12-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655-072-R-AK7A15E12-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655-072-R-AK7A15E12-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655-072-R-AK7A15E12-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655-072-R-AK7A15E12-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655-072-R-AK7A15E12-WES.bwa.final.bam.bai
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655_germline-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655_germline-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655_germline-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655_germline-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655_germline-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655_germline-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655_germline-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655_germline-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655_germline-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655_germline-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655_germline-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655_germline-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655_germline-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655_germline-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655_germline-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655_germline-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655_germline-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655_germline-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655_germline-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655_germline-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655_germline-WES.bwa.final.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /data3/wsx/gcap_debug/116655_germline-WES.bwa.final.bam.bai
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
[1] Reading Tumor LogR data...
[1] Reading Tumor BAF data...
[1] Reading Germline LogR data...
[1] Reading Germline BAF data...
[1] Registering SNP locations...
[1] Splitting genome in distinct chunks...
[1] Sample Test_T (1/1)
GC correlation: 25bp 0.049 ; 50bp 0.057 ; 100bp 0.064 ; 200bp 0.073 ; 500bp 0.087 ; 1kb 0.099 ; 2kb 0.109 ; 5kb 0.119 ; 10kb 0.125 ; 20kb 0.129 ; 50kb 0.135 ; 100kb 0.140 ; 200kb 0.144 ; 500kb 0.148 ; 1Mb 0.148 ; 2Mb 0.139 ; 5Mb 0.102 ; 10Mb 0.056 ;
Short window size: 1kb
Long window size: 100kb
Replication timing correlation: Bg02es 0.11 ; Bj 0.12 ; Gm06990 0.13 ; Gm12801 0.14 ; Gm12812 0.13 ; Gm12813 0.13 ; Gm12878 0.13 ; Helas3 0.12 ; Hepg2 0.14 ; Huvec 0.12 ; Imr90 0.12 ; K562 0.14 ; Mcf7 0.13 ; Nhek 0.13 ; Sknsh 0.14 ;
Replication dataset: Hepg2
[1] Plotting tumor data
[1] Plotting germline data
[1] Sample Test_T (1/1)
[1] Sample Test_T (1/1)
<gcap> 2024-05-14 10:22:57 info [doTryCatch]: job S116655 done
<gcap> 2024-05-14 10:22:57 info [gcap.runASCAT]: ASCAT analysis done, check /data3/wsx/gcap_debug/gcap_result for results
<gcap> 2024-05-14 10:22:57 info [gcap.workflow]: checking ASCAT result files
<gcap> 2024-05-14 10:22:57 info [gcap.workflow]: ============================================================
<gcap> 2024-05-14 10:22:57 info [gcap.workflow]: Step 2: Extract features and collapse features to gene level
<gcap> 2024-05-14 10:22:57 info [gcap.workflow]: ============================================================
<gcap> 2024-05-14 10:22:57 info [gcap.runBuildflow]: extracting sample-level and region-level features
<gcap> 2024-05-14 10:22:57 info [gcap.extractFeatures]: > Extract features from ASCAT results <
<gcap> 2024-05-14 10:22:57 info [gcap.extractFeatures]:
<gcap> 2024-05-14 10:22:57 info [gcap.extractFeatures]: reading ASCAT file list
reading ~/gcap_debug/gcap_result/S116655.ASCAT.rds
<gcap> 2024-05-14 10:22:59 info [gcap.extractFeatures]: using unique IDs from file names for avoid the sample name repetition
<gcap> 2024-05-14 10:22:59 info [gcap.extractFeatures]: back up default sample column to old_sample
<gcap> 2024-05-14 10:22:59 info [gcap.extractFeatures]: combining purity and ploidy info as data.frame
<gcap> 2024-05-14 10:22:59 info [gcap.extractFeatures]: generating CopyNumber object in sigminer package
ℹ [2024-05-14 10:22:59]: Started.
ℹ [2024-05-14 10:22:59]: Genome build : hg38.
ℹ [2024-05-14 10:22:59]: Genome measure: called.
ℹ [2024-05-14 10:22:59]: When add_loh is TRUE, use_all is forced to TRUE.
Please drop columns you don't want to keep before reading.
✔ [2024-05-14 10:22:59]: Chromosome size database for build obtained.
ℹ [2024-05-14 10:23:00]: Reading input.
✔ [2024-05-14 10:23:00]: A data frame as input detected.
✔ [2024-05-14 10:23:00]: Column names checked.
✔ [2024-05-14 10:23:00]: Column order set.
✔ [2024-05-14 10:23:00]: Chromosomes unified.
✔ [2024-05-14 10:23:00]: Data imported.
ℹ [2024-05-14 10:23:00]: Segments info:
ℹ [2024-05-14 10:23:00]: Keep - 569
ℹ [2024-05-14 10:23:00]: Drop - 0
✔ [2024-05-14 10:23:00]: Segments sorted.
ℹ [2024-05-14 10:23:00]: Adding LOH labels...
ℹ [2024-05-14 10:23:00]: Skipped joining adjacent segments with same copy number value.
✔ [2024-05-14 10:23:00]: Segmental table cleaned.
ℹ [2024-05-14 10:23:00]: Annotating.
✔ [2024-05-14 10:23:00]: Annotation done.
ℹ [2024-05-14 10:23:00]: Summarizing per sample.
✔ [2024-05-14 10:23:00]: Summarized.
ℹ [2024-05-14 10:23:00]: Generating CopyNumber object.
✔ [2024-05-14 10:23:00]: Generated.
ℹ [2024-05-14 10:23:00]: Validating object.
✔ [2024-05-14 10:23:00]: Done.
ℹ [2024-05-14 10:23:00]: 0.603 secs elapsed.
<gcap> 2024-05-14 10:23:00 info [gcap.extractFeatures]: estimating ploidy from copy number data
<gcap> 2024-05-14 10:23:00 info [gcap.extractFeatures]: checking if input data contains ploidy and if there are NAs should be overwritten
<gcap> 2024-05-14 10:23:00 info [gcap.extractFeatures]: getting Aneuploidy score
<gcap> 2024-05-14 10:23:00 info [gcap.extractFeatures]: getting pLOH score
<gcap> 2024-05-14 10:23:00 info [gcap.extractFeatures]: getting CNA burden
<gcap> 2024-05-14 10:23:00 info [gcap.extractFeatures]: generating copy number catalog matrix for fitting signature activity
<gcap> 2024-05-14 10:23:00 info [gcap.extractFeatures]: fitting copy number signature activity
<gcap> 2024-05-14 10:23:00 info [gcap.extractFeatures]: merging data
<gcap> 2024-05-14 10:23:00 info [gcap.extractFeatures]: feature extraction done
<gcap> 2024-05-14 10:23:00 info [gcap.extractFeatures]: now you can modify the result and append 'age' and 'gender' columns to the 'fts_sample' element of result list
<gcap> 2024-05-14 10:23:00 info [gcap.runBuildflow]: collapsing all data into gene-level prediction input
<gcap> 2024-05-14 10:23:00 info [gcap.collapse2Genes]: please make sure the first 3 columns of `fts$fts_region` are for chr, start, end.
<gcap> 2024-05-14 10:23:00 info [gcap.collapse2Genes]: collapsing region-level features to gene-level
<gcap> 2024-05-14 10:23:00 info [collapse_to_genes]: checking input chromosome names
<gcap> 2024-05-14 10:23:00 info [collapse_to_genes]: reading reference file /data3/wsx/R/x86_64-pc-linux-gnu-library/4.2/gcap/extdata/hg38_target_genes.rds
<gcap> 2024-05-14 10:23:01 info [collapse_to_genes]: finding overlaps
<gcap> 2024-05-14 10:23:01 info [collapse_to_genes]: calculating intersect size
<gcap> 2024-05-14 10:23:01 info [collapse_to_genes]: keeping records with >= 100% overlap ratio with a gene
<gcap> 2024-05-14 10:23:01 info [gcap.collapse2Genes]: merging gene-level and sample-level data
<gcap> 2024-05-14 10:23:01 info [gcap.collapse2Genes]: merging data and prior amplicon frequency data
<gcap> 2024-05-14 10:23:01 info [gcap.collapse2Genes]: done
<gcap> 2024-05-14 10:23:01 info [gcap.workflow]: =======================
<gcap> 2024-05-14 10:23:01 info [gcap.workflow]: Step 3: Run prediction
<gcap> 2024-05-14 10:23:01 info [gcap.workflow]: =======================
<gcap> 2024-05-14 10:23:01 info [gcap.runPrediction]: using model file XGB_NF11.rds
<gcap> 2024-05-14 10:23:01 info [gcap.runPrediction]: selecting necessary features from input data
<gcap> 2024-05-14 10:23:01 info [gcap.runPrediction]: running prediction
[10:23:01] WARNING: amalgamation/../src/c_api/c_api.cc:718: `ntree_limit` is deprecated, use `iteration_range` instead.
<gcap> 2024-05-14 10:23:01 info [gcap.workflow]: ====================================
<gcap> 2024-05-14 10:23:01 info [gcap.workflow]: Step 4: Run scoring and summarizing
<gcap> 2024-05-14 10:23:01 info [gcap.workflow]: ====================================
<gcap> 2024-05-14 10:23:01 info [gcap.runScoring]: checking input data type
<gcap> 2024-05-14 10:23:01 info [gcap.runScoring]: checking columns
<gcap> 2024-05-14 10:23:01 info [gcap.runScoring]: filtering out records without prob result
<gcap> 2024-05-14 10:23:01 info [gcap.runScoring]: joining extra annotation data
<gcap> 2024-05-14 10:23:04 info [gcap.runScoring]: only keep genes labeled as amplicons in result fCNA object
<gcap> 2024-05-14 10:23:04 info [gcap.runScoring]: No fCNA records detected
summarizing sample...
classifying samples with min_prob=0.6
done
======================
A <fCNA> object
record: 0
case: 1
|__ (0) 0 noncircular
|__ (0) 0 circular
======================
<gcap> 2024-05-14 10:23:04 info [gcap.runScoring]: done
<gcap> 2024-05-14 10:23:07 info [gcap.workflow]: Saving raw prediction result to ~/gcap_debug/gcap_result/S116655_prediction_result.rds
<gcap> 2024-05-14 10:23:07 info [gcap.workflow]: Saving fCNA records and sample info to ~/gcap_debug/gcap_result/S116655_fCNA_records.csv, ~/gcap_debug/gcap_result/S116655_sample_info.csv
<gcap> 2024-05-14 10:23:07 info [gcap.workflow]: =======================================
<gcap> 2024-05-14 10:23:07 info [gcap.workflow]: Done! Thanks for using GCAP workflow
<gcap> 2024-05-14 10:23:07 info [gcap.workflow]: =======================================
There were 50 or more warnings (use warnings() to see the first 50)
> warnings()
Warning messages:
1: In !is.null(homsegs) && !is.na(homsegs) :
'length(x) = 30 > 1' in coercion to 'logical(1)'
2: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 3974 > 1' in coercion to 'logical(1)'
3: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 8925 > 1' in coercion to 'logical(1)'
4: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 8005 > 1' in coercion to 'logical(1)'
5: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 7126 > 1' in coercion to 'logical(1)'
6: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 6874 > 1' in coercion to 'logical(1)'
7: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 5988 > 1' in coercion to 'logical(1)'
8: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 6012 > 1' in coercion to 'logical(1)'
9: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 6662 > 1' in coercion to 'logical(1)'
10: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 6022 > 1' in coercion to 'logical(1)'
11: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 6309 > 1' in coercion to 'logical(1)'
12: In !is.null(homsegs) && !is.na(homsegs) :
'length(x) = 21 > 1' in coercion to 'logical(1)'
13: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 4753 > 1' in coercion to 'logical(1)'
14: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 6714 > 1' in coercion to 'logical(1)'
15: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 5472 > 1' in coercion to 'logical(1)'
16: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 5830 > 1' in coercion to 'logical(1)'
17: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 9601 > 1' in coercion to 'logical(1)'
18: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 4096 > 1' in coercion to 'logical(1)'
19: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 5412 > 1' in coercion to 'logical(1)'
20: In !is.null(homsegs) && !is.na(homsegs) :
'length(x) = 27 > 1' in coercion to 'logical(1)'
21: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 7386 > 1' in coercion to 'logical(1)'
22: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 9412 > 1' in coercion to 'logical(1)'
23: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 4836 > 1' in coercion to 'logical(1)'
24: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 3798 > 1' in coercion to 'logical(1)'
25: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 7405 > 1' in coercion to 'logical(1)'
26: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 8696 > 1' in coercion to 'logical(1)'
27: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 5948 > 1' in coercion to 'logical(1)'
28: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 4103 > 1' in coercion to 'logical(1)'
29: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 4321 > 1' in coercion to 'logical(1)'
30: In !is.null(homsegs) && !is.na(homsegs) :
'length(x) = 36 > 1' in coercion to 'logical(1)'
31: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 10087 > 1' in coercion to 'logical(1)'
32: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 4624 > 1' in coercion to 'logical(1)'
33: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 3991 > 1' in coercion to 'logical(1)'
34: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 4482 > 1' in coercion to 'logical(1)'
35: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 8284 > 1' in coercion to 'logical(1)'
36: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 6745 > 1' in coercion to 'logical(1)'
37: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 11550 > 1' in coercion to 'logical(1)'
38: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 3906 > 1' in coercion to 'logical(1)'
39: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 4429 > 1' in coercion to 'logical(1)'
40: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 5561 > 1' in coercion to 'logical(1)'
41: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 5334 > 1' in coercion to 'logical(1)'
42: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 4022 > 1' in coercion to 'logical(1)'
43: In !is.null(homsegs) && !is.na(homsegs) :
'length(x) = 15 > 1' in coercion to 'logical(1)'
44: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 7122 > 1' in coercion to 'logical(1)'
45: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 4058 > 1' in coercion to 'logical(1)'
46: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 5002 > 1' in coercion to 'logical(1)'
47: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 4824 > 1' in coercion to 'logical(1)'
48: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 9124 > 1' in coercion to 'logical(1)'
49: In !is.null(homsegs) && !is.na(homsegs) :
'length(x) = 21 > 1' in coercion to 'logical(1)'
50: In !is.na(dif) && sum(dif > 0.3) > 5 :
'length(x) = 6755 > 1' in coercion to 'logical(1)'
@ShixiangWang thanks for running my files. I followed your instructions and ran the workflow with the hg38 genome. However, I got an error again (see the log below).
One thing I noticed is the "gcap" package version.
Yours is ShixiangWang/gcap@1f1dbd2; mine is ShixiangWang/gcap@cdfb1c7.
Would this cause the error? I'm also including my session info and log messages below. Any suggestions would be much appreciated.
> devtools::session_info()
─ Session info ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
setting value
version R version 4.3.3 (2024-02-29)
os Oracle Linux Server 8.9
system x86_64, linux-gnu
ui X11
language (EN)
collate en_US.UTF-8
ctype en_US.UTF-8
tz America/New_York
date 2024-05-14
pandoc 2.0.6 @ /usr/bin/pandoc
─ Packages ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
package * version date (UTC) lib source
ASCAT * 3.0.0 2024-05-06 [2] Github (ShixiangWang/ascat@51fd695)
Biobase * 2.62.0 2023-10-24 [2] Bioconductor
BiocGenerics * 0.48.1 2023-11-01 [2] Bioconductor
bitops 1.0-7 2021-04-24 [2] CRAN (R 4.3.3)
cachem 1.0.8 2023-05-01 [2] CRAN (R 4.3.3)
cli 3.6.2 2023-12-11 [2] CRAN (R 4.3.3)
cluster 2.1.6 2023-12-01 [3] CRAN (R 4.3.3)
codetools 0.2-19 2023-02-01 [3] CRAN (R 4.3.3)
colorspace 2.1-0 2023-01-23 [2] CRAN (R 4.3.3)
crayon 1.5.2 2022-09-29 [2] CRAN (R 4.3.3)
data.table 1.15.4 2024-03-30 [2] CRAN (R 4.3.3)
devtools 2.4.5 2022-10-11 [2] CRAN (R 4.3.3)
digest 0.6.35 2024-03-11 [2] CRAN (R 4.3.3)
doParallel * 1.0.17 2022-02-07 [2] CRAN (R 4.3.3)
dplyr 1.1.4 2023-11-17 [2] CRAN (R 4.3.3)
ellipsis 0.3.2 2021-04-29 [2] CRAN (R 4.3.3)
fansi 1.0.6 2023-12-08 [2] CRAN (R 4.3.3)
fastmap 1.1.1 2023-02-24 [2] CRAN (R 4.3.3)
foreach * 1.5.2 2022-02-02 [2] CRAN (R 4.3.3)
fs 1.6.4 2024-04-25 [2] CRAN (R 4.3.3)
furrr 0.3.1 2022-08-15 [2] CRAN (R 4.3.3)
future 1.33.2 2024-03-26 [2] CRAN (R 4.3.3)
gcap * 1.1.4 2024-05-01 [2] Github (ShixiangWang/gcap@cdfb1c7)
generics 0.1.3 2022-07-05 [2] CRAN (R 4.3.3)
GenomeInfoDb * 1.38.8 2024-03-15 [2] Bioconductor 3.18 (R 4.3.3)
GenomeInfoDbData 1.2.11 2024-04-30 [2] Bioconductor
GenomicRanges * 1.54.1 2023-10-29 [2] Bioconductor
GetoptLong 1.0.5 2020-12-15 [2] CRAN (R 4.3.3)
ggplot2 3.5.1 2024-04-23 [2] CRAN (R 4.3.3)
GlobalOptions 0.1.2 2020-06-10 [2] CRAN (R 4.3.3)
globals 0.16.3 2024-03-08 [2] CRAN (R 4.3.3)
glue 1.7.0 2024-01-09 [2] CRAN (R 4.3.3)
gridBase 0.4-7 2014-02-24 [2] CRAN (R 4.3.3)
gtable 0.3.5 2024-04-22 [2] CRAN (R 4.3.3)
hms 1.1.3 2023-03-21 [2] CRAN (R 4.3.3)
htmltools 0.5.8.1 2024-04-04 [2] CRAN (R 4.3.3)
htmlwidgets 1.6.4 2023-12-06 [2] CRAN (R 4.3.3)
httpuv 1.6.15 2024-03-26 [2] CRAN (R 4.3.3)
IRanges * 2.36.0 2023-10-24 [2] Bioconductor
iterators * 1.0.14 2022-02-05 [2] CRAN (R 4.3.3)
jsonlite 1.8.8 2023-12-04 [2] CRAN (R 4.3.3)
later 1.3.2 2023-12-06 [2] CRAN (R 4.3.3)
lattice 0.22-5 2023-10-24 [3] CRAN (R 4.3.3)
lgr 0.4.4 2022-09-05 [2] CRAN (R 4.3.3)
lifecycle 1.0.4 2023-11-07 [2] CRAN (R 4.3.3)
listenv 0.9.1 2024-01-29 [2] CRAN (R 4.3.3)
magrittr 2.0.3 2022-03-30 [2] CRAN (R 4.3.3)
Matrix 1.6-5 2024-01-11 [3] CRAN (R 4.3.3)
memoise 2.0.1 2021-11-26 [2] CRAN (R 4.3.3)
mime 0.12 2021-09-28 [2] CRAN (R 4.3.3)
miniUI 0.1.1.1 2018-05-18 [2] CRAN (R 4.3.3)
munsell 0.5.1 2024-04-01 [2] CRAN (R 4.3.3)
NMF 0.27 2024-02-08 [2] CRAN (R 4.3.3)
parallelly 1.37.1 2024-02-29 [2] CRAN (R 4.3.3)
pillar 1.9.0 2023-03-22 [2] CRAN (R 4.3.3)
pkgbuild 1.4.4 2024-03-17 [2] CRAN (R 4.3.3)
pkgconfig 2.0.3 2019-09-22 [2] CRAN (R 4.3.3)
pkgload 1.3.4 2024-01-16 [2] CRAN (R 4.3.3)
plyr 1.8.9 2023-10-02 [2] CRAN (R 4.3.3)
profvis 0.3.8 2023-05-02 [2] CRAN (R 4.3.3)
promises 1.3.0 2024-04-05 [2] CRAN (R 4.3.3)
purrr 1.0.2 2023-08-10 [2] CRAN (R 4.3.3)
quadprog 1.5-8 2019-11-20 [2] CRAN (R 4.3.3)
R6 2.5.1 2021-08-19 [2] CRAN (R 4.3.3)
rappdirs 0.3.3 2021-01-31 [2] CRAN (R 4.3.3)
RColorBrewer * 1.1-3 2022-04-03 [2] CRAN (R 4.3.3)
Rcpp 1.0.12 2024-01-09 [2] CRAN (R 4.3.3)
RCurl 1.98-1.14 2024-01-09 [2] CRAN (R 4.3.3)
readr * 2.1.5 2024-01-10 [2] CRAN (R 4.3.3)
registry 0.5-1 2019-03-05 [2] CRAN (R 4.3.3)
remotes 2.5.0 2024-03-17 [2] CRAN (R 4.3.3)
reshape2 1.4.4 2020-04-09 [2] CRAN (R 4.3.3)
rjson 0.2.21 2022-01-09 [2] CRAN (R 4.3.3)
rlang 1.1.3 2024-01-10 [2] CRAN (R 4.3.3)
rngtools 1.5.2 2021-09-20 [2] CRAN (R 4.3.3)
S4Vectors * 0.40.2 2023-11-23 [2] Bioconductor 3.18 (R 4.3.3)
scales 1.3.0 2023-11-28 [2] CRAN (R 4.3.3)
sessioninfo 1.2.2 2021-12-06 [2] CRAN (R 4.3.3)
shiny 1.8.1.1 2024-04-02 [2] CRAN (R 4.3.3)
sigminer * 2.3.0 2023-12-12 [2] CRAN (R 4.3.3)
stringi 1.8.3 2023-12-11 [2] CRAN (R 4.3.3)
stringr 1.5.1 2023-11-14 [2] CRAN (R 4.3.3)
tibble 3.2.1 2023-03-20 [2] CRAN (R 4.3.3)
tidyselect 1.2.1 2024-03-11 [2] CRAN (R 4.3.3)
tzdb 0.4.0 2023-05-12 [2] CRAN (R 4.3.3)
urlchecker 1.0.1 2021-11-30 [2] CRAN (R 4.3.3)
usethis 2.2.3 2024-02-19 [2] CRAN (R 4.3.3)
utf8 1.2.4 2023-10-22 [2] CRAN (R 4.3.3)
uuid 1.2-0 2024-01-14 [2] CRAN (R 4.3.3)
vctrs 0.6.5 2023-12-01 [2] CRAN (R 4.3.3)
xgboost 1.5.2.1 2022-02-21 [2] CRAN (R 4.3.3)
xtable 1.8-4 2019-04-21 [2] CRAN (R 4.3.3)
XVector 0.42.0 2023-10-24 [2] Bioconductor
zlibbioc 1.48.2 2024-03-13 [2] Bioconductor 3.18 (R 4.3.3)
[1] /mnt/nasapps/production/R/4.3.2
[2] /home/changtn/R/x86_64-redhat-linux-gnu-library/4.3
[3] /usr/lib64/R/library
[4] /usr/share/R/library
Loading required package: ASCAT
Loading required package: RColorBrewer
Loading required package: splines
Loading required package: readr
Loading required package: GenomicRanges
Loading required package: stats4
Loading required package: BiocGenerics
Attaching package: ‘BiocGenerics’
The following objects are masked from ‘package:stats’:
IQR, mad, sd, var, xtabs
The following objects are masked from ‘package:base’:
anyDuplicated, aperm, append, as.data.frame, basename, cbind,
colnames, dirname, do.call, duplicated, eval, evalq, Filter, Find,
get, grep, grepl, intersect, is.unsorted, lapply, Map, mapply,
match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
Position, rank, rbind, Reduce, rownames, sapply, setdiff, sort,
table, tapply, union, unique, unsplit, which.max, which.min
Loading required package: S4Vectors
Attaching package: ‘S4Vectors’
The following object is masked from ‘package:utils’:
findMatches
The following objects are masked from ‘package:base’:
expand.grid, I, unname
Loading required package: IRanges
Loading required package: GenomeInfoDb
Loading required package: parallel
Loading required package: doParallel
Loading required package: foreach
Loading required package: iterators
Loading required package: sigminer
sigminer version 2.3.0
- Star me at https://github.com/ShixiangWang/sigminer
- Run hello() to see usage and citation.
gcap version 1.1.4
- Project URL at https://github.com/ShixiangWang/gcap
Citation:
Wang, S., Wu, CY., He, MM. et al. Machine learning-based extrachromosomal DNA identification in
large-scale cohorts reveals its clinical implications in cancer. Nat Commun 15, 1515 (2024). https://doi.org/10.1038/s41467-024-45479-6
<gcap> 2024-05-14 14:23:33.91281 info [gcap.workflow]: =====================
<gcap> 2024-05-14 14:23:33.994768 info [gcap.workflow]: GCAP WORKFLOW
<gcap> 2024-05-14 14:23:34.010181 info [gcap.workflow]: =====================
<gcap> 2024-05-14 14:23:34.013962 info [gcap.workflow]:
<gcap> 2024-05-14 14:23:34.017747 info [gcap.workflow]: =====================
<gcap> 2024-05-14 14:23:34.021299 info [gcap.workflow]: Step 1: Run ASCAT 3.0
<gcap> 2024-05-14 14:23:34.025109 info [gcap.workflow]: =====================
<gcap> 2024-05-14 14:23:34.053504 info [gcap.runASCAT]: > Run ASCAT on WES data <
<gcap> 2024-05-14 14:23:34.05755 info [gcap.runASCAT]:
<gcap> 2024-05-14 14:23:34.061144 info [gcap.runASCAT]: Configs:
<gcap> 2024-05-14 14:23:34.064778 info [gcap.runASCAT]: result path set to /mnt/MoCha-NGS/active/MoCha-NGS_BW_transfers/ecDNA/output
<gcap> 2024-05-14 14:23:34.068447 info [gcap.runASCAT]: allelecounter_exe set to ~/miniconda3/envs/cancerit/bin/alleleCounter
<gcap> 2024-05-14 14:23:34.072628 info [gcap.runASCAT]: g1000allelesprefix set to /mnt/MoCha-NGS/active/MoCha-NGS_BW_transfers/ecDNA/ref/1000G_loci_hg38//1kg.phase3.v5a_GRCh38nounref_allele_index_chr
<gcap> 2024-05-14 14:23:34.076255 info [gcap.runASCAT]: g1000lociprefix set to /mnt/MoCha-NGS/active/MoCha-NGS_BW_transfers/ecDNA/ref/1000G_loci_hg38//1kg.phase3.v5a_GRCh38nounref_loci_chrstring_chr
<gcap> 2024-05-14 14:23:34.079994 info [gcap.runASCAT]: GCcontentfile set to /mnt/MoCha-NGS/active/MoCha-NGS_BW_transfers/ecDNA/ref/GC_correction_hg38.txt
<gcap> 2024-05-14 14:23:34.083808 info [gcap.runASCAT]: replictimingfile set to /mnt/MoCha-NGS/active/MoCha-NGS_BW_transfers/ecDNA/ref/RT_correction_hg38.txt
<gcap> 2024-05-14 14:23:34.087556 info [gcap.runASCAT]: nthreads set to 22
<gcap> 2024-05-14 14:23:34.09125 info [gcap.runASCAT]: minCounts set to 10
<gcap> 2024-05-14 14:23:34.095211 info [gcap.runASCAT]: BED_file set to NA
<gcap> 2024-05-14 14:23:34.099086 info [gcap.runASCAT]: probloci_file set to NA
<gcap> 2024-05-14 14:23:34.103024 info [gcap.runASCAT]: chrom_names set to <1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22>
<gcap> 2024-05-14 14:23:34.106693 info [gcap.runASCAT]: gender set to <XX>
<gcap> 2024-05-14 14:23:34.110381 info [gcap.runASCAT]: min_base_qual set to 20
<gcap> 2024-05-14 14:23:34.114615 info [gcap.runASCAT]: min_map_qual set to 35
<gcap> 2024-05-14 14:23:34.11824 info [gcap.runASCAT]: penalty set to 70
<gcap> 2024-05-14 14:23:34.122078 info [gcap.runASCAT]: skip_finished_ASCAT set to TRUE
<gcap> 2024-05-14 14:23:34.130545 info [gcap.runASCAT]: 1 jobs detected
<gcap> 2024-05-14 14:23:34.134146 info [gcap.runASCAT]: No ASCAT job to skip.
<gcap> 2024-05-14 14:23:34.137871 info [FUN]: start submitting job S116655
<gcap> 2024-05-14 14:23:34.142185 info [FUN]: tumor data file: /mnt/MoCha-NGS/active/MoCha-NGS_BW_transfers/ecDNA/116655-072-R-AK7A15E12-WES.bwa.final.bam
<gcap> 2024-05-14 14:23:34.146052 info [FUN]: normal data file: /mnt/MoCha-NGS/active/MoCha-NGS_BW_transfers/ecDNA/116655_germline-WES.bwa.final.bam
<gcap> 2024-05-14 14:23:34.149747 info [FUN]: tumor sample name: Test_T
<gcap> 2024-05-14 14:23:34.153461 info [FUN]: normal sample name: Test_N
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Reading locis
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
Done reading locis
Multi pos start:
[1] Reading Tumor LogR data...
[1] Reading Tumor BAF data...
[1] Reading Germline LogR data...
[1] Reading Germline BAF data...
[1] Registering SNP locations...
[1] Splitting genome in distinct chunks...
[1] Sample Test_T (1/1)
GC correlation: 25bp 0.049 ; 50bp 0.057 ; 100bp 0.064 ; 200bp 0.073 ; 500bp 0.087 ; 1kb 0.099 ; 2kb 0.109 ; 5kb 0.119 ; 10kb 0.125 ; 20kb 0.129 ; 50kb 0.135 ; 100kb 0.140 ; 200kb 0.144 ; 500kb 0.148 ; 1Mb 0.148 ; 2Mb 0.139 ; 5Mb 0.102 ; 10Mb 0.056 ;
Short window size: 1kb
Long window size: 100kb
Replication timing correlation: Bg02es 0.11 ; Bj 0.12 ; Gm06990 0.13 ; Gm12801 0.14 ; Gm12812 0.13 ; Gm12813 0.13 ; Gm12878 0.13 ; Helas3 0.12 ; Hepg2 0.14 ; Huvec 0.12 ; Imr90 0.12 ; K562 0.14 ; Mcf7 0.13 ; Nhek 0.13 ; Sknsh 0.14 ;
Replication dataset: Hepg2
[1] Plotting tumor data
[1] Plotting germline data
[1] Sample Test_T (1/1)
<gcap> 2024-05-14 14:59:49.008591 fatal [value[[3L]]]: job S116655 failed in ASCAT due to following error
<gcap> 2024-05-14 14:59:49.022772 info [value[[3L]]]: 'length = 30' in coercion to 'logical(1)'
<gcap> 2024-05-14 14:59:49.027022 info [value[[3L]]]: =====
<gcap> 2024-05-14 14:59:49.031 info [value[[3L]]]: Please check your input bam files (if missing bam index? if its alignment quality is lower?)
<gcap> 2024-05-14 14:59:49.034931 info [value[[3L]]]: =====
<gcap> 2024-05-14 14:59:49.039372 info [gcap.runASCAT]: ASCAT analysis done, check /mnt/MoCha-NGS/active/MoCha-NGS_BW_transfers/ecDNA/output for results
<gcap> 2024-05-14 14:59:49.045842 info [gcap.workflow]: checking ASCAT result files
<gcap> 2024-05-14 14:59:49.049786 warn [FUN]: result file /mnt/MoCha-NGS/active/MoCha-NGS_BW_transfers/ecDNA/output/S116655.ASCAT.rds does not exist, the corresponding ASCAT calling has error occurred
<gcap> 2024-05-14 14:59:49.054895 warn [FUN]: /mnt/MoCha-NGS/active/MoCha-NGS_BW_transfers/ecDNA/output/S116655.ASCAT.rds contains a failed ASCAT job, will discard it before next step
<gcap> 2024-05-14 14:59:49.058794 fatal [gcap.workflow]: no sucessful ASCAT result file to proceed!
<gcap> 2024-05-14 14:59:49.062439 fatal [gcap.workflow]: check your ASCAT setting before make sure this case could not be used!
Error in gcap.workflow(tumourseqfile = "/mnt/MoCha-NGS/active/MoCha-NGS_BW_transfers/ecDNA/116655-072-R-AK7A15E12-WES.bwa.final.bam", :
Execution halted
@tingchiafelix You may need to run GCAP on R < 4.3. The fixed ASCAT version is incompatible with newer R releases.
I will take some time to update the ASCAT code to make it compatible.
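For context on the incompatibility: the failing log reports 'length = 30' in coercion to 'logical(1)', which is the error R 4.3.0 and later raise when `&&` is given a condition longer than one element (R 4.2 only warns, in the style of the `!is.na(dif) && sum(dif > 0.3) > 5` warnings earlier in this thread). Below is a minimal sketch of one scalar-safe way to write such a check. The function name `scalar_check` is hypothetical, and treating "any NA" as a failed check is an assumption about the intent, not ASCAT's actual fix.

```r
# Hypothetical helper (not part of ASCAT): a scalar-safe variant of
#   !is.na(dif) && sum(dif > 0.3) > 5
# `!is.na(dif)` is a vector when length(dif) > 1, which `&&` rejects on R >= 4.3.
# Assumption: the check should fail when dif contains any NA.
scalar_check <- function(dif) {
  # anyNA() always returns a single TRUE/FALSE, so `&&` gets a scalar
  !anyNA(dif) && sum(dif > 0.3) > 5
}

# Demonstrate how the original form behaves on the running R version:
# it returns a value (old R), warns (R 4.2), or errors (R >= 4.3).
dif <- c(0.1, 0.5, 0.4, 0.6, 0.7, 0.8, 0.9)
original_form <- tryCatch(
  {
    !is.na(dif) && sum(dif > 0.3) > 5
    "ran silently"
  },
  warning = function(w) "warning",
  error = function(e) "error"
)
print(original_form)

# The scalar-safe form gives a single logical on any R version
print(scalar_check(dif))
```

The same pattern (collapsing the vector with `anyNA()`, `all()`, or `any()` before `&&`) would apply to the `!is.null(homsegs) && !is.na(homsegs)` check, again depending on the intended meaning.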
@tingchiafelix I have also updated gcap to be compatible with the latest version of ASCAT (it can be run on R 4.3). If you are interested, please follow the instructions at https://github.com/ShixiangWang/gcap?tab=readme-ov-file#install-ascat-required.
I have tested the new version with the data you provided.
gcap.workflow(
  tumourseqfile = "~/gcap_debug/116655-072-R-AK7A15E12-WES.bwa.final.bam",
  normalseqfile = "~/gcap_debug/116655_germline-WES.bwa.final.bam",
  tumourname = "Test_T",
  normalname = "Test_N",
  jobname = "S116655",
  outdir = "~/gcap_debug/gcap_result_ascat",
  allelecounter_exe = "~/miniconda3/envs/cancerit/bin/alleleCounter",
  g1000allelesprefix = file.path(
    "~/share/gcap_reference/1000G_loci_hg38/",
    "1kg.phase3.v5a_GRCh38nounref_allele_index_chr"
  ),
  g1000lociprefix = file.path(
    "~/share/gcap_reference/1000G_loci_hg38/",
    "1kg.phase3.v5a_GRCh38nounref_loci_chrstring_chr"
  ),
  GCcontentfile = "~/share/gcap_reference/GC_G1000_hg38.txt",
  replictimingfile = "~/share/gcap_reference/RT_G1000_hg38.txt",
  skip_finished_ASCAT = TRUE,
  skip_ascat_call = FALSE,
  result_file_prefix = "S116655",
  genome_build = "hg38",
  model = "XGB11"
)
Correction files must be updated.
I am closing this issue now. Please file another issue if you have further questions.
@ShixiangWang I appreciate your help! The process completed successfully following your workflow and suggestions (hg38 workflow). However, my BAM file was generated with the hg19 reference genome. I therefore tried the same workflow with the hg19 reference files that you provided, but the process failed with an error (length(ovl) > nrow(ASCATobj$Tumor_LogR)/10 is not TRUE). I have installed R version 4.2.2.
Could you please have a look?
library(gcap)
# hg19 ----------------
gcap.workflow(
  tumourseqfile = "/mnt/MoCha-NGS/active/MoCha-NGS_BW_transfers/ecDNA/116655-072-R-AK7A15E12-WES.bwa.final.bam",
  normalseqfile = "/mnt/MoCha-NGS/active/MoCha-NGS_BW_transfers/ecDNA/116655_germline-WES.bwa.final.bam",
  tumourname = "Test_T",
  normalname = "Test_N",
  jobname = "S116655",
  outdir = "/mnt/MoCha-NGS/active/MoCha-NGS_BW_transfers/ecDNA/output",
  allelecounter_exe = "~/miniconda3/envs/ecDNA/bin/alleleCounter",
  g1000allelesprefix = file.path(
    "/mnt/MoCha-NGS/active/MoCha-NGS_BW_transfers/ecDNA/ref/1000G_loci_hg19/",
    "1000genomesAlleles2012_chr"
  ),
  g1000lociprefix = file.path(
    "/mnt/MoCha-NGS/active/MoCha-NGS_BW_transfers/ecDNA/ref/1000G_loci_hg19/",
    "1000genomesloci2012chrstring_chr"
  ),
  GCcontentfile = "/mnt/MoCha-NGS/active/MoCha-NGS_BW_transfers/ecDNA/ref/GC_correction_updated_hg19.txt",
  replictimingfile = "/mnt/MoCha-NGS/active/MoCha-NGS_BW_transfers/ecDNA/ref/RT_correction_updated_hg19.txt",
  skip_finished_ASCAT = TRUE,
  skip_ascat_call = FALSE,
  result_file_prefix = "S116655",
  genome_build = "hg19",
  model = "XGB11"
)
Loading required package: ASCAT
Loading required package: RColorBrewer
Loading required package: splines
Loading required package: readr
Loading required package: GenomicRanges
Loading required package: stats4
Loading required package: BiocGenerics
Attaching package: ‘BiocGenerics’
The following objects are masked from ‘package:stats’:
IQR, mad, sd, var, xtabs
The following objects are masked from ‘package:base’:
anyDuplicated, aperm, append, as.data.frame, basename, cbind,
colnames, dirname, do.call, duplicated, eval, evalq, Filter, Find,
get, grep, grepl, intersect, is.unsorted, lapply, Map, mapply,
match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
Position, rank, rbind, Reduce, rownames, sapply, setdiff, sort,
table, tapply, union, unique, unsplit, which.max, which.min
Loading required package: S4Vectors
Attaching package: ‘S4Vectors’
The following objects are masked from ‘package:base’:
expand.grid, I, unname
Loading required package: IRanges
Loading required package: GenomeInfoDb
Loading required package: parallel
Loading required package: doParallel
Loading required package: foreach
Loading required package: iterators
Loading required package: sigminer
Registered S3 method overwritten by 'sigminer':
method from
print.bytes Rcpp
sigminer version 2.3.1
- Star me at https://github.com/ShixiangWang/sigminer
- Run hello() to see usage and citation.
gcap version 1.2.0
- Project URL at https://github.com/ShixiangWang/gcap
Citation:
Wang, S., Wu, CY., He, MM. et al. Machine learning-based extrachromosomal DNA identification in
large-scale cohorts reveals its clinical implications in cancer. Nat Commun 15, 1515 (2024). https://doi.org/10.1038/s41467-024-45479-6
<gcap> 2024-05-16 12:37:34 info [gcap.workflow]: =====================
<gcap> 2024-05-16 12:37:34 info [gcap.workflow]: GCAP WORKFLOW
<gcap> 2024-05-16 12:37:34 info [gcap.workflow]: =====================
<gcap> 2024-05-16 12:37:34 info [gcap.workflow]:
<gcap> 2024-05-16 12:37:34 info [gcap.workflow]: =====================
<gcap> 2024-05-16 12:37:34 info [gcap.workflow]: Step 1: Run ASCAT 3.0
<gcap> 2024-05-16 12:37:34 info [gcap.workflow]: =====================
<gcap> 2024-05-16 12:37:34 info [gcap.runASCAT]: > Run ASCAT on WES data <
<gcap> 2024-05-16 12:37:34 info [gcap.runASCAT]:
<gcap> 2024-05-16 12:37:34 info [gcap.runASCAT]: Configs:
<gcap> 2024-05-16 12:37:34 info [gcap.runASCAT]: result path set to /mnt/MoCha-NGS/active/MoCha-NGS_BW_transfers/ecDNA/output
<gcap> 2024-05-16 12:37:34 info [gcap.runASCAT]: allelecounter_exe set to ~/miniconda3/envs/ecDNA/bin/alleleCounter
<gcap> 2024-05-16 12:37:34 info [gcap.runASCAT]: g1000allelesprefix set to /mnt/MoCha-NGS/active/MoCha-NGS_BW_transfers/ecDNA/ref/1000G_loci_hg19//1000genomesAlleles2012_chr
<gcap> 2024-05-16 12:37:34 info [gcap.runASCAT]: g1000lociprefix set to /mnt/MoCha-NGS/active/MoCha-NGS_BW_transfers/ecDNA/ref/1000G_loci_hg19//1000genomesloci2012chrstring_chr
<gcap> 2024-05-16 12:37:34 info [gcap.runASCAT]: GCcontentfile set to /mnt/MoCha-NGS/active/MoCha-NGS_BW_transfers/ecDNA/ref/GC_correction_updated_hg19.txt
<gcap> 2024-05-16 12:37:34 info [gcap.runASCAT]: replictimingfile set to /mnt/MoCha-NGS/active/MoCha-NGS_BW_transfers/ecDNA/ref/RT_correction_updated_hg19.txt
<gcap> 2024-05-16 12:37:34 info [gcap.runASCAT]: nthreads set to 22
<gcap> 2024-05-16 12:37:34 info [gcap.runASCAT]: minCounts set to 10
<gcap> 2024-05-16 12:37:34 info [gcap.runASCAT]: BED_file set to NA
<gcap> 2024-05-16 12:37:34 info [gcap.runASCAT]: probloci_file set to NA
<gcap> 2024-05-16 12:37:34 info [gcap.runASCAT]: chrom_names set to <1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22>
<gcap> 2024-05-16 12:37:34 info [gcap.runASCAT]: gender set to <XX>
<gcap> 2024-05-16 12:37:34 info [gcap.runASCAT]: min_base_qual set to 20
<gcap> 2024-05-16 12:37:34 info [gcap.runASCAT]: min_map_qual set to 35
<gcap> 2024-05-16 12:37:34 info [gcap.runASCAT]: penalty set to 70
<gcap> 2024-05-16 12:37:34 info [gcap.runASCAT]: skip_finished_ASCAT set to TRUE
<gcap> 2024-05-16 12:37:34 info [gcap.runASCAT]: 1 jobs detected
<gcap> 2024-05-16 12:37:34 info [gcap.runASCAT]: No ASCAT job to skip.
<gcap> 2024-05-16 12:37:34 info [FUN]: start submitting job S116655
<gcap> 2024-05-16 12:37:34 info [FUN]: tumor data file: /mnt/MoCha-NGS/active/MoCha-NGS_BW_transfers/ecDNA/116655-072-R-AK7A15E12-WES.bwa.final.bam
<gcap> 2024-05-16 12:37:34 info [FUN]: normal data file: /mnt/MoCha-NGS/active/MoCha-NGS_BW_transfers/ecDNA/116655_germline-WES.bwa.final.bam
<gcap> 2024-05-16 12:37:34 info [FUN]: tumor sample name: Test_T
<gcap> 2024-05-16 12:37:34 info [FUN]: normal sample name: Test_N
[1] Reading Tumor LogR data...
[1] Reading Tumor BAF data...
[1] Reading Germline LogR data...
[1] Reading Germline BAF data...
[1] Registering SNP locations...
[1] Splitting genome in distinct chunks...
<gcap> 2024-05-16 15:29:48 fatal [value[[3L]]]: job S116655 failed in ASCAT due to following error
<gcap> 2024-05-16 15:29:48 info [value[[3L]]]: length(ovl) > nrow(ASCATobj$Tumor_LogR)/10 is not TRUE
<gcap> 2024-05-16 15:29:48 info [value[[3L]]]: =====
<gcap> 2024-05-16 15:29:48 info [value[[3L]]]: Please check your input bam files (if missing bam index? if its alignment quality is lower?)
<gcap> 2024-05-16 15:29:48 info [value[[3L]]]: =====
<gcap> 2024-05-16 15:29:48 info [gcap.runASCAT]: ASCAT analysis done, check /mnt/MoCha-NGS/active/MoCha-NGS_BW_transfers/ecDNA/output for results
<gcap> 2024-05-16 15:29:48 info [gcap.workflow]: checking ASCAT result files
<gcap> 2024-05-16 15:29:48 warn [FUN]: result file /mnt/MoCha-NGS/active/MoCha-NGS_BW_transfers/ecDNA/output/S116655.ASCAT.rds does not exist, the corresponding ASCAT calling has error occurred
<gcap> 2024-05-16 15:29:48 warn [FUN]: /mnt/MoCha-NGS/active/MoCha-NGS_BW_transfers/ecDNA/output/S116655.ASCAT.rds contains a failed ASCAT job, will discard it before next step
<gcap> 2024-05-16 15:29:48 fatal [gcap.workflow]: no sucessful ASCAT result file to proceed!
<gcap> 2024-05-16 15:29:48 fatal [gcap.workflow]: check your ASCAT setting before make sure this case could not be used!
Error in gcap.workflow(tumourseqfile = "/mnt/MoCha-NGS/active/MoCha-NGS_BW_transfers/ecDNA/116655-072-R-AK7A15E12-WES.bwa.final.bam", :
In addition: Warning message:
One or more parsing issues, call `problems()` on your data frame for details,
e.g.:
dat <- vroom(...)
problems(dat)
Execution halted
> devtools::session_info()
─ Session info ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
setting value
version R version 4.2.2 (2022-10-31)
os Oracle Linux Server 8.9
system x86_64, linux-gnu
ui X11
language (EN)
collate en_US.UTF-8
ctype en_US.UTF-8
tz America/New_York
date 2024-05-16
pandoc 2.0.6 @ /usr/bin/pandoc
─ Packages ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
package * version date (UTC) lib source
ASCAT * 3.0.0 2024-05-15 [2] Github (ShixiangWang/ascat@51fd695)
Biobase * 2.58.0 2022-11-01 [2] Bioconductor
BiocGenerics * 0.44.0 2022-11-01 [2] Bioconductor
bitops 1.0-7 2021-04-24 [2] CRAN (R 4.2.2)
cachem 1.0.8 2023-05-01 [2] CRAN (R 4.2.2)
cli 3.6.2 2023-12-11 [2] CRAN (R 4.2.2)
cluster 2.1.6 2023-12-01 [2] CRAN (R 4.2.2)
codetools 0.2-20 2024-03-31 [2] CRAN (R 4.2.2)
colorspace 2.1-0 2023-01-23 [2] CRAN (R 4.2.2)
crayon 1.5.2 2022-09-29 [2] CRAN (R 4.2.2)
data.table 1.15.4 2024-03-30 [2] CRAN (R 4.2.2)
devtools 2.4.5 2022-10-11 [2] CRAN (R 4.2.2)
digest 0.6.35 2024-03-11 [2] CRAN (R 4.2.2)
doParallel * 1.0.17 2022-02-07 [2] CRAN (R 4.2.2)
dplyr 1.1.4 2023-11-17 [2] CRAN (R 4.2.2)
ellipsis 0.3.2 2021-04-29 [2] CRAN (R 4.2.2)
fansi 1.0.6 2023-12-08 [2] CRAN (R 4.2.2)
fastmap 1.2.0 2024-05-15 [2] CRAN (R 4.2.2)
foreach * 1.5.2 2022-02-02 [2] CRAN (R 4.2.2)
fs 1.6.4 2024-04-25 [2] CRAN (R 4.2.2)
furrr 0.3.1 2022-08-15 [2] CRAN (R 4.2.2)
future 1.33.2 2024-03-26 [2] CRAN (R 4.2.2)
gcap * 1.2.0 2024-05-16 [2] Github (ShixiangWang/gcap@958a135)
generics 0.1.3 2022-07-05 [2] CRAN (R 4.2.2)
GenomeInfoDb * 1.34.9 2023-02-02 [2] Bioconductor
GenomeInfoDbData 1.2.9 2024-05-15 [2] Bioconductor
GenomicRanges * 1.50.2 2022-12-16 [2] Bioconductor
GetoptLong 1.0.5 2020-12-15 [2] CRAN (R 4.2.2)
ggplot2 3.5.1 2024-04-23 [2] CRAN (R 4.2.2)
GlobalOptions 0.1.2 2020-06-10 [2] CRAN (R 4.2.2)
globals 0.16.3 2024-03-08 [2] CRAN (R 4.2.2)
glue 1.7.0 2024-01-09 [2] CRAN (R 4.2.2)
gridBase 0.4-7 2014-02-24 [2] CRAN (R 4.2.2)
gtable 0.3.5 2024-04-22 [2] CRAN (R 4.2.2)
hms 1.1.3 2023-03-21 [2] CRAN (R 4.2.2)
htmltools 0.5.8.1 2024-04-04 [2] CRAN (R 4.2.2)
htmlwidgets 1.6.4 2023-12-06 [2] CRAN (R 4.2.2)
httpuv 1.6.15 2024-03-26 [2] CRAN (R 4.2.2)
IRanges * 2.32.0 2022-11-01 [2] Bioconductor
iterators * 1.0.14 2022-02-05 [2] CRAN (R 4.2.2)
jsonlite 1.8.8 2023-12-04 [2] CRAN (R 4.2.2)
later 1.3.2 2023-12-06 [2] CRAN (R 4.2.2)
lattice 0.22-6 2024-03-20 [2] CRAN (R 4.2.2)
lgr 0.4.4 2022-09-05 [2] CRAN (R 4.2.2)
lifecycle 1.0.4 2023-11-07 [2] CRAN (R 4.2.2)
listenv 0.9.1 2024-01-29 [2] CRAN (R 4.2.2)
magrittr 2.0.3 2022-03-30 [2] CRAN (R 4.2.2)
Matrix 1.6-5 2024-01-11 [3] CRAN (R 4.2.3)
memoise 2.0.1 2021-11-26 [2] CRAN (R 4.2.2)
mime 0.12 2021-09-28 [2] CRAN (R 4.2.2)
miniUI 0.1.1.1 2018-05-18 [2] CRAN (R 4.2.2)
munsell 0.5.1 2024-04-01 [2] CRAN (R 4.2.2)
NMF 0.27 2024-02-08 [2] CRAN (R 4.2.2)
parallelly 1.37.1 2024-02-29 [2] CRAN (R 4.2.2)
pillar 1.9.0 2023-03-22 [2] CRAN (R 4.2.2)
pkgbuild 1.4.4 2024-03-17 [2] CRAN (R 4.2.2)
pkgconfig 2.0.3 2019-09-22 [2] CRAN (R 4.2.2)
pkgload 1.3.4 2024-01-16 [2] CRAN (R 4.2.2)
plyr 1.8.9 2023-10-02 [2] CRAN (R 4.2.2)
profvis 0.3.8 2023-05-02 [2] CRAN (R 4.2.2)
promises 1.3.0 2024-04-05 [2] CRAN (R 4.2.2)
purrr 1.0.2 2023-08-10 [2] CRAN (R 4.2.2)
quadprog 1.5-8 2019-11-20 [2] CRAN (R 4.2.2)
R6 2.5.1 2021-08-19 [2] CRAN (R 4.2.2)
rappdirs 0.3.3 2021-01-31 [2] CRAN (R 4.2.2)
RColorBrewer * 1.1-3 2022-04-03 [2] CRAN (R 4.2.2)
Rcpp 1.0.12 2024-01-09 [2] CRAN (R 4.2.2)
RCurl 1.98-1.14 2024-01-09 [2] CRAN (R 4.2.2)
readr * 2.1.5 2024-01-10 [2] CRAN (R 4.2.2)
registry 0.5-1 2019-03-05 [2] CRAN (R 4.2.2)
remotes 2.5.0 2024-03-17 [2] CRAN (R 4.2.2)
reshape2 1.4.4 2020-04-09 [2] CRAN (R 4.2.2)
rjson 0.2.21 2022-01-09 [2] CRAN (R 4.2.2)
rlang 1.1.3 2024-01-10 [2] CRAN (R 4.2.2)
rngtools 1.5.2 2021-09-20 [2] CRAN (R 4.2.2)
S4Vectors * 0.36.2 2023-02-26 [2] Bioconductor
scales 1.3.0 2023-11-28 [2] CRAN (R 4.2.2)
sessioninfo 1.2.2 2021-12-06 [2] CRAN (R 4.2.2)
shiny 1.8.1.1 2024-04-02 [2] CRAN (R 4.2.2)
sigminer * 2.3.1 2024-05-11 [2] CRAN (R 4.2.2)
stringi 1.8.4 2024-05-06 [2] CRAN (R 4.2.2)
stringr 1.5.1 2023-11-14 [2] CRAN (R 4.2.2)
tibble 3.2.1 2023-03-20 [2] CRAN (R 4.2.2)
tidyselect 1.2.1 2024-03-11 [2] CRAN (R 4.2.2)
tzdb 0.4.0 2023-05-12 [2] CRAN (R 4.2.2)
urlchecker 1.0.1 2021-11-30 [2] CRAN (R 4.2.2)
usethis 2.2.3 2024-02-19 [2] CRAN (R 4.2.2)
utf8 1.2.4 2023-10-22 [2] CRAN (R 4.2.2)
uuid 1.2-0 2024-01-14 [2] CRAN (R 4.2.2)
vctrs 0.6.5 2023-12-01 [2] CRAN (R 4.2.2)
xgboost 1.7.7.1 2024-01-25 [2] CRAN (R 4.2.2)
xtable 1.8-4 2019-04-21 [2] CRAN (R 4.2.2)
XVector 0.38.0 2022-11-01 [2] Bioconductor
zlibbioc 1.44.0 2022-11-01 [2] Bioconductor
@tingchiafelix I can reproduce the error with the updated reference files. You used the wrong GC and RT reference files for hg19: the files marked "updated" were meant for the updated ASCAT version, not the fixed version.
You should use the following two files; I tested them and they still work with your current run environment.
Please note that I have removed the correction files marked 'update'. They no longer work with the latest version of ASCAT v3, which should use the files at https://github.com/VanLoo-lab/ascat/tree/master/ReferenceFiles/WES instead.
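To illustrate why mismatched correction files trip the assertion length(ovl) > nrow(ASCATobj$Tumor_LogR)/10, here is a rough sketch. It assumes that `ovl` is the set of SNP identifiers shared between the tumour LogR table and the correction file, which the variable name suggests but this thread does not confirm. The helper names and the toy identifiers are hypothetical.

```r
# Hypothetical helper: fraction of LogR SNP IDs also found in a correction table.
# Assumption: ASCAT's `ovl` is this kind of intersection of row names.
overlap_fraction <- function(logr_ids, correction_ids) {
  if (length(logr_ids) == 0) {
    return(NA_real_)
  }
  length(intersect(logr_ids, correction_ids)) / length(logr_ids)
}

# Mirrors the reported condition: more than one tenth of the SNPs must overlap
passes_overlap_check <- function(logr_ids, correction_ids) {
  isTRUE(overlap_fraction(logr_ids, correction_ids) > 0.1)
}

# Toy identifiers, invented for illustration only
logr_ids   <- c("1_1000", "1_2000", "1_3000", "2_500")
matching   <- c("1_1000", "1_2000", "1_3000", "2_500", "2_900")
mismatched <- c("chr1_1000", "chr1_2000")

print(passes_overlap_check(logr_ids, matching))
print(passes_overlap_check(logr_ids, mismatched))
```

With real data, the two ID vectors could be taken from the row names of the LogR table and of the GC or RT correction file, before starting a run that takes hours to fail.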
Hi Shixiang, I've been running GCAP on over a hundred samples and could successfully get results. However, a couple of samples could not get through the workflow, and it seems that no "ASCAT result file" was generated in step 1 (please see below). I'm including the output folder and BAM files in the attachment. I hope this helps with the investigation. I would appreciate any suggestions.
Best, TC
Working on 119177~322-R1~PXWJH4KA3~WES
Loading required package: ASCAT
Loading required package: RColorBrewer
Loading required package: splines
Loading required package: readr
Loading required package: GenomicRanges
Loading required package: stats4
Loading required package: BiocGenerics
Attaching package: ‘BiocGenerics’
The following objects are masked from ‘package:stats’:
IQR, mad, sd, var, xtabs
The following objects are masked from ‘package:base’:
anyDuplicated, aperm, append, as.data.frame, basename, cbind,
colnames, dirname, do.call, duplicated, eval, evalq, Filter, Find,
get, grep, grepl, intersect, is.unsorted, lapply, Map, mapply,
match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
Position, rank, rbind, Reduce, rownames, sapply, setdiff, sort,
table, tapply, union, unique, unsplit, which.max, which.min
Loading required package: S4Vectors
Attaching package: ‘S4Vectors’
The following objects are masked from ‘package:base’:
expand.grid, I, unname
Loading required package: IRanges
Loading required package: GenomeInfoDb
Loading required package: parallel
Loading required package: doParallel
Loading required package: foreach
Loading required package: iterators
Loading required package: sigminer
Registered S3 method overwritten by 'sigminer':
method from
print.bytes Rcpp
sigminer version 2.3.1
- Star me at https://github.com/ShixiangWang/sigminer
- Run hello() to see usage and citation.
gcap version 1.2.0
- Project URL at https://github.com/ShixiangWang/gcap
Citation:
Wang, S., Wu, CY., He, MM. et al. Machine learning-based extrachromosomal DNA identification in
large-scale cohorts reveals its clinical implications in cancer. Nat Commun 15, 1515 (2024). https://doi.org/10.1038/s41467-024-45479-6
[1] "119177~322-R1~PXWJH4KA3~WES" "119177~322-R1~J1-CAF~WES"
chr [1:2] "119177~322-R1~PXWJH4KA3~WES" "119177~322-R1~J1-CAF~WES"
NULL
[1] "119177~322-R1~PXWJH4KA3~WES"
[1] "119177~322-R1~J1-CAF~WES"
[1] "119177"
<gcap> 2024-07-09 08:18:19 info [gcap.workflow]: =====================
<gcap> 2024-07-09 08:18:19 info [gcap.workflow]: GCAP WORKFLOW
<gcap> 2024-07-09 08:18:19 info [gcap.workflow]: =====================
<gcap> 2024-07-09 08:18:19 info [gcap.workflow]:
<gcap> 2024-07-09 08:18:19 info [gcap.workflow]: =====================
<gcap> 2024-07-09 08:18:19 info [gcap.workflow]: Step 1: Run ASCAT 3.0
<gcap> 2024-07-09 08:18:19 info [gcap.workflow]: =====================
<gcap> 2024-07-09 08:18:19 info [gcap.runASCAT]: > Run ASCAT on WES data <
<gcap> 2024-07-09 08:18:19 info [gcap.runASCAT]:
<gcap> 2024-07-09 08:18:19 info [gcap.runASCAT]: Configs:
<gcap> 2024-07-09 08:18:19 info [gcap.runASCAT]: result path set to /mnt/MoCha-NGS/active/MoCha-NGS_BW_transfers/ecDNA/output/119177~322-R1~PXWJH4KA3~WES
<gcap> 2024-07-09 08:18:19 info [gcap.runASCAT]: allelecounter_exe set to ~/miniconda3/envs/ecDNA/bin/alleleCounter
<gcap> 2024-07-09 08:18:19 info [gcap.runASCAT]: g1000allelesprefix set to /mnt/MoCha-NGS/active/MoCha-NGS_BW_transfers/ecDNA/ref/1000G_loci_hg19//1000genomesAlleles2012_chr
<gcap> 2024-07-09 08:18:19 info [gcap.runASCAT]: g1000lociprefix set to /mnt/MoCha-NGS/active/MoCha-NGS_BW_transfers/ecDNA/ref/1000G_loci_hg19//1000genomesloci2012chrstring_chr
<gcap> 2024-07-09 08:18:19 info [gcap.runASCAT]: GCcontentfile set to /mnt/MoCha-NGS/active/MoCha-NGS_BW_transfers/ecDNA/ref/GC_correction_hg19.txt
<gcap> 2024-07-09 08:18:19 info [gcap.runASCAT]: replictimingfile set to /mnt/MoCha-NGS/active/MoCha-NGS_BW_transfers/ecDNA/ref/RT_correction_hg19.txt
<gcap> 2024-07-09 08:18:19 info [gcap.runASCAT]: nthreads set to 22
<gcap> 2024-07-09 08:18:19 info [gcap.runASCAT]: minCounts set to 10
<gcap> 2024-07-09 08:18:19 info [gcap.runASCAT]: BED_file set to NA
<gcap> 2024-07-09 08:18:19 info [gcap.runASCAT]: probloci_file set to NA
<gcap> 2024-07-09 08:18:19 info [gcap.runASCAT]: chrom_names set to <1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22>
<gcap> 2024-07-09 08:18:19 info [gcap.runASCAT]: gender set to <XX>
<gcap> 2024-07-09 08:18:19 info [gcap.runASCAT]: min_base_qual set to 20
<gcap> 2024-07-09 08:18:19 info [gcap.runASCAT]: min_map_qual set to 35
<gcap> 2024-07-09 08:18:19 info [gcap.runASCAT]: penalty set to 70
<gcap> 2024-07-09 08:18:19 info [gcap.runASCAT]: skip_finished_ASCAT set to TRUE
<gcap> 2024-07-09 08:18:19 info [gcap.runASCAT]: 1 jobs detected
<gcap> 2024-07-09 08:18:19 info [gcap.runASCAT]: No ASCAT job to skip.
<gcap> 2024-07-09 08:18:19 info [FUN]: start submitting job 119177-322-R1-PXWJH4KA3-WES
<gcap> 2024-07-09 08:18:19 info [FUN]: tumor data file: /mnt/legacy/MoCha-hiseq/legacy/scratch/BW_transfers/processedDATA/119177/20170910/119177~322-R1~PXWJH4KA3~WES/119177~322-R1~PXWJH4KA3~WES.bwa.final.bam
<gcap> 2024-07-09 08:18:19 info [FUN]: normal data file: /mnt/legacy/MoCha-hiseq/legacy/scratch/BW_transfers/processedDATA/119177/20170910/119177~322-R1~J1-CAF~WES/119177~322-R1~J1-CAF~WES.bwa.final.bam
<gcap> 2024-07-09 08:18:19 info [FUN]: tumor sample name: 119177-322-R1-PXWJH4KA3-WES
<gcap> 2024-07-09 08:18:19 info [FUN]: normal sample name: 119177-322-R1-J1-CAF-WES
[1] Reading Tumor LogR data...
[1] Reading Tumor BAF data...
[1] Reading Germline LogR data...
[1] Reading Germline BAF data...
[1] Registering SNP locations...
[1] Splitting genome in distinct chunks...
[1] Sample 119177-322-R1-PXWJH4KA3-WES (1/1)
GC correlation: 25bp 0.061 ; 50bp 0.079 ; 100bp 0.111 ; 200bp 0.161 ; 500bp 0.251 ; 1kb 0.259 ; 2kb 0.244 ; 5kb 0.220 ; 10kb 0.203 ; 20kb 0.189 ; 50kb 0.172 ; 100kb 0.164 ; 200kb 0.158 ; 500kb 0.153 ; 1Mb 0.147 ; 2Mb 0.139 ; 5Mb 0.119 ; 10Mb 0.099 ;
Short window size: 1kb
Long window size: 5kb
Replication timing correlation: Bg02es 0.095 ; Bj 0.106 ; Gm06990 0.126 ; Gm12801 0.132 ; Gm12812 0.126 ; Gm12813 0.130 ; Gm12878 0.128 ; Helas3 0.117 ; Hepg2 0.134 ; Huvec 0.123 ; Imr90 0.109 ; K562 0.133 ; Mcf7 0.132 ; Nhek 0.131 ; Sknsh 0.136 ;
Replication dataset: Sknsh
[1] Plotting tumor data
[1] Plotting germline data
[1] Sample 119177-322-R1-PXWJH4KA3-WES (1/1)
[1] Sample 119177-322-R1-PXWJH4KA3-WES (1/1)
<gcap> 2024-07-09 11:38:27 info [doTryCatch]: job 119177-322-R1-PXWJH4KA3-WES done
<gcap> 2024-07-09 11:38:27 info [gcap.runASCAT]: ASCAT analysis done, check /mnt/MoCha-NGS/active/MoCha-NGS_BW_transfers/ecDNA/output/119177~322-R1~PXWJH4KA3~WES for results
<gcap> 2024-07-09 11:38:27 info [gcap.workflow]: checking ASCAT result files
<gcap> 2024-07-09 11:38:27 warn [FUN]: /mnt/MoCha-NGS/active/MoCha-NGS_BW_transfers/ecDNA/output/119177~322-R1~PXWJH4KA3~WES/119177-322-R1-PXWJH4KA3-WES.ASCAT.rds contains a failed ASCAT job, will discard it before next step
<gcap> 2024-07-09 11:38:27 fatal [gcap.workflow]: no sucessful ASCAT result file to proceed!
<gcap> 2024-07-09 11:38:27 fatal [gcap.workflow]: check your ASCAT setting before make sure this case could not be used!
Error in gcap.workflow(tumourseqfile = tumor_bam_path, normalseqfile = normal_bam, :
In addition: There were 50 or more warnings (use warnings() to see the first 50)
Execution halted
Hi @tingchiafelix, this is normal: ASCAT cannot produce a result for 100% of samples. It is not a GCAP issue.
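Because a failed ASCAT job ends the script with `Execution halted` (see the log above), one way to keep a large batch going is to wrap each sample's call in `tryCatch()` and record which samples failed. The following is only a minimal sketch: `run_batch()` and `fake_run()` are hypothetical helpers, not part of gcap, and `fake_run()` merely stands in for a real per-sample call to `gcap.workflow()`.

```r
# Hypothetical helper (not part of gcap): run every sample and record its
# status, so that one failing sample does not abort the whole batch.
run_batch <- function(samples, run_one) {
  vapply(samples, function(s) {
    tryCatch({
      run_one(s)
      "ok"
    }, error = function(e) {
      # Keep the error message for later inspection instead of halting.
      message("sample ", s, " failed: ", conditionMessage(e))
      "failed"
    })
  }, character(1))
}

# Stand-in for a real per-sample gcap.workflow() call (assumption):
# it fails for one made-up sample ID to imitate an ASCAT failure.
fake_run <- function(s) {
  if (s == "S2") stop("no successful ASCAT result file")
  invisible(TRUE)
}

status <- run_batch(c("S1", "S2", "S3"), fake_run)
failure_rate <- mean(status == "failed")
```

In real use, `run_one` would be a function that builds the BAM paths for one sample and calls `gcap.workflow()` with them; `status` then gives the list of samples to re-check or exclude.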
Hi Shixiang,
Does it mean ASCAT could occasionally have failures when running through the GCAP workflow? What is the potential failure rate in your cohort?
Also, I'm wondering if you have a way to estimate the absolute copy number of ecDNA, similar to what we usually have for non-circular DNA amplification?
Best, TC
In my experience, ASCAT fails to produce a call for about 1-5% of samples on average, similar to tools like FACETS. Sequenza is much better in this respect.
For estimating the absolute copy number of ecDNA, WGS or experimental strategies are recommended, as WES cannot, in my view, provide sufficient information about the structure of an ecDNA. However, it is a really good and important question. I have been wondering whether I could model the expected gene copy number from non-circular amplification, so that the copy number of a gene on ecDNA would be the residual between the inferred total copy number and the modeled non-circular amplification copy number. At the current stage, GCAP only reports the total copy number captured by ASCAT and the ecDNA probability/class inferred by the model.
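The residual idea described above can be written down in a few lines. This is only an illustrative sketch, not something GCAP implements: `ecdna_copy_residual()` is a hypothetical function, the expected non-circular copy number is assumed to come from some separate model that does not exist yet, clipping negative residuals to zero is an added assumption, and the numbers are made up.

```r
# Hypothetical sketch (not a GCAP feature): per-gene ecDNA copy number as the
# residual between the inferred total copy number and a modeled copy number
# expected from non-circular (linear) amplification.
ecdna_copy_residual <- function(total_cn, expected_linear_cn) {
  stopifnot(length(total_cn) == length(expected_linear_cn))
  # Assumption: a negative residual is treated as "no ecDNA copies".
  pmax(total_cn - expected_linear_cn, 0)
}

# Made-up values for three genes.
total_cn           <- c(geneA = 40, geneB = 6, geneC = 2)
expected_linear_cn <- c(geneA = 4,  geneB = 6, geneC = 3)

residual <- ecdna_copy_residual(total_cn, expected_linear_cn)
```

With these made-up inputs only `geneA` keeps a positive residual (36); the other two are clipped to 0.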
Thank you for this insight and information.
Best, TC
Hi,
I'm testing one of our samples and keep receiving this BAM file input error. Could you please provide some insight into this, or point out anything I may be missing for running the workflow?
Best, TC