VanLoo-lab / ascat

ASCAT R package
https://www.mdanderson.org/research/departments-labs-institutes/labs/van-loo-laboratory/resources.html#ASCAT
164 stars 85 forks source link

Is ascat deterministic on multi-threaded systems? #107

Closed sclan closed 2 years ago

sclan commented 2 years ago

The command I used: docker run quay.io/wtsicgp/ascatngs:4.5.0 ascat.pl -r ./GRCh38.d1.vd1.fa -t ./tumor.bam -n ./normal.bam -sg ./SnpGcCorrections.tsv -pr WGS -g XX -gc Y -rs Human -pl ILLUMINA -ra GRCh38

With the same inputs I ran the analysis on two different EC2 instances with the same hardware spec (16 cpu thread) and same linux system.

There are slight differences in the output. For example:

Sample summary statistics from 1st try:

NormalContamination 0 Ploidy 3.0998005300981 rho 1 psi 2.85 goodnessOfFit 95.1865484065418 GenderChr Y GenderChrFound N 2nd try:

NormalContamination 0 Ploidy 3.09828619832227 rho 1 psi 2.85 goodnessOfFit 95.055397549694 GenderChr Y GenderChrFound N

I was wondering if anyone encountered this observation. The process looks to be single threaded from htop monitor during the runtime.

tlesluyes commented 2 years ago

Hi @sclan,

As you noticed in a comment that has been removed, you are using ascatNGS which is different from ASCAT. Any question/issue related to ascatNGS should be asked to their dev team through the appropriate GitHub repo. They could use some seeds to mirror BAF when processing HTS data, that's also something we do in ASCAT (ascat.getBAFsAndLogRs) but I can't answer for sure so it needs to be checked with them as we are not managing such a repo.

As for ASCAT, I've used the code below to process S1 (from the ExampleData folder) and got the same results.

suppressPackageStartupMessages(library(ASCAT))
suppressPackageStartupMessages(library(foreach))
suppressPackageStartupMessages(library(doParallel))
registerDoParallel(cores=4)

OUT=foreach(x=1:12) %dopar% {
  print(x)
  ascat.bc=ascat.loadData("Tumor_LogR.txt",
                          "Tumor_BAF.txt",
                          "Germline_LogR.txt",
                          "Germline_BAF.txt",
                          gender='XX',
                          genomeVersion="hg19")
  ascat.bc=ascat.correctLogR(ascat.bc,GCcontentfile="GC_example.txt",replictimingfile="RT_example.txt")
  ascat.bc=ascat.aspcf(ascat.bc,out.dir=NA)
  ascat.output=ascat.runAscat(ascat.bc)
  unlink(c('S1.ASCATprofile.png','S1.rawprofile.png','S1.sunrise.png'))
  return(ascat.output)
}

table(sapply(OUT,function(x) x$ploidy))

1.8480194017785 12

table(sapply(OUT,function(x) x$purity))

0.54 12

table(sapply(OUT,function(x) x$goodnessOfFit))

97.2050481929797 12

Cheers,

Tom.