cancerit / ascatNgs

Somatic copy number analysis using paired-end whole-genome sequencing (WGS)
http://cancerit.github.io/ascatNgs/
GNU Affero General Public License v3.0

getting unplaced contigs and decoy sequences in the copy number result #109

Closed anoronh4 closed 2 years ago

anoronh4 commented 2 years ago

As a test we ran ASCAT on 17 samples. For 16 of them, *copynumber.caveman.csv contained results for 23-24 chromosomes. The remaining sample had 86, including for example hs37d5 (decoy), GL000244.1 (unplaced contig) and NC_007605 (viral sequence). The *.copynumber.caveman.vcf.gz for this single sample was about 100,000x larger than the others (>700MB, whereas the other 16 are ~5KB), and its lines were extremely long by comparison. This seems strange and possibly non-biological. We were able to reproduce it, and we are confident that all samples were run exactly the same way.

Also, for this unusual sample the last 4 columns of *copynumber.caveman.csv are 2,1,5,2 for all 86 rows, i.e. a copy number of 2,1 in the normal and 5,2 in the tumor for every chromosome. It is really surprising that the tumor copy number would be so uniform. The start position of every entry is also 1 for this sample, which is not the case in any other sample. Attaching the file so you can see: sample.copynumber.caveman.csv
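For anyone who wants to screen a batch for the same pattern, here is a minimal Python sketch of the two symptoms described above: every row sharing the same last four values, and every start position being 1. It is not part of ascatNgs. The function names are made up for illustration, and the column positions (`chrom_col=1`, `start_col=2`, no header row) are assumptions about the CSV layout that should be checked against your own files.

```python
import csv
import io


def _rows(csv_text, min_cols):
    """Parse CSV text and keep only rows with at least min_cols columns."""
    return [r for r in csv.reader(io.StringIO(csv_text)) if len(r) >= min_cols]


def looks_like_placeholder(csv_text, start_col=2):
    """Heuristic: True if all rows share the same last four values
    and every start position is 1.

    start_col is an ASSUMED 0-based index of the segment start column.
    """
    rows = _rows(csv_text, max(4, start_col + 1))
    if not rows:
        return False
    copy_numbers = {tuple(v.strip() for v in r[-4:]) for r in rows}
    starts = {r[start_col].strip() for r in rows}
    return len(copy_numbers) == 1 and starts == {"1"}


def contig_count(csv_text, chrom_col=1):
    """Number of distinct contig names (chrom_col is an ASSUMED index)."""
    return len({r[chrom_col].strip() for r in _rows(csv_text, chrom_col + 1)})
```

A file that trips `looks_like_placeholder` and reports far more contigs than the rest of the batch is worth checking by hand. A file with only one or two segments can trigger a false positive, so treat this as a screening aid only.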

anoronh4 commented 2 years ago

Update: we found that the summary file says: ## WARNING ASCAT failed to generate a solution ##. So it seems that ASCAT produced a "fake" result because a solution was not found. Are there any tweaks or workarounds to ensure that a solution is found?
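Since the warning is plain text in the summary file, a batch can be screened for it before anyone looks at the copy number output. The sketch below only searches for the message quoted above. The function name is hypothetical, and the rest of the summary file's format is not assumed.

```python
WARNING_TEXT = "ASCAT failed to generate a solution"


def ascat_failed(summary_text):
    """Return True if any line of the summary text contains the warning."""
    return any(WARNING_TEXT in line for line in summary_text.splitlines())
```

To use it, read each sample's summary file into a string (for example with `pathlib.Path(path).read_text()`) and pass it to `ascat_failed`. Flagged samples should not be interpreted as real copy number calls.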

keiranmraine commented 2 years ago

If ASCAT fails to generate a solution, in most cases you cannot easily recover the data. It may be possible if you can provide sensible estimates of purity and ploidy when you execute the analysis. Be aware that, for best results, ASCAT solutions should ideally be reviewed manually.