polyactis / Accucopy

Accucopy is a computational method that infers Allele-Specific Copy Number alterations from low-coverage low-purity tumor sequencing data.
https://www.yfish.org/software/Accucopy
GNU General Public License v3.0

unmatched chromosome name get error #5

Open worker000000 opened 3 years ago

worker000000 commented 3 years ago

Dear professor, thanks for such an accurate piece of software.

When I use it, it raises the errors shown below. They may be caused by chrM versus chrMT.

What is more, your genome.dict has many patch chromosome names, such as

SN:GL000207.1 SN:GL000226.1 SN:GL000229.1 SN:GL000231.1. When users supply a BAM, they may have used a different version of the genome; even for hg19, the patch chromosomes seem to differ. So why is the input a BAM rather than a FASTQ? Can you give me some suggestions?

[Two screenshots of the errors]

worker000000 commented 3 years ago

Another question: we know WGS usually yields large CNV segments. So why is the window size in the configure file 500? People seem to use 1M instead of 500bp.

fanxinping commented 3 years ago

We recommend remapping your samples to the reference genome we provide, to avoid unexpected behaviour. Alternatively, you can generate your own reference data according to https://www.yfish.org/display/PUB/Accucopy#Accucopy-3.7Makeyourownreferencegenomepackage
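Before remapping, it can help to see exactly which contig names differ between the BAM and the reference package. The sketch below is not part of Accucopy. It assumes the BAM header has been dumped to text (for example with `samtools view -H`) and that genome.dict is a SAM-style sequence dictionary with `@SQ` lines carrying `SN:` fields, as the names quoted in the report suggest. The function names and the toy headers are made up for illustration.

```python
def contig_names(header_text):
    """Collect contig names from the @SQ lines of a SAM-style header."""
    names = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
    return names


def compare_contigs(bam_header_text, ref_dict_text):
    """Return (only_in_bam, only_in_ref) as sorted lists of contig names."""
    bam = set(contig_names(bam_header_text))
    ref = set(contig_names(ref_dict_text))
    return sorted(bam - ref), sorted(ref - bam)


# Toy headers, invented for this example (not real Accucopy reference data).
bam_header = "@HD\tVN:1.6\n@SQ\tSN:chr1\tLN:1000\n@SQ\tSN:chrM\tLN:16571\n"
ref_dict = (
    "@HD\tVN:1.6\n"
    "@SQ\tSN:chr1\tLN:1000\n"
    "@SQ\tSN:chrMT\tLN:16569\n"
    "@SQ\tSN:GL000207.1\tLN:4262\n"
)

only_in_bam, only_in_ref = compare_contigs(bam_header, ref_dict)
print("only in BAM:", only_in_bam)
print("only in reference:", only_in_ref)
```

Any name listed under "only in BAM" is a candidate cause of an unmatched-chromosome error; whether Accucopy tolerates names that appear only in the reference is not stated in this thread.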

polyactis commented 3 years ago

You probably need to watch some videos or read some reviews/tutorials to understand how DNA is extracted from a cell, fragmented, and PCRed before it is put on a DNA sequencing machine. 500bp is NOT the CNA length. It is the average length of DNA fragments to be sequenced by a high-throughput DNA sequencer, e.g. Illumina HiSeq or NovaSeq.

These so-called next-gen sequencers can only sequence 100-150bp for one fragment, not from start to end of a chromosome. Anyhow, you need to get familiar with what a next-gen sequencer can and cannot do.

> Another question: we know WGS usually yields large CNV segments. So why is the window size in the configure file 500? People seem to use 1M instead of 500bp.

worker000000 commented 3 years ago

> We recommend remapping your samples to the reference genome we provide, to avoid unexpected behaviour. Alternatively, you can generate your own reference data according to https://www.yfish.org/display/PUB/Accucopy#Accucopy-3.7Makeyourownreferencegenomepackage

Thanks a lot. So can this tool accept a FASTQ file instead of a BAM?

worker000000 commented 3 years ago

> You probably need to watch some videos or read some reviews/tutorials to understand how DNA is extracted from a cell, fragmented, and PCRed before it is put on a DNA sequencing machine. 500bp is NOT the CNA length. It is the average length of DNA fragments to be sequenced by a high-throughput DNA sequencer, e.g. Illumina HiSeq or NovaSeq.
>
> These so-called next-gen sequencers can only sequence 100-150bp for one fragment, not from start to end of a chromosome. Anyhow, you need to get familiar with what a next-gen sequencer can and cannot do.

Thanks a lot. So how should I understand the 500 here for segmentation?

window_size: the window size in base pairs for segmentation. The segmentation program (GADA) first calculates the number of reads for each window and then performs segmentation over the genome. A small window size often leads to a large number of small segments. The recommended window size is 500bp.

fanxinping commented 3 years ago

> Thanks a lot. So can this tool accept a FASTQ file instead of a BAM?

No, Accucopy accepts BAM files only.

> Thanks a lot. So how should I understand the 500 here for segmentation?
>
> window_size: the window size in base pairs for segmentation. The segmentation program (GADA) first calculates the number of reads for each window and then performs segmentation over the genome. A small window size often leads to a large number of small segments. The recommended window size is 500bp.

The 500bp window for segmentation is simply a value that worked well in our testing; you can set a different value.
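To make the role of the window size concrete, here is a minimal sketch of the read-counting step described in the quoted text: reads are binned into fixed-size windows, and the per-window counts are what a segmentation program would then work on. This is neither GADA's nor Accucopy's code; the function name, the 0-based start positions, and the toy data are assumptions for illustration only.

```python
from collections import Counter


def count_reads_per_window(read_starts, window_size):
    """Map window index -> number of reads whose start falls in that window."""
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    return Counter(pos // window_size for pos in read_starts)


# Made-up 0-based read start positions on a single chromosome.
read_starts = [10, 120, 480, 510, 990, 1005, 1490, 2600]

for size in (500, 1000):
    counts = count_reads_per_window(read_starts, size)
    print(size, sorted(counts.items()))
```

With the same reads, the smaller window gives more bins with fewer reads in each, which is consistent with the note that a small window size tends to produce many small segments.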