Closed zhoudreames closed 1 year ago
@mrvollger @wharvey31 I am sorry to bother you; I know I have posted a lot of questions. Would you have time to answer this one? Thank you.
Copy number can often vary over the gene model.
I would visualize the results against the gene model and try to select the subregion that you believe is representative of that gene. Here is a link to an example for a human gene, where I highlight the region I would select: http://genome.ucsc.edu/s/mrvollger/wssd-example
If you cannot do that for some reason, I would default to the median.
Following your pipeline, I want to estimate the copy number of the CDK20 gene on chr14 in a non-human species. Some of my steps are shown below.
After the pipeline ran successfully, I got many result files. My guess is that the final result is the file results/test-reference/tracks/bed9/wssd/chr14_chrX.bed.gz. I then obtained the genomic coordinates for the 10 copies of the CDK20 gene:
Then I ran bedtools intersect -a chr14_chrX.bed.gz -b CKD20.gene.bed -wa > CDK20.CN.bed:
The copy number of the CDK20 gene varies greatly (9 to 166 copies), but most values are around 20 copies, which is consistent with the actual copy number in the genome.
I want to know how to obtain the final copy number of the CDK20 gene, for example by taking the mean or the median of the intersection results. In addition, is there any problem with the result file I used or with my method for estimating gene copy number?
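For the median fallback, here is a minimal Python sketch that summarizes the windows in a bedtools intersect output such as CDK20.CN.bed. It rests on assumptions that are not confirmed in this thread: the file is tab-separated, and the copy number estimate sits in the fourth column (index 3). Check the actual file and adjust cn_column if it differs. The function name copy_number_summary, the optional region argument (for restricting to a hand-picked representative subregion), and the example rows are all hypothetical; the numbers are synthetic and are not real pipeline output.

```python
import statistics


def copy_number_summary(bed_lines, cn_column=3, region=None):
    """Summarize copy number over the windows of a BED-like file.

    bed_lines: iterable of tab-separated lines (chrom, start, end, ...).
    cn_column: 0-based column holding the copy number (ASSUMED to be 3).
    region:    optional (start, end) pair; only windows overlapping this
               half-open interval are kept.
    """
    values = []
    for line in bed_lines:
        # Skip blank lines and UCSC-style header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.rstrip("\n").split("\t")
        start, end = int(fields[1]), int(fields[2])
        # Drop windows that do not overlap the requested subregion.
        if region is not None and (end <= region[0] or start >= region[1]):
            continue
        values.append(float(fields[cn_column]))
    if not values:
        raise ValueError("no windows selected")
    return {
        "n": len(values),
        "median": statistics.median(values),
        "mean": statistics.fmean(values),
    }


# Synthetic windows: mostly ~20 copies, with one low and one high outlier.
example = [
    "chr14\t0\t1000\t9",
    "chr14\t1000\t2000\t20",
    "chr14\t2000\t3000\t21",
    "chr14\t3000\t4000\t19",
    "chr14\t4000\t5000\t166",
]

print(copy_number_summary(example))
print(copy_number_summary(example, region=(1000, 4000)))
```

In this synthetic example the single high window pulls the mean well above the median, which is why the median (or a hand-picked representative subregion) is the safer summary when a few windows are extreme. To run it on a real file, pass an open file handle instead of the list, for example open("CDK20.CN.bed").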