Hi,
I am an entry-level bioinformatician using PureCN to estimate tumour purity when a matched normal is not available.
I have BAM and VCF files for my own tumour and normal samples, and hg19 in FASTA format as the reference genome.
I know this takes your time, but I would appreciate your support and guidance.
I was wondering:
1. Do we need to generate an interval file from the baits BED file even though our VCF and BAM files are ready?
2. Should we focus on running PureCN with third-party segmentation and skip the run with internal segmentation, since we prepared our files with GATK?
If we do need to generate an interval file from a BED file, I concluded that we need the following files. Could you please confirm that I am on the right track?
1. baits_hg19.bed (BED file containing the bait coordinates for hg19, specific to our capture kit). How can I get this file? Do I download it, or prepare it from our own FASTA or BAM files?
2. hg19.fa in FASTA format. (Should I use the FASTA file that I already have, or download one from this package?)
"The --genome version is needed to annotate exons with gene symbols. Use hg19/hg38 for human genomes, not b37/b38. You might get a warning that an annotation package is missing. For hg19, install TxDb.Hsapiens.UCSC.hg19.knownGene in R."
3. Mappability file: provides mappability scores for 100-mers. Download wgEncodeCrgMapabilityAlign100mer.bigWig from the UCSC Genome Browser.
4. Replication timing file: download wgEncodeUwRepliSeqK562WaveSignalRep1.bigWig from the UCSC Genome Browser.
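Since the documentation quoted above distinguishes hg19 from b37, it may be worth checking that the BED and FASTA files use the same contig naming before generating the interval file. Below is a minimal sketch in Python. It assumes that chr-prefixed names (chr1, chrM) indicate UCSC-style hg19 and unprefixed names (1, MT) indicate b37. The function names are hypothetical helpers and are not part of PureCN.

```python
from typing import Dict, Iterable, Set


def bed_contigs(lines: Iterable[str]) -> Set[str]:
    """Collect contig names from the first column of BED lines."""
    contigs = set()
    for line in lines:
        line = line.strip()
        # Skip blank lines and optional BED header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        contigs.add(line.split()[0])
    return contigs


def fasta_contigs(lines: Iterable[str]) -> Set[str]:
    """Collect contig names from FASTA header lines (text after '>')."""
    contigs = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                contigs.add(fields[0])
    return contigs


def naming_style(contigs: Set[str]) -> str:
    """Assumption: 'chr' prefix means UCSC naming, no prefix means b37."""
    if not contigs:
        return "empty"
    prefixed = sum(1 for name in contigs if name.startswith("chr"))
    if prefixed == len(contigs):
        return "ucsc"
    if prefixed == 0:
        return "b37"
    return "mixed"


def check_inputs(bed_lines: Iterable[str], fasta_lines: Iterable[str]) -> Dict[str, object]:
    """Report the naming style of each input and BED contigs absent from the FASTA."""
    bed = bed_contigs(bed_lines)
    fasta = fasta_contigs(fasta_lines)
    return {
        "bed_style": naming_style(bed),
        "fasta_style": naming_style(fasta),
        "missing_from_fasta": sorted(bed - fasta),
    }


if __name__ == "__main__":
    # Tiny in-memory stand-ins for baits_hg19.bed and hg19.fa.
    demo_bed = ["chr1\t100\t200", "chr2\t300\t400"]
    demo_fasta = [">chr1", "ACGT", ">chr2", "ACGT"]
    print(check_inputs(demo_bed, demo_fasta))
```

For real files, open file handles can be passed in place of the lists, for example check_inputs(open("baits_hg19.bed"), open("hg19.fa")). Reading a whole genome FASTA this way is slow but needs no extra packages.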
Thank you for your support