BoevaLab / FREEC

Control-FREEC: Copy number and genotype annotation in whole genome and whole exome sequencing data

NCBI has different annotation #108

Closed RochaLAJ closed 2 years ago

RochaLAJ commented 2 years ago

My first attempt at a simple CNV analysis succeeded. However, I have been trying to complement the analysis with BAF. The issue: my fna.fai uses a different sequence annotation (NCBI accessions such as NC_000001.11) than Control-FREEC expects, which produces an empty pileup:

..File /mnt/g/GCF_000001405.40_GRCh38.p14_genomic.fna.fai was read to create a miniPileup
terminate called after throwing an instance of 'std::length_error'
  what():  basic_string::_M_replace
Aborted
[mpileup] 1 samples in 1 input files
..If you have got an error at this step and a mini-pileup file is empty, check that you are using samtools v1.1 or later and provide a corresponding path in your config file
... -> Done!
..will use SNP positions from /mnt/g/FREEC-11.6/files/00-common_all.vcf.gz to calculate BAF profiles
..Starting reading /mnt/g/FREEC-11.6/files/00-common_all.vcf.gz to get SNP positions
..read 34404592 SNP positions
PROFILING [tid=140286942336832]: /mnt/g/FREEC-11.6/files/00-common_all.vcf.gz read in 68 seconds [readSNPs]
..use "pileup" format of reads to calculate BAF profiles
..Starting reading ./BAM_S1_marked_duplicates_aligned.bam_minipileup.pileup to calculate BAF profiles
0 lines read
...
..Adding BAF info to the Sample dataset
An error occurred in GenomeCopyNumber::addBAFinfo: could not find an SNP index for NC_000001.11

What the fna.fai looks like:

NC_000001.11  248956422  69         80  81
NT_187361.1   175055     252068568  80  81
NT_187362.1   32032      252245933  80  81
NT_187363.1   127682     252278487  80  81
NT_187364.1   66860      252407887  80  81
NT_187365.1   40176      252475704  80  81
NT_187366.1   42210      252516504  80  81
NT_187367.1   176043     252559363  80  81
NT_187368.1   40745      252737728  80  81
...

It was generated with samtools index from the same fastq that I used to align the sample. I am looking for a suggestion rather than reporting an issue; the software works as it should.
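
For anyone hitting the same error, one way to confirm a naming mismatch is to list the sequence names in the .fai that never appear in the CHROM column of the SNP file. This is a minimal Python sketch, assuming tab-separated .fai lines and uncompressed VCF text. The function names are made up for illustration, and the VCF record in the example is a placeholder, not a real dbSNP entry.

```python
def fai_contigs(lines):
    """Return sequence names (first column) from .fai index lines."""
    return [line.split("\t")[0] for line in lines if line.strip()]


def vcf_contigs(lines):
    """Return the set of CHROM values from VCF lines, skipping '#' headers."""
    return {
        line.split("\t", 1)[0]
        for line in lines
        if line.strip() and not line.startswith("#")
    }


def missing_from_vcf(fai_lines, vcf_lines):
    """List .fai sequence names that have no SNP record under the same name."""
    seen = vcf_contigs(vcf_lines)
    return [name for name in fai_contigs(fai_lines) if name not in seen]


if __name__ == "__main__":
    # First .fai line taken from this thread; the VCF record is a placeholder.
    fai = ["NC_000001.11\t248956422\t69\t80\t81\n"]
    vcf = ["##fileformat=VCFv4.0\n", "1\t12345\tplaceholder_id\tA\tG\t.\t.\t.\n"]
    print(missing_from_vcf(fai, vcf))
```

If the reference names all show up as missing, the two files are using different naming schemes and one of them has to be translated before BAF can be computed.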

valeu commented 2 years ago

Hello, FREEC calls bedtools to get the minipileup, and bedtools will only work if chromosome names are the same in all files. So I guess that to make it run you need to "manually" change chr to NT_ in some of your files, or vice versa.
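
The renaming suggested above could be sketched roughly as follows in Python. It relies on the RefSeq convention that NC_000001 to NC_000022 are human chromosomes 1 to 22, NC_000023 is X, NC_000024 is Y, and NC_012920 is the mitochondrial genome. The helper names `refseq_to_chr` and `rename_first_column` are hypothetical, and unplaced scaffolds (NT_...) are left unchanged because their names would have to come from the assembly report. The sketch only translates the first column of tab-separated text; the reference FASTA, the BAM header, and the SNP file would all need to end up with the same names, which it does not handle.

```python
import re

# Matches human chromosome accessions such as NC_000001.11 (any version).
_NC_CHROM = re.compile(r"^NC_0000(\d{2})\.\d+$")


def refseq_to_chr(name, prefix="chr"):
    """Translate a RefSeq chromosome accession to a chromosome name.

    Returns None when the accession is not a primary human chromosome.
    """
    if name.startswith("NC_012920."):
        # Mitochondrion: "chrM" with a prefix, "MT" without (assumed convention).
        return prefix + "M" if prefix else "MT"
    match = _NC_CHROM.match(name)
    if not match:
        return None
    number = int(match.group(1))
    if 1 <= number <= 22:
        return prefix + str(number)
    if number == 23:
        return prefix + "X"
    if number == 24:
        return prefix + "Y"
    return None


def rename_first_column(lines, prefix="chr"):
    """Yield tab-separated lines with the first column translated.

    Header lines starting with '#' and names without a mapping pass through
    unchanged.
    """
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        name, sep, rest = line.partition("\t")
        new_name = refseq_to_chr(name, prefix)
        yield (new_name if new_name is not None else name) + sep + rest
```

Passing `prefix=""` gives bare names such as `1` and `X` instead of `chr1` and `chrX`, for SNP files that omit the prefix.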

RochaLAJ commented 2 years ago

Hi, this seems like the only way. I'll work on some scripts to automate the process. Thank you for your time!