Hi,
I am now trying to run the first script to extract variant candidates, but I don't understand what you mean by the chromosome start and end. I am using a FASTA file for chromosome 22 as the reference and a BAM file for the same chromosome, but when I set the chr start to 1 and the chr end to the wc -c of the same ref file, it reported an invalid region or unknown reference. How should I proceed from there?
This is the command I ran:
pypy3 ExtractVariantCandidate.py --fin_bam /home/ionadmin/bassyouni/source_bam/NA12877_chr22.bam --fin_ref /home/ionadmin/bassyouni/chr22.fa --chrName 22 --chrStart 1 --chrEnd 50818468 --fout_can /home/ionadmin/bassyouni/analysis/Psi-Caller_workplace/var.can
This was the error:
[W::fai_get_val] Reference 22:1-50818468 not found in FASTA file, returning empty sequence
[faidx] Failed to fetch sequence in 22:1-50818468
[main_samview] region "22:1-50818468" specifies an invalid region or unknown reference. Continue anyway.
This is the content of the FASTA index file:
chr22 50818468 7 50 51
I found the problem: it comes down to the --chrName flag. The value has to match the sequence name used in the files, and for some files that means writing chr22 instead of just 22 (the FASTA index above lists the sequence as chr22). It worked fine with that!
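For anyone hitting the same error, here is a minimal sketch of how the values for --chrName and --chrEnd could be read from the FASTA index (.fai) rather than guessed. It assumes the usual .fai layout, where the first column is the sequence name and the second is the sequence length in bases. The helper names read_fai and resolve_region are hypothetical and are not part of Psi-Caller; the BAM header should use the same naming, which this sketch does not check.

```python
def read_fai(lines):
    """Parse .fai index lines into a {sequence_name: length} dict."""
    contigs = {}
    for line in lines:
        fields = line.split()  # handles tab- or space-separated columns
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        contigs[fields[0]] = int(fields[1])
    return contigs


def resolve_region(contigs, name):
    """Return (name_as_indexed, start, end) for a chromosome.

    Tries the name as given, then with a 'chr' prefix added,
    then with a leading 'chr' removed.
    """
    candidates = [name, "chr" + name]
    if name.startswith("chr"):
        candidates.append(name[3:])
    for candidate in candidates:
        if candidate in contigs:
            # Regions are 1-based and inclusive, so the end is the length.
            return candidate, 1, contigs[candidate]
    raise KeyError(f"{name!r} not found in index; available: {sorted(contigs)}")


# Index line taken from the post above; for a real file, pass open("chr22.fa.fai").
fai_lines = ["chr22 50818468 7 50 51"]
chr_name, chr_start, chr_end = resolve_region(read_fai(fai_lines), "22")
print(f"--chrName {chr_name} --chrStart {chr_start} --chrEnd {chr_end}")
```

Note that wc -c on the FASTA file counts the header line and newline characters as well, so the length column of the index is the safer source for the end coordinate.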