Open bioinfogit opened 1 year ago
Hi @bioinfogit I'm not able to reproduce this error. Could you delete that tmp folder and try again?
Hello, the version is 2.2.3, from the Docker image quay.io/pacbio/paraphase:2.2.3_build2.
I received an error message when I executed the command in the Docker environment. However, after checking the outdir, I could not locate the tmp file.
root@a9f9e6fb2c28:/# paraphase --threads 8 --bam /longread/NA17282.HomoSapiens.aligned.haplotagged.bam -o /longread/ --reference /genomes/Homo_sapiens.GRCh38.dna.primary_assembly.fa
ERROR:root:Error running the program...See error message below
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/paraphase/paraphase.py", line 472, in run
configs = self.update_config(gene_list, tmpdir, args.reference)
File "/usr/local/lib/python3.8/dist-packages/paraphase/paraphase.py", line 325, in update_config
self.make_ref_fasta(ref_file, realign_region, genome)
File "/usr/local/lib/python3.8/dist-packages/paraphase/paraphase.py", line 352, in make_ref_fasta
pysam.faidx(ref_file)
File "/usr/local/lib/python3.8/dist-packages/pysam/utils.py", line 83, in __call__
raise SamtoolsError(
pysam.utils.SamtoolsError: 'samtools returned with error 1: stdout=, stderr=[faidx] Could not build fai index /longread/tmp_2023-11-22-12-15-40-154747/smn1_ref.fa.fai\n'
INFO:root:Completed Paraphase analysis at 2023-11-22 12:15:40.277212...
I now understand the issue. My Ensembl reference file lacks the "chr" prefix in its chromosome names. Could you fix the problem here?
Error (region given as chr5):
kaan@biyoinfo1:~$ samtools faidx levopt/hg38/genomes/ensembl_p13_primary/Homo_sapiens.GRCh38.dna.primary_assembly.fa chr5:70890000-71100000 | sed -e "s/-/_/" | sed -e "s/:/_/" > kaan.txt
[W::fai_get_val] Reference chr5:70890000-71100000 not found in FASTA file, returning empty sequence
[faidx] Failed to fetch sequence in chr5:70890000-71100000
kaan@biyoinfo1:~$ samtools faidx kaan.txt
[faidx] Could not build fai index kaan.txt.fai
No error (region given as 5):
kaan@biyoinfo1:~$ samtools faidx levopt/hg38/genomes/ensembl_p13_primary/Homo_sapiens.GRCh38.dna.primary_assembly.fa 5:70890000-71100000 | sed -e "s/-/_/" | sed -e "s/:/_/" > kaan.txt
kaan@biyoinfo1:~$ samtools faidx kaan.txt
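The two runs above differ only in the contig name, so it can help to check which naming style a reference uses before running anything. Here is a minimal sketch in plain Python that reads the `>` header lines of a FASTA; the function names `fasta_contig_names` and `uses_chr_prefix` are hypothetical helpers for illustration, not part of Paraphase or samtools.

```python
def fasta_contig_names(lines):
    """Collect sequence names from FASTA header lines.

    `lines` is any iterable of text lines, e.g. an open file handle.
    """
    names = []
    for line in lines:
        if line.startswith(">"):
            # The name is the first whitespace-delimited token after '>'.
            fields = line[1:].split()
            if fields:
                names.append(fields[0])
    return names


def uses_chr_prefix(names):
    """Return True if any contig name starts with 'chr' (UCSC-style naming)."""
    return any(name.startswith("chr") for name in names)


if __name__ == "__main__":
    # Hypothetical in-memory example with Ensembl-style headers.
    example = [">5 dna:chromosome chromosome:GRCh38:5\n", "ACGT\n", ">X\n", "ACGT\n"]
    names = fasta_contig_names(example)
    print(names, uses_chr_prefix(names))
```

For a real reference, pass an open file handle, e.g. `fasta_contig_names(open(path))`. On a full genome this scans the whole file, so reading the first column of an existing `.fai` index would be faster.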
Hi @themkdemiiir, Paraphase assumes that GRCh38 chromosome names carry the "chr" prefix. Could you realign to the UCSC/NCBI version and rerun Paraphase? For best performance with HiFi data, please remove ALT contigs from the reference genome before alignment. We do have a recommended version of the reference genome (with download links) documented here.
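To see whether an existing BAM was already aligned against a "chr"-style reference, the `@SQ` lines in its header can be inspected. The following is a rough sketch that assumes the header text has been captured separately, for example from `samtools view -H`; `sq_names` and `contigs_missing_chr` are hypothetical helper names, not Paraphase functions.

```python
def sq_names(header_text):
    """Extract the SN: (sequence name) values from @SQ lines of a SAM header."""
    names = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # SAM header fields are tab-separated TAG:VALUE pairs.
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
    return names


def contigs_missing_chr(names):
    """Return the contig names that do not start with 'chr'."""
    return [name for name in names if not name.startswith("chr")]


if __name__ == "__main__":
    # Hypothetical header; lengths are placeholders, not real values.
    header = "@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:5\tLN:1000\n@SQ\tSN:X\tLN:2000\n"
    print(contigs_missing_chr(sq_names(header)))
```

A non-empty result suggests the BAM uses Ensembl-style names and would need realignment before running Paraphase, as described above.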
Hi, I am getting the following error: pysam.utils.SamtoolsError: 'samtools returned with error 1: stdout=, stderr=[faidx] Could not build fai index ../path/tmp/f8_ref.fa.fai\n'. Removing f8 from the gene list works. I am using the latest version (paraphase --version reports 2.2.3).