markhilt / ARBitR

ARBitR: Assembly Refinement with Barcode-identity-tagged Reads

A question regarding "ValueError: failure when retrieving sequence on" #6

Open Veldem opened 3 years ago

Veldem commented 3 years ago

Dear all,

When I run ARBitR, I get an error that I have not been able to solve yet. After generating the sorted ".bam" and ".bai" files (genome.nextpolish.sorted.bam and genome.nextpolish.sorted.bam.bai), I ran the following command:

/okyanus/users/veldem/01.Direct_Projects/06.Anchovy_Genome_Projects/04.Genome_Scaffolding/02.arbitR_scaffolding/ARBitR-master/src/arbitr.py -i /okyanus/users/veldem/01.Direct_Projects/06.Anchovy_Genome_Projects/03.Genome_Polishing/01.NextPolish/NextPolish/genome.nextpolish.fa genome.nextpolish.sorted.bam

and got the following error (after it had run for nearly one hour):

[Sun Apr 25 19:24:54 2021] Collecting contigs.
[Sun Apr 25 19:24:54 2021] Collecting barcodes for linkgraph.
[Sun Apr 25 19:24:54 2021] Starting barcode collection. Found 3949 contigs.
[Sun Apr 25 19:30:00 2021] [ BARCODE COLLECTION ] Completed: 100.0% (7898 out of 7898)
[Sun Apr 25 19:30:00 2021] Creating link graph.
[Sun Apr 25 19:55:45 2021] [ BARCODE COMPARISON ] Completed: 100.0% (7707 out of 7707)
[Sun Apr 25 19:55:45 2021] Number of windows: 7707
[Sun Apr 25 19:56:52 2021] [ BARCODE LINKING ] Completed: 100.0% (7707 out of 7707)
[Sun Apr 25 19:56:54 2021] Writing link graph to genome.nextpolish.sorted.ARBitR.backbone.gfa.
[Sun Apr 25 19:56:54 2021] Finding paths.
[Sun Apr 25 19:57:05 2021] Found 741 paths.
[Sun Apr 25 19:57:05 2021] Collecting barcodes from short contigs.
[Sun Apr 25 19:57:05 2021] Starting barcode collection. Found 7568 contigs.
[Sun Apr 25 20:00:32 2021] [ BARCODE COLLECTION ] Completed: 100.0% (7568 out of 7568)
[Sun Apr 25 20:20:17 2021] [ PATH FILLING ] Completed: 100.0% (741 out of 741)
[Sun Apr 25 20:20:17 2021] Found fasta file for merging: /okyanus/users/veldem/01.Direct_Projects/06.Anchovy_Genome_Projects/03.Genome_Polishing/01.NextPolish/NextPolish/genome.nextpolish.fa
[Sun Apr 25 20:20:17 2021] Trimming contig ends...
[E::fai_retrieve] Failed to retrieve block: unexpected end of file(741 out of 741)
[Sun Apr 25 20:22:27 2021] [ TRIMMING ] Completed: 100.0% (741 out of 741)
Traceback (most recent call last):
  File "/okyanus/users/veldem/01.Direct_Projects/06.Anchovy_Genome_Projects/04.Genome_Scaffolding/02.arbitR_scaffolding/ARBitR-master/src/arbitr.py", line 250, in <module>
    main()
  File "/okyanus/users/veldem/01.Direct_Projects/06.Anchovy_Genome_Projects/04.Genome_Scaffolding/02.arbitR_scaffolding/ARBitR-master/src/arbitr.py", line 228, in main
    bed = merge_fasta.main( args.input_fasta, \
  File "/okyanus/users/veldem/01.Direct_Projects/06.Anchovy_Genome_Projects/04.Genome_Scaffolding/02.arbitR_scaffolding/ARBitR-master/src/merge_fasta.py", line 1011, in main
    trimmed_fasta[tig] = fastafile.fetch(reference=tig, \
  File "pysam/libcfaidx.pyx", line 319, in pysam.libcfaidx.FastaFile.fetch
ValueError: failure when retrieving sequence on 'ctg27_np12'

My python version: Python 3.8.8

Best wishes

markhilt commented 3 years ago

Hi, sorry about the delay and thanks for posting.

This error would arise if the .fai index doesn't match the fasta for some reason, e.g. if the fasta was changed after it was indexed. In that case, simply re-indexing the fasta would solve the issue. Otherwise it may be a bug - thanks for reporting and hopefully I can push a fix soon.
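
For anyone wanting to check for this kind of mismatch before re-running, here is a minimal sketch using only the Python standard library. It is not part of ARBitR; the function name `find_entries_past_eof` is made up for illustration, and it assumes the usual five-column .fai layout written by `samtools faidx` (name, length, offset, bases per line, bytes per line). It only catches index entries that point past the end of the fasta, which is what "unexpected end of file" suggests; an index can be stale in other ways that this will not detect.

```python
import os


def find_entries_past_eof(fasta_path, fai_path=None):
    """Return names of .fai entries whose last base lies beyond the end of the fasta.

    Hypothetical helper, not part of ARBitR or pysam. Assumes an uncompressed
    fasta and a tab-separated .fai with at least five columns:
    name, length, offset, linebases, linewidth.
    """
    if fai_path is None:
        fai_path = fasta_path + ".fai"
    fasta_size = os.path.getsize(fasta_path)

    bad = []
    with open(fai_path) as fai:
        for line in fai:
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 5:
                continue  # skip blank or malformed lines
            name = fields[0]
            length, offset, linebases, linewidth = (int(x) for x in fields[1:5])
            if length == 0 or linebases == 0:
                continue  # empty sequence, nothing to fetch
            # Byte position of the last base of this sequence in the fasta file.
            last_base = (offset
                         + ((length - 1) // linebases) * linewidth
                         + (length - 1) % linebases)
            if last_base >= fasta_size:
                bad.append(name)
    return bad
```

If this returns any contig names (for example `ctg27_np12`), the index and the fasta disagree. Deleting the old `genome.nextpolish.fa.fai` and re-indexing with `samtools faidx genome.nextpolish.fa` should then be worth trying before running ARBitR again.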