LaurieLecomte / SVs_long_reads

SV calling pipeline from ONT data

nanovar error #1

Open C-YONG opened 4 months ago

C-YONG commented 4 months ago

Hello, I would like to ask you some questions. I plan to use nanovar for structural variation detection. I have three ONT datasets in total. The first two ran normally, but the third one encountered the following error (03.1_nanovar_call.sh). What could be the reason for this?

[24/07/2024 15:27:11] - NanoVar started
[24/07/2024 15:27:11] - Checking integrity of input files - Pass
[24/07/2024 15:27:12] - Analyzing read alignments and detecting SVs - Done
[24/07/2024 15:52:39] - Clustering SV breakends - Done
[24/07/2024 16:08:29] - Correcting DUP and detecting TE - Done
[24/07/2024 16:13:00] - Generating VCF files and report -
Traceback (most recent call last):
  File "/home/chiyong/miniconda3/envs/nanovar/bin/nanovar", line 635, in <module>
    main()
  File "/home/chiyong/miniconda3/envs/nanovar/bin/nanovar", line 486, in main
    run.vcf_report() #
    ^^^^^^^^^^^^^^^^
  File "/home/chiyong/miniconda3/envs/nanovar/lib/python3.11/site-packages/nanovar/nv_characterize.py", line 205, in vcf_report
    alt_seq = get_alt_seq(self.dir, self.out_nn, self.refpath)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/chiyong/miniconda3/envs/nanovar/lib/python3.11/site-packages/nanovar/nv_alt_seq.py", line 46, in get_alt_seq
    fasta = bed.sequence(fi=ref_path, nameOnly=True)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/chiyong/miniconda3/envs/nanovar/lib/python3.11/site-packages/pybedtools/bedtool.py", line 907, in decorated
    result = method(self, *args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/chiyong/miniconda3/envs/nanovar/lib/python3.11/site-packages/pybedtools/bedtool.py", line 388, in wrapped
    stream = call_bedtools(
             ^^^^^^^^^^^^^^
  File "/home/chiyong/miniconda3/envs/nanovar/lib/python3.11/site-packages/pybedtools/helpers.py", line 456, in call_bedtools
    raise BEDToolsError(subprocess.list2cmdline(cmds), stderr)
pybedtools.helpers.BEDToolsError: Command was:

bedtools getfasta -nameOnly -fo /tmp/pybedtools.fvq2ujze.tmp -fi 03_genome/genome_NV/genome1.3.fna -bed /tmp/pybedtools.ml_5c4kf.tmp

Error message was: Feature (LR584250.1:14620732-14620733) beyond the length of LR584250.1 size (12913240 bp). Skipping.

Writing to /tmp/bcftools.VhKjGh

LaurieLecomte commented 4 months ago

Hi!

I ran into the exact same issue yesterday, after updating to NanoVar version 1.7.0. In fact, I replied to issue #89, which you opened in the NanoVar repo, before noticing you had commented here as well.

I suspect it has something to do with one of the latest updates, since I was able to run NanoVar 1.3.8 on the same dataset without any issue. I would therefore suggest waiting for guidance from @cytham.

In the meantime, as a temporary workaround, I would suggest reverting to NanoVar 1.3.8 in a new conda environment. You could try creating the environment from the file I pushed to the repo (conda env create --file NanoVar1.3.8_env.txt).
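
If it helps to see how many breakends are affected, the "Feature ... beyond the length of ..." message from bedtools getfasta can be reproduced outside NanoVar by comparing interval coordinates with the contig lengths in the reference's .fai index (the index written by samtools faidx). The following is only a rough sketch, not part of NanoVar or of this pipeline: the function names are made up for illustration, the example coordinates are copied from the error message above, and the last three .fai columns in the example are placeholders.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def read_fai_lengths(fai_lines: Iterable[str]) -> Dict[str, int]:
    """Map contig name -> length, using the first two columns of a .fai index."""
    lengths = {}
    for line in fai_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2:
            lengths[fields[0]] = int(fields[1])
    return lengths


def out_of_range(
    bed_lines: Iterable[str], lengths: Dict[str, int]
) -> List[Tuple[str, int, int, Optional[int]]]:
    """Return (chrom, start, end, contig_length) for BED intervals that end
    past the contig. Contigs absent from the index get a length of None."""
    bad = []
    for line in bed_lines:
        line = line.rstrip("\n")
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue  # skip blank lines and BED header lines
        chrom, start, end = line.split("\t")[:3]
        size = lengths.get(chrom)
        if size is None or int(end) > size:
            bad.append((chrom, int(start), int(end), size))
    return bad


# Example with the values from the error message (offset columns are placeholders).
fai = ["LR584250.1\t12913240\t12\t60\t61\n"]
bed = ["LR584250.1\t14620732\t14620733\n", "LR584250.1\t100\t200\n"]
lengths = read_fai_lengths(fai)
bad = out_of_range(bed, lengths)
print(bad)
```

With real files, the two lists would be replaced by open file handles for the .fai index and for a BED file of breakend positions. Intervals flagged this way only show where coordinates and reference disagree; they do not say why.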

C-YONG commented 4 months ago

Thank you for your reply. I'll try it right away.

C-YONG commented 4 months ago

I have a new problem. Have you run into it too?

[26/07/2024 17:32:06] - NanoVar started
Checking integrity of input files - \
Indexing genome and aligning reads - \
Analyzing read alignments and detecting SVs - -
2024-07-26 17:32:08.962744: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. To enable the following instructions: SSE4.1 SSE4.2 AVX AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
Analyzing read alignments and detecting SVs - /
/home/chiyong/miniconda3/envs/nanovar2/lib/python3.8/site-packages/nanovar/nv_cov_upper.py:132: UserWarning: FixedFormatter should only be used together with FixedLocator
  ax.set_yticklabels(['{:,.1%}'.format(x) for x in vals])
Analyzing read alignments and detecting SVs - |
Clustering SV breakends - -
Re-evaluating SVs with BLAST and inferencing - |
Traceback (most recent call last):
  File "/home/chiyong/miniconda3/envs/nanovar2/bin/nanovar", line 479, in <module>
    main()
  File "/home/chiyong/miniconda3/envs/nanovar2/bin/nanovar", line 339, in main
    run.cluster_nn(add_out=sub_run.total_out)
  File "/home/chiyong/miniconda3/envs/nanovar2/lib/python3.8/site-packages/nanovar/nv_characterize.py", line 123, in cluster_nn
    cluster_outins, = sv_cluster(self.total_subdata, sub_out, self.buff, self.maxovl, self.mincov, self.contig, True,
  File "/home/chiyong/miniconda3/envs/nanovar2/lib/python3.8/site-packages/nanovar/nv_cluster.py", line 49, in sv_cluster
    readteam, infodict, classdict, mainclass, svsizedict = rangecollect(parse,
  File "/home/chiyong/miniconda3/envs/nanovar2/lib/python3.8/site-packages/nanovar/nv_cluster.py", line 165, in rangecollect
    readteam, mainclass = cluster(leftchrnamelist, leftchrcoorlist, rightchrnamelist, rightchrcoorlist, buf, svsizedict,
  File "/home/chiyong/miniconda3/envs/nanovar2/lib/python3.8/site-packages/nanovar/nv_cluster.py", line 270, in cluster
    raise Exception("Read %s has none or more than 2 clusters" % read)
Exception: Read f3efa47a-21f0-42b8-90a7-62759f3b1849~34O1B has none or more than 2 clusters

LaurieLecomte commented 4 months ago

Hi! Yes, I remember having this issue with one of my BAM files recently (PacBio CCS reads) when I ran NanoVar with 10 CPUs. I reduced to 5 CPUs and everything went smoothly, so I suspect it might be a memory issue, as suggested in issue #25 (but see also issue #13).

You could try setting the CPU variable to CPU=5 in the 03.1_nanovar_call.sh script.
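
If you would rather not edit the value by hand for each sample, the same "lower the CPU count and retry" idea could be automated roughly as below. This is a sketch under assumptions, not something the pipeline provides: `run_with_fewer_threads` and `build_cmd` are hypothetical names, the command must exit with a non-zero status on failure, and every retry starts the run again from scratch. The example uses a fake runner so it can be tried without NanoVar installed.

```python
import subprocess
from typing import Callable, Optional, Sequence


def _run(cmd: Sequence[str]) -> int:
    """Default runner: execute the command and return its exit status."""
    return subprocess.run(list(cmd)).returncode


def run_with_fewer_threads(
    build_cmd: Callable[[int], Sequence[str]],
    start_threads: int,
    min_threads: int = 1,
    runner: Callable[[Sequence[str]], int] = _run,
) -> Optional[int]:
    """Run build_cmd(threads), halving the thread count after each failure.

    Returns the thread count that succeeded, or None if every attempt failed.
    """
    threads = start_threads
    while threads >= min_threads:
        if runner(build_cmd(threads)) == 0:
            return threads
        if threads == min_threads:
            break  # already at the floor, nothing left to try
        threads = max(min_threads, threads // 2)
    return None


def fake_runner(cmd: Sequence[str]) -> int:
    """Stand-in for a real job: pretend it only succeeds with 5 threads or fewer."""
    return 0 if int(cmd[-1]) <= 5 else 1


# Hypothetical command builder; a real one would return the actual call with
# the thread count substituted in.
result = run_with_fewer_threads(lambda t: ["echo", "threads", str(t)], 10, runner=fake_runner)
print(result)
```

Starting from 10, the fake job fails once and then succeeds at 5, which mirrors what I did manually. With a real command, leave out the `runner` argument so the default `subprocess.run` call is used.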