cytham / nanovar

Structural variant caller for low-depth long-read sequencing data
GNU General Public License v3.0

Boundaries does not match error #1

asmariyaz23 closed this issue 4 years ago

asmariyaz23 commented 4 years ago

Hello,

I am running NanoVar but ran into the error below. Could you help me understand how to resolve it?

nanovar -t 24 /scratch/try_pipeline/trimmed_reads/trimmed.fastq /scratch/RESOURCES/hg38/hg38.fa /scratch/try_pipeline/nanovar

[12/12/2019 07:51:06] - NanoVar started
Mapping reads and calling SVs - /Traceback (most recent call last):
  File "/scratch/autism_minion/Long-reads/util/nanovar", line 188, in <module>
    main()
  File "/scratch/autism_minion/Long-reads/util/nanovar", line 174, in main
    run.cluster_nn()
  File "/root/.pyenv/versions/3.6.6/lib/python3.6/site-packages/nanovar/nv_characterize.py", line 108, in cluster_nn
    self.maxovl, self.depth = ovl_upper(self.gsize, self.contig, self.basecov, self.total_subdata, self.dir)
  File "/root/.pyenv/versions/3.6.6/lib/python3.6/site-packages/nanovar/nv_cov_upper.py", line 62, in ovl_upper
    curve(data2, n, round((medad*6) + med, 0), wk_dir)
  File "/root/.pyenv/versions/3.6.6/lib/python3.6/site-packages/nanovar/nv_cov_upper.py", line 121, in curve
    spl = make_interp_spline(c, y)
  File "/root/.pyenv/versions/3.6.6/lib/python3.6/site-packages/scipy/interpolate/_bsplines.py", line 827, in make_interp_spline
    "match: expected %s, got %s+%s" % (nt-n, nleft, nright))
ValueError: The number of derivatives at boundaries does not match: expected 3, got 0+0

Here is a portion of the log file generated by NanoVar:

[12/12/2019 07:51:06] - INFO - Initialize NanoVar log file
[12/12/2019 07:51:06] - INFO - Version: NanoVar-1.2.6
[12/12/2019 07:51:06] - INFO - Command: /scratch/autism_minion/Long-reads/util/nanovar -t 24 /scratch/try_pipeline/trimmed_reads/trimmed.fastq /scratch/RESOURCES/hg38/hg38.fa /scratch/try_pipeline/nanovar
[12/12/2019 07:51:06] - DEBUG - Input FASTQ/FASTA file passed
[12/12/2019 07:51:06] - INFO - Reads: /scratch/try_pipeline/trimmed_reads/trimmed.fastq
[12/12/2019 07:51:06] - INFO - Reference genome: /scratch/RESOURCES/hg38/hg38.fa
[12/12/2019 07:51:06] - INFO - Working directory: /scratch/try_pipeline/nanovar
[12/12/2019 07:51:06] - INFO - Filter file: None
[12/12/2019 07:51:06] - INFO - Minimum SV len: 25
[12/12/2019 07:51:06] - INFO - Mapping percent for split-read: 0.05
[12/12/2019 07:51:06] - INFO - Length buffer for clustering: 50
[12/12/2019 07:51:06] - INFO - Score threshold: 2.6
[12/12/2019 07:51:06] - INFO - Number of threads: 23

[12/12/2019 07:51:06] - INFO - Total number of reads in FASTQ/FASTA: 8000

[12/12/2019 07:51:06] - INFO - NanoVar started
[12/12/2019 07:51:36] - DEBUG - Make blast index skipped
[12/12/2019 07:51:36] - DEBUG - Windowmasker counts skipped
[12/12/2019 07:51:36] - DEBUG - Windowmasker obinary skipped
[12/12/2019 07:51:36] - DEBUG - Make hs-blastn FMD-index skipped
[12/12/2019 07:51:36] - DEBUG - hs-blastn alignment skipped
[12/12/2019 07:51:36] - INFO - Parsing and detecting SVs
[12/12/2019 07:51:38] - INFO - Gap dictionary not loaded.
[12/12/2019 07:51:38] - INFO - Parsing blast output read 1
[12/12/2019 07:51:38] - INFO - Detecting SV for entry 1
[12/12/2019 07:51:38] - INFO - SV breakend not found, skipping entry 1
[12/12/2019 07:51:38] - INFO - Parsing blast output read 2
[12/12/2019 07:51:38] - INFO - Detecting SV for entry 2
[12/12/2019 07:51:38] - INFO - SV breakend not found, skipping entry 2
[12/12/2019 07:51:38] - INFO - Parsing blast output read 3
[12/12/2019 07:51:38] - INFO - Detecting SV for entry 3
[12/12/2019 07:51:38] - INFO - SV breakend not found, skipping entry 3
[12/12/2019 07:51:38] - INFO - Parsing blast output read 4
[12/12/2019 07:51:38] - INFO - Detecting SV for entry 4
[12/12/2019 07:51:38] - INFO - SV breakend not found, skipping entry 4
[12/12/2019 07:51:38] - INFO - Parsing blast output read 5
[12/12/2019 07:51:38] - INFO - Detecting SV for entry 5
[12/12/2019 07:51:38] - INFO - SV breakend not found, skipping entry 5
[12/12/2019 07:51:38] - INFO - Parsing blast output read 6
[12/12/2019 07:51:38] - INFO - Detecting SV for entry 6
[12/12/2019 07:51:38] - INFO - SV breakend not found, skipping entry 6
[12/12/2019 07:51:38] - INFO - Parsing blast output read 7
[12/12/2019 07:51:38] - INFO - Detecting SV for entry 7
[12/12/2019 07:51:38] - INFO - SV breakend not found, skipping entry 7
[12/12/2019 07:51:38] - INFO - Parsing blast output read 8
[12/12/2019 07:51:38] - INFO - Detecting SV for entry 8
[12/12/2019 07:51:38] - INFO - SV breakend not found, skipping entry 8
[12/12/2019 07:51:38] - INFO - Parsing blast output read 9
[12/12/2019 07:51:38] - INFO - Detecting SV for entry 9
[12/12/2019 07:51:38] - INFO - SV breakend not found, skipping entry 9
[12/12/2019 07:51:38] - INFO - Parsing blast output read 10
[12/12/2019 07:51:38] - INFO - Detecting SV for entry 10
[12/12/2019 07:51:38] - INFO - SV breakend not found, skipping entry 10
[12/12/2019 07:51:38] - INFO - Parsing blast output read 11
[12/12/2019 07:51:38] - INFO - Detecting SV for entry 11
[12/12/2019 07:51:38] - INFO - SV breakend not found, skipping entry 11
[12/12/2019 07:51:38] - INFO - Parsing blast output read 12
[12/12/2019 07:51:38] - INFO - Detecting SV for entry 12
[12/12/2019 07:51:38] - INFO - SV breakend not found, skipping entry 12
[12/12/2019 07:51:38] - INFO - Parsing blast output read 13

Here is a listing of the files in the NanoVar output directory:

ls -lh
total 822M
drwxr-xr-x. 2 root root 4.0K Dec 12 07:15 fig
-rw-r--r--. 1 root root  12K Dec 12 07:51 genome.sizes
-rw-r--r--. 1 root root 378M Dec 12 07:40 hg38.counts
-rw-r--r--. 1 root root 440M Dec 12 07:41 hg38.counts.obinary
-rw-r--r--. 1 root root 1.3M Dec 12 07:42 NanoVar-121219-0715.log
-rw-r--r--. 1 root root 1.3M Dec 12 07:51 NanoVar-121219-0751.log
-rw-r--r--. 1 root root 2.4M Dec 12 07:41 trimmed-hg38-blast.tab

Thank you, Asma

cytham commented 4 years ago

Hi Asma,

Thank you for using NanoVar and reporting the issue.

This is a bug that happens when the FASTA input is very small. I will fix it in the next update, which should be out in the coming days. Please reinstall and try the new version once it is released.

Thanks! cy
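
For readers who hit the same traceback: the `ValueError` comes from SciPy's `make_interp_spline`, which fits a cubic spline (`k=3`) by default, and an interpolating spline of degree `k` needs at least `k + 1` data points. Below is a minimal sketch of a pre-check that could be run before the spline call. `enough_points_for_spline` is a hypothetical helper, not NanoVar code, and this thread does not show how NanoVar itself fixed the bug.

```python
from typing import Sequence


def enough_points_for_spline(x: Sequence[float], k: int = 3) -> bool:
    """Return True if x looks usable for a degree-k interpolating spline.

    Hypothetical helper for illustration only (not part of NanoVar or SciPy).
    """
    # A degree-k interpolating spline needs at least k + 1 data points.
    if len(x) < k + 1:
        return False
    # Conservative extra check: require strictly increasing x values.
    return all(a < b for a, b in zip(x, x[1:]))


# A single data point is not enough for a cubic fit, so skip the spline.
print(enough_points_for_spline([0.0]))                 # False
# Four increasing points are the minimum for k=3.
print(enough_points_for_spline([0.0, 1.0, 2.0, 3.0]))  # True
```

A caller would only invoke `make_interp_spline(x, y)` when the check returns `True`, and otherwise fall back to something simpler, such as using the raw values.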

asmariyaz23 commented 4 years ago

Hi Cy,

Thank you for your response. Could you please let me know the minimum FASTA input (number of reads) that is required?

Best, Asma

cytham commented 4 years ago

Hi Asma, the minimum number of reads required depends on your read lengths, because the bug is caused by very low total coverage of the reference genome. In any case, I have fixed the bug in the new version, 1.2.7. Please reinstall and try it. Thanks! cy
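
To make the link between read count, read length and coverage concrete, here is a rough sketch of the usual estimate: mean coverage is the total number of sequenced bases divided by the genome size. The function names are hypothetical, the 10,000 bp read length is an assumed value (the thread does not give read lengths), and 3.1 Gb is only an approximate size for the human genome. The 8000 reads figure is taken from the log above.

```python
from typing import Iterable, Iterator


def fastq_read_lengths(lines: Iterable[str]) -> Iterator[int]:
    """Yield read lengths from FASTQ text with four lines per record."""
    for i, line in enumerate(lines):
        if i % 4 == 1:  # the second line of each record is the sequence
            yield len(line.strip())


def mean_coverage(read_lengths: Iterable[int], genome_size: int) -> float:
    """Estimate mean depth as total sequenced bases / genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return sum(read_lengths) / genome_size


# Assumed values for illustration: 8000 reads of 10,000 bp each,
# against an approximate human genome size of 3.1 Gb.
cov = mean_coverage([10_000] * 8000, 3_100_000_000)
print(f"{cov:.3f}x")  # 0.026x
```

For a real run, the lengths can be taken from the input file, for example `mean_coverage(fastq_read_lengths(open(path)), genome_size)`, where `path` and `genome_size` are supplied by the user.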

asmariyaz23 commented 4 years ago

Thank you Cy! I will check it out. I supplied the entire FASTQ file (not just a portion of it) as is, and NanoVar ran successfully. My coverage is low, so your explanation makes sense.

cytham commented 4 years ago

You are welcome! Thanks for using the tool, and feel free to contact me if you run into any more problems.