wdecoster / NanoPlot

Plotting scripts for long read sequencing data
http://nanoplot.bioinf.be
MIT License

TypeError: slice indices must be integers or None or have an __index__ method #21

Closed chienchi closed 6 years ago

chienchi commented 6 years ago

Hi,

I am trying to run NanoPlot on a 1D MinION result, but I get the following error. Could you please help me with this? Thanks.

I installed Python 3 through Anaconda and used pip to install NanoPlot with the command pip install Nanoplot --upgrade

Below is the log file.

2017-10-30 16:50:39,458 NanoPlot 0.24.0 started with arguments Namespace(alength=False, bam=None, barcoded=False, color='#4CB391', downsample=None, drop_outliers=False, fastq=['Mock.all.pass.fastq'], fastq_minimal=None, fastq_rich=None, format='png', listcolors=False, loglength=False, maxlength=None, minqual=None, outdir='.', plots=['kde', 'hex', 'dot'], prefix='', readtype='1D', summary=None, threads=4, verbose=False)
2017-10-30 16:50:39,459 Python version is: 3.5.2 |Anaconda 4.1.1 (64-bit)| (default, Jul  2 2016, 17:53:06)  [GCC 4.4.7 20120313 (Red Hat 4.4.7-1)]
2017-10-30 16:50:39,462 Nanoplotter: valid output format png
2017-10-30 16:50:39,527 Nanoget: Starting to collect statistics from plain fastq file.
2017-10-30 17:13:50,741 Nanoget: Finished collecting statistics from plain fastq file.
2017-10-30 17:13:53,297 Nanoget: Gathered all metrics
2017-10-30 17:13:53,509 Calculated statistics
2017-10-30 17:13:53,509 Using sequenced read lengths for plotting.
2017-10-30 17:13:53,509 Processed the reads, optionally filtered. 869936 reads left
2017-10-30 17:13:53,509 Nanoplotter: Valid color #4CB391.
2017-10-30 17:13:53,584 Nanoplotter: Creating length plots for Read length.
2017-10-30 17:13:53,584 Nanoplotter: Using 869936 reads with read length N50 of 3636.
2017-10-30 17:13:53,871 slice indices must be integers or None or have an __index__ method
Traceback (most recent call last):
  File "/users/218819/scratch/apps/Anaconda3/lib/python3.5/site-packages/nanoplot/NanoPlot.py", line 61, in main
    make_plots(datadf, settings, args)
  File "/users/218819/scratch/apps/Anaconda3/lib/python3.5/site-packages/nanoplot/NanoPlot.py", line 260, in make_plots
    log=settings["logBool"])
  File "/users/218819/scratch/apps/Anaconda3/lib/python3.5/site-packages/nanoplotter/nanoplotter.py", line 250, in length_plots
    kde_kws={"label": name, "clip": (0, maxvalx)})
  File "/users/218819/scratch/apps/Anaconda3/lib/python3.5/site-packages/seaborn/distributions.py", line 224, in distplot
    kdeplot(a, vertical=vertical, ax=ax, color=kde_color, **kde_kws)
  File "/users/218819/scratch/apps/Anaconda3/lib/python3.5/site-packages/seaborn/distributions.py", line 657, in kdeplot
    cumulative=cumulative, **kwargs)
  File "/users/218819/scratch/apps/Anaconda3/lib/python3.5/site-packages/seaborn/distributions.py", line 273, in _univariate_kdeplot
    cumulative=cumulative)
  File "/users/218819/scratch/apps/Anaconda3/lib/python3.5/site-packages/seaborn/distributions.py", line 345, in _statsmodels_univariate_kde
    kde.fit(kernel, bw, fft, gridsize=gridsize, cut=cut, clip=clip)
  File "/users/218819/scratch/apps/Anaconda3/lib/python3.5/site-packages/statsmodels/nonparametric/kde.py", line 146, in fit
    clip=clip, cut=cut)
  File "/users/218819/scratch/apps/Anaconda3/lib/python3.5/site-packages/statsmodels/nonparametric/kde.py", line 506, in kdensityfft
    f = revrt(zstar)
  File "/users/218819/scratch/apps/Anaconda3/lib/python3.5/site-packages/statsmodels/nonparametric/kdetools.py", line 20, in revrt
    y = X[:m/2+1] + np.r_[0,X[m/2+1:],0]*1j
TypeError: slice indices must be integers or None or have an __index__ method

wdecoster commented 6 years ago

Hi chienchi,

Thanks for reporting this. It's not obvious to me what causes this error, but I came across two threads about the same error in seaborn, which nanoplotter uses to draw most of its plots: https://github.com/mwaskom/seaborn/issues/1103 and https://github.com/mwaskom/seaborn/issues/1092
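
The last frame of the traceback, `X[:m/2+1]` in statsmodels' `revrt`, points at a likely cause: in Python 3, `/` is true division and returns a float, which cannot be used as a slice index. Below is a minimal sketch of the failure and of the usual fix with floor division. A plain list stands in for the array, and the variable names are illustrative, not statsmodels code.

```python
X = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]  # stand-in for the array in revrt
m = len(X)

# Python 3: m / 2 + 1 evaluates to the float 5.0, which is not a valid slice index.
try:
    X[: m / 2 + 1]
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Floor division keeps the index an integer, so the slice works.
head = X[: m // 2 + 1]

print(error_message)
print(head)
```

This is consistent with the suggestion to update statsmodels, since a release that uses integer division at this line would no longer raise the error.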

Could you tell me which versions of statsmodels and scipy you have installed? You can run these commands in your terminal:
python -c "import statsmodels ; print(statsmodels.__version__) "
python -c "import scipy ; print(scipy.__version__) "

Please have a look at the suggested fixes:

Update the statsmodels module: pip install -U statsmodels
If that doesn't work, you could try updating the scipy module: pip install -U scipy

If I know which versions lead to the error, I can make sure NanoPlot depends on the right ones.

Hope this helps.

Cheers, Wouter

chienchi commented 6 years ago

Hi Wouter,

Thanks for the quick response and the great tool for the Nanopore data.

Sorry, I cannot report the versions of statsmodels and scipy that produced the error. I removed anaconda3 (I suspected it had too many things installed), used conda for a fresh Python 3 install in a new virtual env, py36, and ran pip install NanoPlot. With that setup the following command ran successfully and generated the stats plots and the summary text file: NanoPlot --fastq Mock.all.pass.fastq

In the virtual env py36, statsmodels does not seem to be installed. Here is the output of the commands you suggested.

(py36) [218819@seq-fe3 Mock]$ python -c "import statsmodels ; print(statsmodels.__version__) "
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'statsmodels'
(py36) [218819@seq-fe3 Mock]$ python -c "import scipy ; print(scipy.__version__) "
1.0.0
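
The two one-liners can also be folded into a single check that reports a missing package instead of stopping with a traceback, as happened with statsmodels here. This is a small sketch; `report_versions` is a hypothetical helper name, not part of NanoPlot.

```python
import importlib


def report_versions(names):
    """Map each module name to its version string, or None if it cannot be imported."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # ModuleNotFoundError is a subclass of ImportError.
            versions[name] = None
            continue
        # Not every module defines __version__, so fall back to a marker.
        versions[name] = getattr(module, "__version__", "unknown")
    return versions


for name, version in report_versions(["statsmodels", "scipy"]).items():
    print(name, version if version is not None else "not installed")
```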

I get another error message when I add the --loglength flag in the same env: NanoPlot -t 4 --loglength --fastq Mock.all.pass.fastq

2017-10-31 10:04:36,532 NanoPlot 0.24.0 started with arguments Namespace(alength=False, bam=None, barcoded=False, color='#4CB391', downsample=None, drop_outliers=False, fastq=['Mock.all.pass.fastq'], fastq_minimal=None, fastq_rich=None, format='png', listcolors=False, loglength=True, maxlength=None, minqual=None, outdir='.', plots=['kde', 'hex', 'dot'], prefix='', readtype='1D', summary=None, threads=4, verbose=False)
2017-10-31 10:04:36,533 Python version is: 3.6.2 |Continuum Analytics, Inc.| (default, Jul 20 2017, 13:51:32)  [GCC 4.4.7 20120313 (Red Hat 4.4.7-1)]
2017-10-31 10:04:36,535 Nanoplotter: valid output format png
2017-10-31 10:04:36,554 Nanoget: Starting to collect statistics from plain fastq file.
2017-10-31 10:27:34,788 Nanoget: Finished collecting statistics from plain fastq file.
2017-10-31 10:27:39,503 Nanoget: Gathered all metrics
2017-10-31 10:27:39,668 Calculated statistics
2017-10-31 10:27:39,669 Using sequenced read lengths for plotting.
2017-10-31 10:27:39,740 Using Log10 scaled read lengths.
2017-10-31 10:27:39,740 Processed the reads, optionally filtered. 869936 reads left
2017-10-31 10:27:39,741 Nanoplotter: Valid color #4CB391.
2017-10-31 10:27:39,818 Nanoplotter: Creating length plots for Read length.
2017-10-31 10:27:39,818 Nanoplotter: Using 869936 reads with read length N50 of 3636.
2017-10-31 10:27:51,658 `bins` should be a positive integer.
Traceback (most recent call last):
  File "/scratch-218819/apps/anaconda2/envs/py36/lib/python3.6/site-packages/nanoplot/NanoPlot.py", line 61, in main
    make_plots(datadf, settings, args)
  File "/scratch-218819/apps/anaconda2/envs/py36/lib/python3.6/site-packages/nanoplot/NanoPlot.py", line 260, in make_plots
    log=settings["logBool"])
  File "/scratch-218819/apps/anaconda2/envs/py36/lib/python3.6/site-packages/nanoplotter/nanoplotter.py", line 264, in length_plots
    color=color)
  File "/scratch-218819/apps/anaconda2/envs/py36/lib/python3.6/site-packages/seaborn/distributions.py", line 218, in distplot
    color=hist_color, **hist_kws)
  File "/scratch-218819/apps/anaconda2/envs/py36/lib/python3.6/site-packages/matplotlib/__init__.py", line 1710, in inner
    return func(ax, *args, **kwargs)
  File "/scratch-218819/apps/anaconda2/envs/py36/lib/python3.6/site-packages/matplotlib/axes/_axes.py", line 6207, in hist
    m, bins = np.histogram(x[i], bins, weights=w[i], **hist_kwargs)
  File "/scratch-218819/apps/anaconda2/envs/py36/lib/python3.6/site-packages/numpy/lib/function_base.py", line 717, in histogram
    '`bins` should be a positive integer.')
ValueError: `bins` should be a positive integer.

wdecoster commented 6 years ago

As far as I understand, statsmodels is used if available, and otherwise scipy is used. But the error is gone, so that's nice.
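
A minimal sketch of that kind of optional-dependency fallback, assuming a hypothetical helper `first_available` (this is not seaborn's actual code):

```python
import importlib.util


def first_available(candidates):
    """Return the first importable module name from candidates, or None."""
    for name in candidates:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is not None:
            return name
    return None


# Prefer statsmodels, fall back to scipy.
backend = first_available(["statsmodels", "scipy"])
print("KDE backend:", backend)
```

In an env without statsmodels, such as the py36 env above, this selection would land on scipy, so the failing statsmodels code path is never reached.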

Your new issue looks like one from earlier today https://github.com/wdecoster/NanoPlot/issues/22 See if updating nanoplotter solves your problem as well.

Cheers, Wouter

chienchi commented 6 years ago

Thanks. After updating nanoplotter, the problem is solved.

Chienchi

wdecoster commented 6 years ago

Good to hear. Please let me know if you come across other issues or have suggestions/feature requests.

smm19900210 commented 6 years ago

Hi wdecoster, I ran into the same problem. My command is: NanoPlot --fastq ../test.fastq --plots hex dot
Below is the log file.

Traceback (most recent call last):
  File "/share/nas2/genome/biosoft/Python/3.4.3/bin/NanoPlot", line 11, in <module>
    load_entry_point('NanoPlot==1.0.0', 'console_scripts', 'NanoPlot')()
  File "/share/nas2/genome/biosoft/Python/3.4.3/lib/python3.4/site-packages/NanoPlot-1.0.0-py3.4.egg/nanoplot/NanoPlot.py", line 62, in main
    nanomath.write_stats(datadf, settings["path"] + "NanoStats.txt")
  File "/share/nas2/genome/biosoft/Python/3.4.3/lib/python3.4/site-packages/nanomath/nanomath.py", line 136, in write_stats
    stats = [Stats(df) for df in datadfs]
  File "/share/nas2/genome/biosoft/Python/3.4.3/lib/python3.4/site-packages/nanomath/nanomath.py", line 136, in <listcomp>
    stats = [Stats(df) for df in datadfs]
  File "/share/nas2/genome/biosoft/Python/3.4.3/lib/python3.4/site-packages/nanomath/nanomath.py", line 31, in __init__
    self.number_of_bases = np.sum(df["lengths"])
TypeError: string indices must be integers
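
One possible reading of this traceback, not confirmed in the thread: NanoPlot passes a single `datadf`, while `write_stats` loops over `datadfs`. Iterating over one pandas DataFrame yields its column labels (strings) rather than DataFrames, so `df["lengths"]` ends up indexing a string. The sketch below reproduces the same mechanism with a dict standing in for the DataFrame, since iterating a dict also yields its keys. `total_bases` is a hypothetical stand-in, not the nanomath function.

```python
# Stand-in for a single DataFrame: iterating it yields the column names.
table = {"lengths": [100, 250], "quals": [9.1, 11.3]}


def total_bases(tables):
    """Expects a list of tables and sums the 'lengths' column of each."""
    return [sum(t["lengths"]) for t in tables]


# Passing one table instead of a list: the loop sees the strings "lengths"
# and "quals", and "lengths"["lengths"] raises the TypeError.
try:
    total_bases(table)
    message = None
except TypeError as exc:
    message = str(exc)

# Wrapping the single table in a list gives the function what it expects.
totals = total_bases([table])

print(message)
print(totals)
```

If this is what happens here, the installed NanoPlot and nanomath versions may not match, and updating both together would be worth trying.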