schmeing / gapless

Gapless provides combined scaffolding, gap-closing and assembly correction with long reads
MIT License

pipeline crashed : scaffold (key error) #10

Open margaretc-ho opened 1 year ago

margaretc-ho commented 1 year ago

Hi @schmeing

I've tried troubleshooting with different conda installations and I can't get Gapless to run on my assembly.

The command that fails with "pipeline crashed : scaffold" is:

gapless.sh -j 10 -i /hpcdata/bcbb/homc/Tcas_filtcontigs.fa -t pb_hifi /hpcdata/bcbb/homc/Tcas_PacBio/PACBIO_DATA/XECAF_20221215_S64018_PL100274063-1_A01.ccs.fastq

From gapless_scaffold.log:

0:00:04.577490 Reading in original assembly
0:00:04.777931 Loading repeats
0:00:05.460704 Filtering mappings
Traceback (most recent call last):
  File "/hpcdata/bcbb/homc/conda_envs/envs/gapless/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 3621, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 136, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 163, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 5198, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 5206, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'y'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/hpcdata/bcbb/homc/conda_envs/envs/gapless/bin/gapless.py", line 13327, in <module>
    main(sys.argv[1:])
  File "/hpcdata/bcbb/homc/conda_envs/envs/gapless/bin/gapless.py", line 13156, in main
    GaplessScaffold(args[0], args[1], args[2], min_mapq, min_mapping_length, min_length_contig_break, prefix, stats)
  File "/hpcdata/bcbb/homc/conda_envs/envs/gapless/bin/gapless.py", line 9069, in GaplessScaffold
    mappings, cov_counts, cov_probs, read_names, read_len = ReadMappings(mapping_file, contig_ids, min_mapq, min_mapping_length, keep_all_subreads, alignment_precision, num_read_len_groups, pdf)
  File "/hpcdata/bcbb/homc/conda_envs/envs/gapless/bin/gapless.py", line 427, in ReadMappings
    cov_probs = GetCoverageProbabilities(cov_counts, pdf)
  File "/hpcdata/bcbb/homc/conda_envs/envs/gapless/bin/gapless.py", line 386, in GetCoverageProbabilities
    PlotXY(pdf, "Coverage", "% Bins (Size: {})".format(bsize), probs.loc[selection2, 'count'], probs.loc[selection2, 'nbins']/probs.loc[selection2, 'nbin_sum']*100, linex=x_values, liney=nbinom.pmf(x_values,opt_par[0],opt_par[1])*100)
  File "/hpcdata/bcbb/homc/conda_envs/envs/gapless/bin/gapless.py", line 285, in PlotXY
    sns.lineplot(x=linex, y=liney, color='red', linewidth=2.5)
  File "/hpcdata/bcbb/homc/conda_envs/envs/gapless/lib/python3.10/site-packages/seaborn/relational.py", line 645, in lineplot
    p.plot(ax, kwargs)
  File "/hpcdata/bcbb/homc/conda_envs/envs/gapless/lib/python3.10/site-packages/seaborn/relational.py", line 459, in plot
    lines = ax.plot(sub_data["x"], sub_data["y"], **kws)
  File "/hpcdata/bcbb/homc/conda_envs/envs/gapless/lib/python3.10/site-packages/pandas/core/frame.py", line 3505, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/hpcdata/bcbb/homc/conda_envs/envs/gapless/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 3623, in get_loc
    raise KeyError(key) from err

Output from conda list:

# packages in environment at /hpcdata/bcbb/homc/conda_envs/envs/gapless:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       2_gnu    conda-forge
alsa-lib                  1.2.8                h166bdaf_0    conda-forge
attr                      2.5.1                h166bdaf_1    conda-forge
biopython                 1.81            py310h1fa729e_0    conda-forge
brotli                    1.0.9                h166bdaf_8    conda-forge
brotli-bin                1.0.9                h166bdaf_8    conda-forge
bzip2                     1.0.8                h7f98852_4    conda-forge
ca-certificates           2023.5.7             hbcca054_0    conda-forge
cairo                     1.16.0            hbbf8b49_1016    conda-forge
certifi                   2023.5.7           pyhd8ed1ab_0    conda-forge
cycler                    0.11.0             pyhd8ed1ab_0    conda-forge
dbus                      1.13.6               h5008d03_3    conda-forge
expat                     2.5.0                hcb278e6_1    conda-forge
font-ttf-dejavu-sans-mono 2.37                 hab24e00_0    conda-forge
font-ttf-inconsolata      3.000                h77eed37_0    conda-forge
font-ttf-source-code-pro  2.038                h77eed37_0    conda-forge
font-ttf-ubuntu           0.83                 hab24e00_0    conda-forge
fontconfig                2.14.2               h14ed4e7_0    conda-forge
fonts-conda-ecosystem     1                             0    conda-forge
fonts-conda-forge         1                             0    conda-forge
fonttools                 4.39.4          py310h2372a71_0    conda-forge
freetype                  2.12.1               hca18f0e_1    conda-forge
gapless                   0.4                  hdfd78af_0    bioconda
gettext                   0.21.1               h27087fc_0    conda-forge
glib                      2.76.3               hfc55251_0    conda-forge
glib-tools                2.76.3               hfc55251_0    conda-forge
graphite2                 1.3.13            h58526e2_1001    conda-forge
gst-plugins-base          1.22.3               h938bd60_1    conda-forge
gstreamer                 1.22.3               h977cf35_1    conda-forge
harfbuzz                  7.3.0                hdb3a94d_0    conda-forge
icu                       72.1                 hcb278e6_0    conda-forge
k8                        0.2.5                hdcf5f25_4    bioconda
keyutils                  1.6.1                h166bdaf_0    conda-forge
kiwisolver                1.4.4           py310hbf28c38_1    conda-forge
krb5                      1.20.1               h81ceb04_0    conda-forge
lame                      3.100             h166bdaf_1003    conda-forge
lcms2                     2.15                 haa2dc70_1    conda-forge
ld_impl_linux-64          2.40                 h41732ed_0    conda-forge
lerc                      4.0.0                h27087fc_0    conda-forge
libblas                   3.9.0           16_linux64_openblas    conda-forge
libbrotlicommon           1.0.9                h166bdaf_8    conda-forge
libbrotlidec              1.0.9                h166bdaf_8    conda-forge
libbrotlienc              1.0.9                h166bdaf_8    conda-forge
libcap                    2.67                 he9d0100_0    conda-forge
libcblas                  3.9.0           16_linux64_openblas    conda-forge
libclang                  16.0.4          default_h1cdf331_0    conda-forge
libclang13                16.0.4          default_h4d60ac6_0    conda-forge
libcups                   2.3.3                h36d4200_3    conda-forge
libdeflate                1.18                 h0b41bf4_0    conda-forge
libedit                   3.1.20191231         he28a2e2_2    conda-forge
libevent                  2.1.12               h3358134_0    conda-forge
libexpat                  2.5.0                hcb278e6_1    conda-forge
libffi                    3.4.2                h7f98852_5    conda-forge
libflac                   1.4.2                h27087fc_0    conda-forge
libgcc-ng                 12.2.0              h65d4601_19    conda-forge
libgcrypt                 1.10.1               h166bdaf_0    conda-forge
libgfortran-ng            12.2.0              h69a702a_19    conda-forge
libgfortran5              12.2.0              h337968e_19    conda-forge
libglib                   2.76.3               hebfc3b9_0    conda-forge
libgomp                   12.2.0              h65d4601_19    conda-forge
libgpg-error              1.46                 h620e276_0    conda-forge
libiconv                  1.17                 h166bdaf_0    conda-forge
libjpeg-turbo             2.1.5.1              h0b41bf4_0    conda-forge
liblapack                 3.9.0           16_linux64_openblas    conda-forge
libllvm16                 16.0.4               h5cf9203_0    conda-forge
libnsl                    2.0.0                h7f98852_0    conda-forge
libogg                    1.3.4                h7f98852_1    conda-forge
libopenblas               0.3.21          pthreads_h78a6416_3    conda-forge
libopus                   1.3.1                h7f98852_1    conda-forge
libpng                    1.6.39               h753d276_0    conda-forge
libpq                     15.3                 hbcd7760_1    conda-forge
libsndfile                1.2.0                hb75c966_0    conda-forge
libsqlite                 3.42.0               h2797004_0    conda-forge
libstdcxx-ng              12.2.0              h46fd767_19    conda-forge
libsystemd0               253                  h8c4010b_1    conda-forge
libtiff                   4.5.0                ha587672_6    conda-forge
libuuid                   2.38.1               h0b41bf4_0    conda-forge
libvorbis                 1.3.7                h9c3ff4c_0    conda-forge
libwebp-base              1.3.0                h0b41bf4_0    conda-forge
libxcb                    1.15                 h0b41bf4_0    conda-forge
libxkbcommon              1.5.0                h5d7e998_3    conda-forge
libxml2                   2.11.4               h0d562d8_0    conda-forge
libzlib                   1.2.13               h166bdaf_4    conda-forge
lz4-c                     1.9.4                hcb278e6_0    conda-forge
matplotlib                3.5.2           py310hff52083_1    conda-forge
matplotlib-base           3.5.2           py310h5701ce4_1    conda-forge
minimap2                  2.26                 he4a0461_1    bioconda
mpg123                    1.31.3               hcb278e6_0    conda-forge
munkres                   1.1.4              pyh9f0ad1d_0    conda-forge
mysql-common              8.0.32               hf1915f5_2    conda-forge
mysql-libs                8.0.32               hca2cd23_2    conda-forge
ncurses                   6.3                  h27087fc_1    conda-forge
nspr                      4.35                 h27087fc_0    conda-forge
nss                       3.89                 he45b914_0    conda-forge
numpy                     1.22.3          py310h4ef5377_2    conda-forge
openjpeg                  2.5.0                hfec8fc6_2    conda-forge
openssl                   3.1.0                hd590300_3    conda-forge
packaging                 23.1               pyhd8ed1ab_0    conda-forge
pandas                    1.4.2           py310h769672d_2    conda-forge
patsy                     0.5.3              pyhd8ed1ab_0    conda-forge
pcre2                     10.40                hc3806b6_0    conda-forge
pillow                    9.5.0           py310h582fbeb_1    conda-forge
pip                       23.1.2             pyhd8ed1ab_0    conda-forge
pixman                    0.40.0               h36c2ea0_0    conda-forge
ply                       3.11                       py_1    conda-forge
pthread-stubs             0.4               h36c2ea0_1001    conda-forge
pulseaudio-client         16.1                 h5195f5e_3    conda-forge
pyparsing                 3.0.9              pyhd8ed1ab_0    conda-forge
pyqt                      5.15.7          py310hab646b1_3    conda-forge
pyqt5-sip                 12.11.0         py310heca2aa9_3    conda-forge
python                    3.10.2          hc74c709_4_cpython    conda-forge
python-dateutil           2.8.2              pyhd8ed1ab_0    conda-forge
python_abi                3.10                    3_cp310    conda-forge
pytz                      2023.3             pyhd8ed1ab_0    conda-forge
qt-main                   5.15.8              h01ceb2d_12    conda-forge
racon                     1.5.0                h21ec9f0_2    bioconda
readline                  8.2                  h8228510_1    conda-forge
scipy                     1.8.0           py310hea5193d_1    conda-forge
seaborn                   0.12.2               hd8ed1ab_0    conda-forge
seaborn-base              0.12.2             pyhd8ed1ab_0    conda-forge
seqtk                     1.4                  h7132678_0    bioconda
setuptools                67.7.2             pyhd8ed1ab_0    conda-forge
sip                       6.7.9           py310hc6cd4ac_0    conda-forge
six                       1.16.0             pyh6c4a22f_0    conda-forge
sqlite                    3.42.0               h2c6b66d_0    conda-forge
statsmodels               0.14.0          py310h278f3c1_1    conda-forge
tk                        8.6.12               h27826a3_0    conda-forge
toml                      0.10.2             pyhd8ed1ab_0    conda-forge
tomli                     2.0.1              pyhd8ed1ab_0    conda-forge
tornado                   6.3.2           py310h2372a71_0    conda-forge
typing_extensions         4.6.2              pyha770c72_0    conda-forge
tzdata                    2023c                h71feb2d_0    conda-forge
unicodedata2              15.0.0          py310h5764c6d_0    conda-forge
wheel                     0.40.0             pyhd8ed1ab_0    conda-forge
xcb-util                  0.4.0                hd590300_1    conda-forge
xcb-util-image            0.4.0                h8ee46fc_1    conda-forge
xcb-util-keysyms          0.4.0                h8ee46fc_1    conda-forge
xcb-util-renderutil       0.3.9                hd590300_1    conda-forge
xcb-util-wm               0.4.1                h8ee46fc_1    conda-forge
xkeyboard-config          2.38                 h0b41bf4_0    conda-forge
xorg-kbproto              1.0.7             h7f98852_1002    conda-forge
xorg-libice               1.0.10               h7f98852_0    conda-forge
xorg-libsm                1.2.3             hd9c2040_1000    conda-forge
xorg-libx11               1.8.4                h8ee46fc_1    conda-forge
xorg-libxau               1.0.11               hd590300_0    conda-forge
xorg-libxdmcp             1.1.3                h7f98852_0    conda-forge
xorg-libxext              1.3.4                h0b41bf4_2    conda-forge
xorg-libxrender           0.9.10            h7f98852_1003    conda-forge
xorg-renderproto          0.11.1            h7f98852_1002    conda-forge
xorg-xextproto            7.3.0             h0b41bf4_1003    conda-forge
xorg-xf86vidmodeproto     2.3.1             h7f98852_1002    conda-forge
xorg-xproto               7.0.31            h7f98852_1007    conda-forge
xz                        5.2.6                h166bdaf_0    conda-forge
zlib                      1.2.13               h166bdaf_4    conda-forge
zstd                      1.5.2                h3eb15da_6    conda-forge

Install was done via the following:

conda create -c conda-forge --name gapless python=3.10.2 pandas=1.4.2 numpy=1.22.3 scipy=1.8.0 seaborn matplotlib=3.5.2 pillow biopython
conda activate gapless
conda install -c bioconda gapless
conda install -c bioconda minimap2 racon seqtk
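
When comparing environments across reports like this one, it can help to print the versions of the plotting-related packages from inside the activated environment. The snippet below is a minimal sketch using only the Python standard library; the helper name `report_versions` and the package list are illustrative choices, not part of gapless.

```python
from importlib import metadata


def report_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in the active environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Packages that appear in the traceback or the install command above.
    wanted = ["pandas", "numpy", "scipy", "seaborn", "matplotlib"]
    for name, version in report_versions(wanted).items():
        print(f"{name}: {version if version is not None else 'not installed'}")
```

Running it with the environment's own interpreter (after `conda activate gapless`) reports what Python actually imports, which may differ from what was requested at install time.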

Do you have any recommendations? The failure seems to be at matplotlib, but I have tried both the latest default version and the version recommended in someone else's issue (for a different error: https://github.com/schmeing/gapless/issues/1).
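
For what it is worth, the traceback shows the KeyError being raised inside seaborn and pandas (`sub_data["y"]` in `seaborn/relational.py`), reached from `sns.lineplot(x=linex, y=liney, ...)` in `PlotXY`, where `liney` comes from `nbinom.pmf(x_values, opt_par[0], opt_par[1])*100`. One possible explanation, which is an assumption and not confirmed anywhere in this thread, is that the fitted curve contains no finite values (scipy returns NaN from `pmf` when the distribution parameters are invalid), so there is no usable `y` data to plot. The sketch below is a standard-library check that could be run on the values before plotting; the helper names `count_finite` and `curve_is_plottable` are hypothetical and do not exist in gapless.

```python
import math


def count_finite(values):
    """Return (number of finite entries, total number of entries)."""
    values = list(values)
    finite = sum(1 for v in values if math.isfinite(v))
    return finite, len(values)


def curve_is_plottable(linex, liney):
    """True if x and y have equal, non-zero length and each has finite data."""
    finite_x, total_x = count_finite(linex)
    finite_y, total_y = count_finite(liney)
    return total_x == total_y and total_x > 0 and finite_x > 0 and finite_y > 0


if __name__ == "__main__":
    # A curve made only of NaN values is flagged as not plottable.
    print(curve_is_plottable([0, 1, 2], [float("nan")] * 3))
    # A curve with ordinary numbers passes the check.
    print(curve_is_plottable([0, 1, 2], [0.1, 0.2, 0.3]))
```

If the check fails on the real data, the problem would sit in the coverage fit that produces `opt_par` and not in the matplotlib version, but that would need to be confirmed against the actual values.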

Please let me know. This is a very important tool for us to get working. Thanks

M

pjm43 commented 1 year ago

I'm getting the same error. Any solutions??

Thanks, Jeff

pjm43 commented 1 year ago

Sorry, I should have provided the error message from gapless_scaffold.log:

0:00:04.096412 Reading in original assembly
0:00:04.141333 Loading repeats
0:00:04.152210 Filtering mappings
Traceback (most recent call last):
  File "/fslgroup/fslg_pws_module/compute/.conda-pws/envs/gapless/lib/python3.11/site-packages/pandas/core/indexes/base.py", line 3653, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 147, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 176, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 7080, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 7088, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'y'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/fslgroup/fslg_pws_module/compute/.conda-pws/envs/gapless/bin/gapless.py", line 13327, in <module>
    main(sys.argv[1:])
  File "/fslgroup/fslg_pws_module/compute/.conda-pws/envs/gapless/bin/gapless.py", line 13156, in main
    GaplessScaffold(args[0], args[1], args[2], min_mapq, min_mapping_length, min_length_contig_break, prefix, stats)
  File "/fslgroup/fslg_pws_module/compute/.conda-pws/envs/gapless/bin/gapless.py", line 9069, in GaplessScaffold
    mappings, cov_counts, cov_probs, read_names, read_len = ReadMappings(mapping_file, contig_ids, min_mapq, min_mapping_length, keep_all_subreads, alignment_precision, num_read_len_groups, pdf)
  File "/fslgroup/fslg_pws_module/compute/.conda-pws/envs/gapless/bin/gapless.py", line 427, in ReadMappings
    cov_probs = GetCoverageProbabilities(cov_counts, pdf)
  File "/fslgroup/fslg_pws_module/compute/.conda-pws/envs/gapless/bin/gapless.py", line 386, in GetCoverageProbabilities
    PlotXY(pdf, "Coverage", "% Bins (Size: {})".format(bsize), probs.loc[selection2, 'count'], probs.loc[selection2, 'nbins']/probs.loc[selection2, 'nbin_sum']*100, linex=x_values, liney=nbinom.pmf(x_values,opt_par[0],opt_par[1])*100)
  File "/fslgroup/fslg_pws_module/compute/.conda-pws/envs/gapless/bin/gapless.py", line 285, in PlotXY
    sns.lineplot(x=linex, y=liney, color='red', linewidth=2.5)
  File "/fslgroup/fslg_pws_module/compute/.conda-pws/envs/gapless/lib/python3.11/site-packages/seaborn/relational.py", line 645, in lineplot
    p.plot(ax, kwargs)
  File "/fslgroup/fslg_pws_module/compute/.conda-pws/envs/gapless/lib/python3.11/site-packages/seaborn/relational.py", line 459, in plot
    lines = ax.plot(sub_data["x"], sub_data["y"], **kws)
  File "/fslgroup/fslg_pws_module/compute/.conda-pws/envs/gapless/lib/python3.11/site-packages/pandas/core/frame.py", line 3761, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/fslgroup/fslg_pws_module/compute/.conda-pws/envs/gapless/lib/python3.11/site-packages/pandas/core/indexes/base.py", line 3655, in get_loc
    raise KeyError(key) from err
KeyError: 'y'

Thanks

Jeff