jiarong / VirSorter2

customizable pipeline to identify viral sequences from (meta)genomic data
GNU General Public License v2.0

VirSorter2 error in hmmsearch step #172

Open huajiachicat opened 10 months ago

huajiachicat commented 10 months ago

Hello Jiarong: I installed and tested the developer's version (Option 2) of VirSorter2 on an HPC cluster, and the test ran fine. However, when I ran a viral search on my own dataset, it failed at the hmmsearch step. Here is the relevant part of the run log:

[2023-10-12 18:16 INFO] # of seqs < 500 bp and removed: 196598
[2023-10-12 18:16 INFO] # of circular seqs: 319
[2023-10-12 18:16 INFO] # of linear seqs  : 169372
[2023-10-12 18:16 INFO] Finish spliting circular contig file with common rbs
[2023-10-12 18:16 INFO] Finish spliting circular contig file with NCLDV rbs
[2023-10-12 18:16 INFO] Finish spliting linear contig file with common rbs
[2023-10-12 18:16 INFO] Finish spliting linear contig file with NCLDV rbs
[2023-10-12 18:18 INFO] Step 1 - preprocess finished.
/usr/bin/bash: line 36:  8674 Aborted                 (core dumped) hmmsearch -T 30 --tblout iter-0/all.pdg.faa.splitdir/all.pdg.faa.ss.0.split.Viruses.splithmmtbl --cpu 2 --noali $Hmmdb $Tmp/$Bname > /dev/null 2>> iter-0/all.pdg.faa.splitdir/all.pdg.faa.ss.0.split.Viruses.splithmmtbl.log
[2023-10-12 21:44 ERROR] See error details in /fs/scratch/POS0103/virsort2_analysis/apis_virsorter.out/iter-0/all.pdg.faa.splitdir/all.pdg.faa.ss.0.split.Viruses.splithmmtbl.log
[Thu Oct 12 21:44:51 2023]
Error in rule hmmsearch:
    jobid: 246
    output: iter-0/all.pdg.faa.splitdir/all.pdg.faa.ss.0.split.Viruses.splithmmtbl
    conda-env: /users/POS0103/zzhao/local/virsorter2/db/conda_envs/15ce8fd2

shell:

        Domain=Viruses
        if [ $Domain = "Viruses" ]; then
            Hmmdb=/users/POS0103/zzhao/local/virsorter2/db/hmm/viral/combined.hmm
        else
            Domain2=$Domain
            if [ $Domain2 = "Pfamviruses" ]; then
                Domain2=Viruses
            fi
            Hmmdb=/users/POS0103/zzhao/local/virsorter2/db/hmm/pfam/Pfam-A-"$Domain2".hmm
        fi

        Bname=$(basename iter-0/all.pdg.faa.splitdir/all.pdg.faa.ss.0.split)
        To_scratch=false
        # move the heavy IO of hmmsearch in local scratch if possible
        if [ -d "/tmp" ]; then
            # not sure df or du are compatible in all linux; use "||To_scratch=false"
            #   to prevent imcompatibility in some linux distro
            Tmp=$(mktemp -d /tmp/vs2-XXXXXXXXXXXX) && To_scratch=true || To_scratch=false
            Avail=$(df -P /tmp | awk 'END{print $4}') || To_scratch=false
            Fsize=$(du -k iter-0/all.pdg.faa.splitdir/all.pdg.faa.ss.0.split | awk '{print $1*5}') || To_scratch=false
            if [ "$Avail" -gt "$Fsize" ] && [ "$To_scratch" = "true" ]; then
                cp iter-0/all.pdg.faa.splitdir/all.pdg.faa.ss.0.split $Tmp/$Bname || To_scratch=false
            else
                To_scratch=false
            fi
        fi

        hmmsearch -h | grep '^# HMMER' > iter-0/all.pdg.faa.splitdir/all.pdg.faa.ss.0.split.Viruses.splithmmtbl.log
        if [ "$To_scratch" = false ]; then
            # local scratch not set or not enough space in local scratch
            hmmsearch -T 30 --tblout iter-0/all.pdg.faa.splitdir/all.pdg.faa.ss.0.split.Viruses.splithmmtbl --cpu 2 --noali $Hmmdb iter-0/all.pdg.faa.splitdir/all.pdg.faa.ss.0.split > /dev/null 2>> iter-0/all.pdg.faa.splitdir/all.pdg.faa.ss.0.split.Viruses.splithmmtbl.log || { echo "See error details in /fs/scratch/POS0103/virsort2_analysis/apis_virsorter.out/iter-0/all.pdg.faa.splitdir/all.pdg.faa.ss.0.split.Viruses.splithmmtbl.log" | python /users/POS0103/zzhao/local/virsorter2/VirSorter2/virsorter/./scripts/echo.py --level error; exit 1; }
        else
            # when To_scratch is true, Tmp and Bname should have been defined successfully
            hmmsearch -T 30 --tblout iter-0/all.pdg.faa.splitdir/all.pdg.faa.ss.0.split.Viruses.splithmmtbl --cpu 2 --noali $Hmmdb $Tmp/$Bname > /dev/null 2>> iter-0/all.pdg.faa.splitdir/all.pdg.faa.ss.0.split.Viruses.splithmmtbl.log || { echo "See error details in /fs/scratch/POS0103/virsort2_analysis/apis_virsorter.out/iter-0/all.pdg.faa.splitdir/all.pdg.faa.ss.0.split.Viruses.splithmmtbl.log" | python /users/POS0103/zzhao/local/virsorter2/VirSorter2/virsorter/./scripts/echo.py --level error; exit 1; }

            rm -f $Tmp/$Bname && rmdir $Tmp
        fi
        rm -f iter-0/all.pdg.faa.splitdir/all.pdg.faa.ss.0.split.Viruses.splithmmtbl.log

        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

/usr/bin/bash: line 36:  2611 Aborted                 (core dumped) hmmsearch -T 30 --tblout iter-0/all.pdg.faa.splitdir/all.pdg.faa.ss.3.split.Viruses.splithmmtbl --cpu 2 --noali $Hmmdb $Tmp/$Bname > /dev/null 2>> iter-0/all.pdg.faa.splitdir/all.pdg.faa.ss.3.split.Viruses.splithmmtbl.log

So I checked the log file /fs/scratch/POS0103/virsort2_analysis/apis_virsorter.out/iter-0/all.pdg.faa.splitdir/all.pdg.faa.ss.0.split.Viruses.splithmmtbl.log. Here is its content:

# HMMER 3.3.2 (Nov 2020); http://hmmer.org/
*** Error in 'hmmsearch': malloc(): memory corruption: 0x000055e82f0d4be0 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x82b36)[0x2b526d00ab36]
/lib64/libc.so.6(__libc_malloc+0x4c)[0x2b526d00d78c]
/lib64/libc.so.6(__strdup+0x1a)[0x2b526d014b8a]
/usr/local/xalt/xalt/lib64/libxalt_init.so(+0x11493)[0x2b526c638493]
/usr/local/xalt/xalt/lib64/libxalt_init.so(__XALT_json_add_ptA_xalt_1_5+0x695)[0x2b526c63bb15]
/usr/local/xalt/xalt/lib64/libxalt_init.so(__XALT_run_submission_xalt_1_5+0x75b)[0x2b526c63039b]
/usr/local/xalt/xalt/lib64/libxalt_init.so(__XALT_myfini_LD_PRELOAD_xalt_1_5+0x44c)[0x2b526c62ef0c]
/lib64/ld-linux-x86-64.so.2(+0x1008a)[0x2b526c41308a]
/lib64/libc.so.6(+0x39ce9)[0x2b526cfc1ce9]
/lib64/libc.so.6(+0x39d37)[0x2b526cfc1d37]
/lib64/libc.so.6(__libc_start_main+0xfc)[0x2b526cfaa55c]
hmmsearch(+0x5491)[0x55e82ec77491]
======= Memory map: ========
2b526c403000-2b526c425000 r-xp 00000000 00:24 3367649610                 /usr/lib64/ld-2.17.so
2b526c425000-2b526c428000 rw-p 00000000 00:00 0
2b526c42a000-2b526c42b000 rw-p 00000000 00:00 0
2b526c42b000-2b526c42f000 r--p 00000000 00:32 172110                     /users/POS0103/zzhao/local/virsorter2/db/conda_envs/15ce8fd2/lib/libgcc_s.so.1
2b526c42f000-2b526c441000 r-xp 00004000 00:32 172110                     /users/POS0103/zzhao/local/virsorter2/db/conda_envs/15ce8fd2/lib/libgcc_s.so.1
2b526c441000-2b526c444000 r--p 00016000 00:32 172110                     /users/POS0103/zzhao/local/virsorter2/db/conda_envs/15ce8fd2/lib/libgcc_s.so.1
2b526c444000-2b526c445000 r--p 00019000 00:32 172110                     /users/POS0103/zzhao/local/virsorter2/db/conda_envs/15ce8fd2/lib/libgcc_s.so.1
2b526c445000-2b526c446000 rw-p 0001a000 00:32 172110                     /users/POS0103/zzhao/local/virsorter2/db/conda_envs/15ce8fd2/lib/libgcc_s.so.1
2b526c472000-2b526c476000 rw-p 00000000 00:00 0
2b526c624000-2b526c625000 r--p 00021000 00:24 3367649610                 /usr/lib64/ld-2.17.so
2b526c625000-2b526c626000 rw-p 00022000 00:24 3367649610                 /usr/lib64/ld-2.17.so
2b526c626000-2b526c627000 rw-p 00000000 00:00 0
2b526c627000-2b526c65f000 r-xp 00000000 00:2c 24659366                   /usr/local/xalt/2.10.29/lib64/libxalt_init.so
2b526c65f000-2b526c85f000 ---p 00038000 00:2c 24659366                   /usr/local/xalt/2.10.29/lib64/libxalt_init.so
2b526c85f000-2b526c860000 r--p 00038000 00:2c 24659366                   /usr/local/xalt/2.10.29/lib64/libxalt_init.so
2b526c860000-2b526c861000 rw-p 00039000 00:2c 24659366                   /usr/local/xalt/2.10.29/lib64/libxalt_init.so
2b526c861000-2b526c862000 rw-p 00000000 00:00 0
2b526c862000-2b526c869000 r-xp 00000000 00:24 3366951675                 /usr/lib64/librt-2.17.so
2b526c869000-2b526ca68000 ---p 00007000 00:24 3366951675                 /usr/lib64/librt-2.17.so
2b526ca68000-2b526ca69000 r--p 00006000 00:24 3366951675                 /usr/lib64/librt-2.17.so
2b526ca69000-2b526ca6a000 rw-p 00007000 00:24 3366951675                 /usr/lib64/librt-2.17.so
2b526ca6a000-2b526cb6b000 r-xp 00000000 00:24 3367696441                 /usr/lib64/libm-2.17.so
2b526cb6b000-2b526cd6a000 ---p 00101000 00:24 3367696441                 /usr/lib64/libm-2.17.so
2b526cd6a000-2b526cd6b000 r--p 00100000 00:24 3367696441                 /usr/lib64/libm-2.17.so
2b526cd6b000-2b526cd6c000 rw-p 00101000 00:24 3367696441                 /usr/lib64/libm-2.17.so
2b526cd6c000-2b526cd83000 r-xp 00000000 00:24 3366951671                 /usr/lib64/libpthread-2.17.so
2b526cd83000-2b526cf82000 ---p 00017000 00:24 3366951671                 /usr/lib64/libpthread-2.17.so
2b526cf82000-2b526cf83000 r--p 00016000 00:24 3366951671                 /usr/lib64/libpthread-2.17.so
2b526cf83000-2b526cf84000 rw-p 00017000 00:24 3366951671                 /usr/lib64/libpthread-2.17.so
2b526cf84000-2b526cf88000 rw-p 00000000 00:00 0
2b526cf88000-2b526d14c000 r-xp 00000000 00:24 3367925826                 /usr/lib64/libc-2.17.so
2b526d14c000-2b526d34b000 ---p 001c4000 00:24 3367925826                 /usr/lib64/libc-2.17.so
2b526d34b000-2b526d34f000 r--p 001c3000 00:24 3367925826                 /usr/lib64/libc-2.17.so
2b526d34f000-2b526d351000 rw-p 001c7000 00:24 3367925826                 /usr/lib64/libc-2.17.so
2b526d351000-2b526d356000 rw-p 00000000 00:00 0
2b526d356000-2b526d358000 r-xp 00000000 00:24 3367343548                 /usr/lib64/libdl-2.17.so
2b526d358000-2b526d558000 ---p 00002000 00:24 3367343548                 /usr/lib64/libdl-2.17.so
2b526d558000-2b526d559000 r--p 00002000 00:24 3367343548                 /usr/lib64/libdl-2.17.so
2b526d559000-2b526d55a000 rw-p 00003000 00:24 3367343548                 /usr/lib64/libdl-2.17.so
2b526d55a000-2b526d55e000 r-xp 00000000 00:2c 17981579                   /usr/local/xalt/2.10.29/lib64/libuuid.so.1.3.0
2b526d55e000-2b526d75d000 ---p 00004000 00:2c 17981579                   /usr/local/xalt/2.10.29/lib64/libuuid.so.1.3.0
2b526d75d000-2b526d75e000 r--p 00003000 00:2c 17981579                   /usr/local/xalt/2.10.29/lib64/libuuid.so.1.3.0
2b526d75e000-2b526d75f000 rw-p 00004000 00:2c 17981579                   /usr/local/xalt/2.10.29/lib64/libuuid.so.1.3.0
2b526d75f000-2b526d760000 ---p 00000000 00:00 0
2b526d760000-2b526d960000 rw-p 00000000 00:00 0
2b526da34000-2b526da35000 ---p 00000000 00:00 0
2b526da35000-2b526dc35000 rw-p 00000000 00:00 0
2b5270000000-2b527011e000 rw-p 00000000 00:00 0
2b527011e000-2b5274000000 ---p 00000000 00:00 0
2b5274000000-2b52740e7000 rw-p 00000000 00:00 0
2b52740e7000-2b5278000000 ---p 00000000 00:00 0
55e82ec72000-55e82ec75000 r--p 00000000 00:32 150302                     /users/POS0103/zzhao/local/virsorter2/db/conda_envs/15ce8fd2/bin/hmmsearch
55e82ec75000-55e82ecda000 r-xp 00003000 00:32 150302                     /users/POS0103/zzhao/local/virsorter2/db/conda_envs/15ce8fd2/bin/hmmsearch
55e82ecda000-55e82ecf0000 r--p 00068000 00:32 150302                     /users/POS0103/zzhao/local/virsorter2/db/conda_envs/15ce8fd2/bin/hmmsearch
55e82ecf0000-55e82ecf1000 r--p 0007e000 00:32 150302                     /users/POS0103/zzhao/local/virsorter2/db/conda_envs/15ce8fd2/bin/hmmsearch
55e82ecf1000-55e82ecf2000 rw-p 0007f000 00:32 150302                     /users/POS0103/zzhao/local/virsorter2/db/conda_envs/15ce8fd2/bin/hmmsearch
55e82ecf2000-55e82ed02000 rw-p 00000000 00:00 0
55e82ee37000-55e832918000 rw-p 00000000 00:00 0                          [heap]
7fff0cae2000-7fff0cb07000 rw-p 00000000 00:00 0                          [stack]
7fff0cbe7000-7fff0cbe9000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]

huajiachicat commented 10 months ago

Is this issue caused by insufficient memory? I ran the job on an HPC cluster managed by Slurm and allocated 1 node, 24 cores (CPUs), and 1 TB of memory to it. My VirSorter2 installation is the latest version available.
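
One way to narrow this down is to list which binaries and shared libraries appear in the backtrace frames of the log. In the log posted above, several frames come from libxalt_init.so in addition to libc and hmmsearch itself. The following is a minimal sketch, not part of VirSorter2: the function name libraries_in_backtrace and the shortened sample are illustrative assumptions, and the frame pattern only covers the glibc backtrace format shown in this log.

```python
import re
from collections import Counter

# Matches glibc backtrace frames such as
#   /lib64/libc.so.6(__strdup+0x1a)[0x2b526d014b8a]
#   hmmsearch(+0x5491)[0x55e82ec77491]
FRAME_RE = re.compile(r"^(?P<obj>[^\s(]+)\((?P<sym>[^)]*)\)\[0x[0-9a-fA-F]+\]$")


def libraries_in_backtrace(log_text):
    """Count how many backtrace frames belong to each binary or shared library."""
    counts = Counter()
    in_backtrace = False
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith("======= Backtrace"):
            in_backtrace = True
            continue
        if line.startswith("======= Memory map"):
            break  # frames end where the memory map begins
        if in_backtrace:
            match = FRAME_RE.match(line)
            if match:
                counts[match.group("obj")] += 1
    return counts


# Shortened excerpt of the log above, used only to exercise the function.
sample = """\
======= Backtrace: =========
/lib64/libc.so.6(+0x82b36)[0x2b526d00ab36]
/lib64/libc.so.6(__strdup+0x1a)[0x2b526d014b8a]
/usr/local/xalt/xalt/lib64/libxalt_init.so(+0x11493)[0x2b526c638493]
hmmsearch(+0x5491)[0x55e82ec77491]
======= Memory map: ========
"""

for obj, n in libraries_in_backtrace(sample).most_common():
    print(n, obj)
```

Reading the real .splithmmtbl.log file and passing its text to the function gives the same kind of summary for the full backtrace. The summary shows where the frames come from; it does not by itself establish the cause of the abort.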

jiarong commented 10 months ago

Hi, this issue has been seen before. It happens when multiple hmmsearch processes run concurrently on certain operating systems. It likely has something to do with hmmsearch memory management and nothing to do with insufficient allocated memory. The only solution I have for now is to use the container (installation option 3).

huajiachicat commented 10 months ago

Hello:

Following your instructions, I created virsorter2.sif with this command:

singularity build virsorter2.sif docker://jiarong/virsorter:latest

I did a test run and hit another issue, this time at Step 3.

Here is the error message (Step 3 only):

[2023-10-15 18:08 INFO] Step 2 - extract-feature finished.
[2023-10-15 18:08 ERROR] See error details in /users/POS0103/zzhao/local/virsorter2/testsif.out/log/iter-0/step3-classify/all-score-dsDNAphage.log
[Sun Oct 15 18:08:51 2023]
Error in rule classify_by_group:
    jobid: 56
    output: iter-0/dsDNAphage/all.pdg.clf
    conda-env: /users/POS0103/zzhao/local/virsorter2/db/conda_envs/63e87a79
    shell:

        Log=/users/POS0103/zzhao/local/virsorter2/testsif.out/log/iter-0/step3-classify/all-score-dsDNAphage.log
        python /usr/local/lib/python3.9/site-packages/virsorter/./scripts/classify.py iter-0/dsDNAphage/all.pdg.ftr /users/POS0103/zzhao/local/virsorter2/db/group/dsDNAphage/model dsDNAphage iter-0/dsDNAphage/all.pdg.clf 2> $Log || { echo "See error details in $Log" | python /usr/local/lib/python3.9/site-packages/virsorter/./scripts/echo.py --level error; exit 1; }

        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

[2023-10-15 18:08 ERROR] See error details in /users/POS0103/zzhao/local/virsorter2/testsif.out/log/iter-0/step3-classify/all-score-ssDNA.log
[Sun Oct 15 18:08:51 2023]
Error in rule classify_by_group:
    jobid: 57
    output: iter-0/ssDNA/all.pdg.clf
    conda-env: /users/POS0103/zzhao/local/virsorter2/db/conda_envs/63e87a79
    shell:

        Log=/users/POS0103/zzhao/local/virsorter2/testsif.out/log/iter-0/step3-classify/all-score-ssDNA.log
        python /usr/local/lib/python3.9/site-packages/virsorter/./scripts/classify.py iter-0/ssDNA/all.pdg.ftr /users/POS0103/zzhao/local/virsorter2/db/group/ssDNA/model ssDNA iter-0/ssDNA/all.pdg.clf 2> $Log || { echo "See error details in $Log" | python /usr/local/lib/python3.9/site-packages/virsorter/./scripts/echo.py --level error; exit 1; }

        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Exiting because a job execution failed. Look above for error message

*** An error occurred. Detailed errors may not be printed for certain rules. Refer to the log file of the failed command for troubleshooting

I looked at the log file named in the error message. Here is its content:


Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/virsorter/./scripts/classify.py", line 77, in <module>
    main()
  File "/usr/local/lib/python3.9/site-packages/virsorter/./scripts/classify.py", line 60, in main
    model = joblib.load(model_f)
  File "/users/POS0103/zzhao/local/virsorter2/db/conda_envs/63e87a79/lib/python3.8/site-packages/joblib/numpy_pickle.py", line 658, in load
    obj = _unpickle(fobj, filename, mmap_mode)
  File "/users/POS0103/zzhao/local/virsorter2/db/conda_envs/63e87a79/lib/python3.8/site-packages/joblib/numpy_pickle.py", line 577, in _unpickle
    obj = unpickler.load()
  File "/users/POS0103/zzhao/local/virsorter2/db/conda_envs/63e87a79/lib/python3.8/pickle.py", line 1212, in load
    dispatch[key[0]](self)
  File "/users/POS0103/zzhao/local/virsorter2/db/conda_envs/63e87a79/lib/python3.8/pickle.py", line 1537, in load_stack_global
    self.append(self.find_class(module, name))
  File "/users/POS0103/zzhao/local/virsorter2/db/conda_envs/63e87a79/lib/python3.8/pickle.py", line 1579, in find_class
    __import__(module, level=0)
  File "/users/POS0103/zzhao/local/virsorter2/db/conda_envs/63e87a79/lib/python3.8/site-packages/sklearn/ensemble/__init__.py", line 7, in <module>
    from ._forest import RandomForestClassifier
  File "/users/POS0103/zzhao/local/virsorter2/db/conda_envs/63e87a79/lib/python3.8/site-packages/sklearn/ensemble/_forest.py", line 56, in <module>
    from ..tree import (DecisionTreeClassifier, DecisionTreeRegressor,
  File "/users/POS0103/zzhao/local/virsorter2/db/conda_envs/63e87a79/lib/python3.8/site-packages/sklearn/tree/__init__.py", line 6, in <module>
    from ._classes import BaseDecisionTree
  File "/users/POS0103/zzhao/local/virsorter2/db/conda_envs/63e87a79/lib/python3.8/site-packages/sklearn/tree/_classes.py", line 40, in <module>
    from ._criterion import Criterion
  File "sklearn/tree/_splitter.pxd", line 34, in init sklearn.tree._criterion
  File "sklearn/tree/_tree.pxd", line 37, in init sklearn.tree._splitter
  File "sklearn/neighbors/_quad_tree.pxd", line 55, in init sklearn.tree._tree
  File "/users/POS0103/zzhao/local/virsorter2/db/conda_envs/63e87a79/lib/python3.8/site-packages/sklearn/neighbors/__init__.py", line 17, in <module>
    from ._nca import NeighborhoodComponentsAnalysis
  File "/users/POS0103/zzhao/local/virsorter2/db/conda_envs/63e87a79/lib/python3.8/site-packages/sklearn/neighbors/_nca.py", line 22, in <module>
    from ..decomposition import PCA
  File "/users/POS0103/zzhao/local/virsorter2/db/conda_envs/63e87a79/lib/python3.8/site-packages/sklearn/decomposition/__init__.py", line 17, in <module>
    from .dict_learning import dict_learning
  File "/users/POS0103/zzhao/local/virsorter2/db/conda_envs/63e87a79/lib/python3.8/site-packages/sklearn/decomposition/dict_learning.py", line 4, in <module>
    from . import _dict_learning
  File "/users/POS0103/zzhao/local/virsorter2/db/conda_envs/63e87a79/lib/python3.8/site-packages/sklearn/decomposition/_dict_learning.py", line 21, in <module>
    from ..linear_model import Lasso, orthogonal_mp_gram, LassoLars, Lars
  File "/users/POS0103/zzhao/local/virsorter2/db/conda_envs/63e87a79/lib/python3.8/site-packages/sklearn/linear_model/__init__.py", line 12, in <module>
    from ._least_angle import (Lars, LassoLars, lars_path, lars_path_gram, LarsCV,
  File "/users/POS0103/zzhao/local/virsorter2/db/conda_envs/63e87a79/lib/python3.8/site-packages/sklearn/linear_model/_least_angle.py", line 30, in <module>
    method='lar', copy_X=True, eps=np.finfo(np.float).eps,
  File "/users/POS0103/zzhao/local/virsorter2/db/conda_envs/63e87a79/lib/python3.8/site-packages/numpy/__init__.py", line 305, in __getattr__
    raise AttributeError(__former_attrs__[attr])
AttributeError: module 'numpy' has no attribute 'float'.
`np.float` was a deprecated alias for the builtin `float`. To avoid this error in existing code, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
    https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations

It seems there is a NumPy version conflict.
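
As the error message says, np.float was deprecated in NumPy 1.20; the alias was removed in NumPy 1.24, so code that still uses it (here sklearn's _least_angle.py in the conda environment) fails on newer NumPy. A minimal sketch for checking which side of that boundary an environment is on follows. The helper names parse_version and has_np_float_alias are hypothetical, and the script assumes it is run with the same interpreter that the failing step uses.

```python
def parse_version(version):
    """Return the leading (major, minor) integers of a version string like '1.24.3'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def has_np_float_alias(numpy_version):
    """The np.float alias exists only in NumPy releases before 1.24."""
    return parse_version(numpy_version) < (1, 24)


if __name__ == "__main__":
    try:
        import numpy
    except ImportError:
        print("numpy is not installed in this interpreter")
    else:
        # numpy.__file__ shows which environment the module was loaded from.
        print(numpy.__version__, numpy.__file__)
        print("np.float available:", has_np_float_alias(numpy.__version__))
```

If the check reports that the alias is unavailable while the installed scikit-learn still references np.float, the two packages in that environment are out of step with each other.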

Here is the problem: if I use the Option 2 development version, I hit the hmmsearch issue; if I use the sif version, I hit the NumPy issue.

I ran the sif version inside the VS2 Python virtual environment. I also tried running it without activating VS2, and the same problem occurred.
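
The traceback above mixes two Python installations: classify.py is loaded from /usr/local/lib/python3.9, while joblib, sklearn, and numpy are loaded from the db/conda_envs/63e87a79 environment under python3.8. A minimal sketch for pulling those prefixes out of a traceback is shown below. The function name python_prefixes is a hypothetical helper, and reading the mix as the container run reusing an environment from the earlier install is an interpretation, not something the log states.

```python
import re

# Matches the quoted path in traceback lines such as:
#   File "/usr/local/lib/python3.9/site-packages/virsorter/./scripts/classify.py", line 77, in <module>
FILE_RE = re.compile(r'File "(?P<path>[^"]+)"')
# Captures everything up to and including the ".../lib/pythonX.Y" component.
PREFIX_RE = re.compile(r"^(?P<prefix>.*?/lib/python\d+\.\d+)(?:/|$)")


def python_prefixes(traceback_text):
    """Return the distinct '<root>/lib/pythonX.Y' prefixes seen in a traceback, in order."""
    seen = []
    for match in FILE_RE.finditer(traceback_text):
        prefix = PREFIX_RE.match(match.group("path"))
        if prefix and prefix.group("prefix") not in seen:
            seen.append(prefix.group("prefix"))
    return seen


# Three lines copied from the traceback above; the .pxd frame has no prefix and is skipped.
sample = '''\
  File "/usr/local/lib/python3.9/site-packages/virsorter/./scripts/classify.py", line 60, in main
  File "/users/POS0103/zzhao/local/virsorter2/db/conda_envs/63e87a79/lib/python3.8/site-packages/joblib/numpy_pickle.py", line 658, in load
  File "sklearn/tree/_splitter.pxd", line 34, in init sklearn.tree._criterion
'''

for prefix in python_prefixes(sample):
    print(prefix)
```

More than one prefix in the result means modules from different installations were imported in the same run.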

jiarong commented 10 months ago

This issue should go away after you remove the ".virsorter" directory in your home directory.
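
If you would rather keep a copy than delete the directory outright, it can be renamed instead. The following is a minimal sketch, not part of VirSorter2: move_aside is a hypothetical helper, and the backup naming scheme is an assumption chosen for illustration.

```python
import shutil
import time
from pathlib import Path


def move_aside(directory):
    """Rename `directory` to '<name>.bak-<timestamp>'; return the new path, or None if absent."""
    directory = Path(directory)
    if not directory.is_dir():
        return None
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup = directory.with_name(f"{directory.name}.bak-{stamp}")
    if backup.exists():
        # Refuse to overwrite an earlier backup made within the same second.
        raise FileExistsError(backup)
    shutil.move(str(directory), str(backup))
    return backup
```

Calling move_aside(Path.home() / ".virsorter") renames the directory and returns the backup path, or returns None when the directory does not exist. The backup can be deleted once the pipeline runs cleanly.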