EMBL-PKU / BASALT

MIT License

BASALT stopped running #31

Open Songq-20 opened 2 months ago

Songq-20 commented 2 months ago

Dear BASALT team: I submitted a job through Slurm. Slurm reports the job as still running, but when I checked the server's CPU usage, I found that the job was not actually doing any work. This is my submission script:

#!/bin/bash
#SBATCH -J BASALT
#SBATCH -p cn-long
#SBATCH -N 1
#SBATCH -c 60
#SBATCH -o /data01nfs/user/songq/log/basalt.out
#SBATCH -e /data01nfs/user/songq/log/basalt.err
#SBATCH --no-requeue
#SBATCH -A cnl
a1=611WW_MV_1k.fa
a2=714WW_MV_1k.fa
a3=715WW_MV_1k.fa
s1_1=611WW.R1_trimmed.fq.gz
s1_2=611WW.R2_trimmed.fq.gz
s2_1=714WW.R1_trimmed.fq.gz
s2_2=714WW.R2_trimmed.fq.gz
s3_1=715WW.R1_trimmed.fq.gz
s3_2=715WW.R2_trimmed.fq.gz
export CHECKM2DB="/datanode03/huangxy/database/checkm_data/CheckM2_database/uniref100.KO.1.dmnd"
source activate BASALT
BASALT -a $a1,$a2,$a3 -s $s1_1,$s1_2/$s2_1,$s2_2/$s3_1,$s3_2 -t 20 -m 200 -qc checkm2

The Slurm queue status:

    JOBID  PARTITION  NAME    ST  TIME         NODES  NODELIST(REASON)
    15770  cn-long    BASALT  R   18-22:10:09  1      node03

How can I solve this problem? Thanks!

EMBL-PKU commented 1 month ago

Could you please read the Basalt_checkpoint.txt file and tell us where the software stopped? I need to check in which step the error occurred. You may also re-run the program; BASALT will resume the incomplete job from where it stopped. Thank you very much.
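
To report where the run stopped, the last non-empty line of the checkpoint file is probably the most useful part. The following is a minimal sketch, not part of BASALT: the function name `last_checkpoint` is hypothetical, and it assumes that Basalt_checkpoint.txt sits in the directory where BASALT was started and that a line is appended as each step finishes.

```shell
#!/bin/bash
# Hypothetical helper, not part of BASALT.
# Prints the last non-empty line of a checkpoint file
# (defaults to Basalt_checkpoint.txt in the current directory).
last_checkpoint() {
    local file="${1:-Basalt_checkpoint.txt}"
    # Drop blank lines, then keep only the final remaining line.
    grep -v '^[[:space:]]*$' "$file" | tail -n 1
}
```

After sourcing this snippet, running `last_checkpoint` in the BASALT working directory prints the most recent checkpoint entry, which can be pasted into the issue.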

Songq-20 commented 1 month ago

Thank you for your reply. The contents of Basalt_checkpoint.txt are:

1st autobinner done!
2nd bin selection within group done!
3rd bin selection within multiple groups done!
4th outlier removal done!
5th contig retrieve did not perform!
6th secondary de-repplication done!
7th Skip contig retrieve within group!  BestBinset_outlier_refined_filtrated_retrieved

1023011930 commented 1 month ago

My run also seems to stop at the step "7th Skip contig retrieve within group! BestBinset_outlier_refined_filtrated_retrieved". My input script is:

#!/bin/bash
source /home/zhongpei/miniconda3/bin/activate BASALT
BASALT -a Unknown_CA010-001R0004.fastp_megahit_contigs.fa,Unknown_CA010-001R0005.fastp_megahit_contigs.fa,Unknown_CA010-001R0006.fastp_megahit_contigs.fa \
    -s Unknown_CA010-001R0004.fastp.1.fq.gz,Unknown_CA010-001R0004.fastp.2.fq.gz/Unknown_CA010-001R0005.fastp.1.fq.gz,Unknown_CA010-001R0005.fastp.2.fq.gz/Unknown_CA010-001R0006.fastp.1.fq.gz,Unknown_CA010-001R0006.fastp.2.fq.gz \
    -t 90 -m 350 -qc checkm --autopara sensitive --refinepara deep

noddevil4949 commented 1 month ago

Hi both,

From the checkpoint file provided by Songq-20, it seems that BASALT stopped running at the gap filling step. Could you please check:

  1. Whether the BASALT main process is in the 'S' (sleeping) or 'D' (uninterruptible sleep) state rather than running
  2. Whether a folder named 'BestBinset_outlier_refined_filtrated_retrieved_OLC' or 'BestBinset_outlier_refined_filtrated_retrieved_retrieved_OLC' exists

If the conditions described above are true, please stop BASALT and re-run the program; BASALT will resume and finish the run. Alternatively, please download the latest BASALT (v1.0.2), as this bug has been fixed in that version.
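
The two checks above can be sketched roughly as follows. This is only an illustration under stated assumptions, not part of BASALT: the function names `proc_state` and `find_olc_dirs` are hypothetical, the state lookup assumes a Linux node with `/proc` mounted, and the folder check assumes the `_OLC` folder is created directly inside the BASALT working directory.

```shell
#!/bin/bash
# Hypothetical helpers, not part of BASALT.

# Print the one-letter state (e.g. R, S, D) of the process with the given PID,
# read from the "State:" line of /proc/<pid>/status (Linux only).
proc_state() {
    local key value rest
    while read -r key value rest; do
        if [ "$key" = "State:" ]; then
            echo "$value"
            return 0
        fi
    done < "/proc/$1/status"
    return 1
}

# Print whichever of the two folder names mentioned above exist
# inside the given working directory (default: current directory).
find_olc_dirs() {
    local workdir="${1:-.}"
    local name
    for name in BestBinset_outlier_refined_filtrated_retrieved_OLC \
                BestBinset_outlier_refined_filtrated_retrieved_retrieved_OLC; do
        [ -d "$workdir/$name" ] && echo "$workdir/$name"
    done
    return 0
}
```

On the compute node (node03 in the report above), the PID of the main process has to be looked up first, for example with `pgrep -f BASALT`, which may also match the batch script itself. `proc_state <PID>` then prints its state, and `find_olc_dirs` run in the working directory prints any matching folder.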

Thanks for using BASALT!