RasmussenLab / vamb

Variational autoencoder for metagenomic binning
MIT License

Error during "creating z y v clusters from the final set of bins" avamb workflow #297

Open eperezv opened 7 months ago

eperezv commented 7 months ago

Hello,

I'm running the avamb snakemake workflow on my dataset, and everything worked perfectly when I specified "min_comp": "0.9" and "max_cont": "0.05" in the config file. However, I repeated the process with "min_comp": "0.7" and "max_cont": "0.1" (from scratch, as I couldn't find a way to re-run the workflow after changing only those values) and I am getting the following error.

I have found the same error with two different datasets:

creating z y v clusters from the final set of bins
/home/eduardo/tools/avamb/vamb/workflow_avamb/src/write_clusters_from_dereplicated_and_ripped_bins.sh: 22: [[: not found
/home/eduardo/tools/avamb/vamb/workflow_avamb/src/write_clusters_from_dereplicated_and_ripped_bins.sh: 22: [[: not found
/home/eduardo/tools/avamb/vamb/workflow_avamb/src/write_clusters_from_dereplicated_and_ripped_bins.sh: 22: [[: not found
/home/eduardo/tools/avamb/vamb/workflow_avamb/src/write_clusters_from_dereplicated_and_ripped_bins.sh: 22: [[: not found
/home/eduardo/tools/avamb/vamb/workflow_avamb/src/write_clusters_from_dereplicated_and_ripped_bins.sh: 22: [[: not found
/home/eduardo/tools/avamb/vamb/workflow_avamb/src/write_clusters_from_dereplicated_and_ripped_bins.sh: 22: [[: not found
/home/eduardo/tools/avamb/vamb/workflow_avamb/src/write_clusters_from_dereplicated_and_ripped_bins.sh: 22: [[: not found
/home/eduardo/tools/avamb/vamb/workflow_avamb/src/write_clusters_from_dereplicated_and_ripped_bins.sh: 22: [[: not found
Waiting at most 5 seconds for missing files.
MissingOutputException in rule write_clusters_from_nc_folders in file /home/eduardo/tools/avamb/vamb/workflow_avamb/avamb.snake.conda.smk, line 677:
Job 175  completed successfully, but some output files are missing. Missing files after 5 seconds. This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait:
avamb_outdir_megahit_gpu_re/avamb/avamb_manual_drep_disjoint_clusters.tsv
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2024-03-09T154306.573274.snakemake.log

jakobnissen commented 7 months ago

It is possible that this command was not run with the shell Bash, but with another kind of shell. @eperezv Do you know for sure that Snakemake executed using the Bash shell?
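
A note on the error format: `script.sh: 22: [[: not found` is how a dash-style `sh` reports an unknown command, whereas Bash would print `line 22: [[: command not found`. So the script itself may be running under `sh` even when the Snakemake rule uses Bash, for example through its shebang line or an explicit `sh script.sh` call. A minimal sketch for checking the shebang, where `read_shebang` is a hypothetical helper and not part of vamb:

```python
def read_shebang(script_path):
    """Return the interpreter named on the '#!' line, or None if there is none."""
    # Read only the first line, in binary, so odd encodings cannot break the check.
    with open(script_path, "rb") as handle:
        first_line = handle.readline()
    if not first_line.startswith(b"#!"):
        return None
    return first_line[2:].decode("utf-8", errors="replace").strip()
```

Calling it on `write_clusters_from_dereplicated_and_ripped_bins.sh` shows which interpreter the script requests when executed directly. It does not reveal how the workflow invokes the script.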

CC @Paupiera - perhaps this can be rewritten in Python? We know that the user certainly has a functioning Python install, but we do not necessarily know which shell they are using.
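
A rough sketch of what such a Python version might look like. This is not the existing script's logic. It assumes each bin is a FASTA file whose file stem is the cluster name, and that the clusters file is a headerless two-column TSV (cluster, contig). The function names and the list of file extensions are hypothetical.

```python
from pathlib import Path

# Assumed bin file extensions; adjust to whatever Final_bins/ actually contains.
FASTA_SUFFIXES = {".fna", ".fa", ".fasta"}


def contig_names(fasta_path):
    """Yield the identifier (first word of the header) of every FASTA record."""
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    yield fields[0]


def write_clusters(bins_dir, out_tsv):
    """Write one 'cluster<TAB>contig' row per contig and return the row count."""
    out_tsv = Path(out_tsv)
    # Create the output folder first, so a missing directory cannot cause a failure.
    out_tsv.parent.mkdir(parents=True, exist_ok=True)
    rows = 0
    with open(out_tsv, "w") as out:
        for fasta in sorted(Path(bins_dir).iterdir()):
            if fasta.suffix not in FASTA_SUFFIXES:
                continue
            for contig in contig_names(fasta):
                out.write(f"{fasta.stem}\t{contig}\n")
                rows += 1
    return rows
```

Because it relies only on the standard library, a version along these lines would behave the same regardless of which shell is installed.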

eperezv commented 7 months ago

Hi, thanks for your answer. Yes, it is Bash for sure (from the snakemake output):

Using shell:/usr/bin/bash

Paupiera commented 7 months ago

Hello,

I am sorry that you are experiencing problems. For a better interpretation of the error, could you share a tarball with the tmp/snakemake_tmp and the log folders?

eperezv commented 7 months ago

Here are the files: log.tar.gz. I tried to re-run the workflow once, so I'm not sure whether that altered anything. tmp/snakemake_tmp is too large to attach here, so I uploaded it to OneDrive: https://1drv.ms/u/s!AhO14pDwo9Qjq5lJSgYN_iRSzBhzCQ?e=4Okf5y

Paupiera commented 7 months ago

Hi,

Everything looks normal regarding the workflow from the log and tmp. Could you try executing the command outside the snakemake workflow? This is the command that should be executed:

/home/eduardo/tools/avamb/vamb/workflow_avamb/src/write_clusters_from_dereplicated_and_ripped_bins.sh  -d avamb_outdir_megahit_gpu_re/Final_bins -o avamb_outdir_megahit_gpu_re/

Please let me know the error you get in case it does not run.

eperezv commented 7 months ago

Hi, thanks for checking it in detail. It does not give any error message.

Paupiera commented 7 months ago

Great. The script executed should have created the final clusters avamb_outdir_megahit_gpu_re/avamb/avamb_manual_drep_disjoint_clusters.tsv, which should match the bins contained in the avamb_outdir_megahit_gpu_re/Final_bins folder.

eperezv commented 7 months ago

I'm not sure. I have checked, and I have an avamb_manual_drep_disjoint_clusters.tsv file under tmp, not under the avamb folder. According to its modification time, the file was created earlier, not by this run...

Paupiera commented 7 months ago

That should not happen, since the:

/home/eduardo/tools/avamb/vamb/workflow_avamb/src/write_clusters_from_dereplicated_and_ripped_bins.sh  -d avamb_outdir_megahit_gpu_re/Final_bins -o avamb_outdir_megahit_gpu_re/

command should create the avamb_outdir_megahit_gpu_re/avamb/avamb_manual_drep_disjoint_clusters.tsv file. Could you confirm that the content of avamb_manual_drep_disjoint_clusters.tsv matches the Final_bins fasta files? As a start, I would check that all bins from Final_bins/ are present in the clusters file, and that the same number of contigs appears in both.
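
The check described above could be sketched roughly as follows. It assumes, without confirmation from the workflow itself, that each bin in Final_bins/ is a FASTA file named after its cluster and that the clusters file is a headerless two-column TSV (cluster, contig). All function names are hypothetical.

```python
from collections import defaultdict
from pathlib import Path


def contigs_per_bin(bins_dir, suffixes=(".fna", ".fa", ".fasta")):
    """Map each bin name (file stem) to the set of contig ids in its FASTA file."""
    bins = {}
    for fasta in Path(bins_dir).iterdir():
        if fasta.suffix in suffixes:
            with open(fasta) as handle:
                bins[fasta.stem] = {
                    line[1:].split()[0]
                    for line in handle
                    if line.startswith(">") and line[1:].strip()
                }
    return bins


def contigs_per_cluster(tsv_path):
    """Map each cluster name to its set of contig ids (assumes no header row)."""
    clusters = defaultdict(set)
    with open(tsv_path) as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if len(fields) >= 2:
                clusters[fields[0]].add(fields[1])
    return dict(clusters)


def compare(bins, clusters):
    """Return (bins absent from the clusters file, bins whose contig counts differ)."""
    missing = sorted(set(bins) - set(clusters))
    mismatched = sorted(
        name
        for name in set(bins) & set(clusters)
        if len(bins[name]) != len(clusters[name])
    )
    return missing, mismatched
```

If both returned lists are empty, every bin appears in the clusters file with the same number of contigs. Comparing the sets themselves rather than their sizes would be a stricter follow-up check.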

kellynntan commented 2 months ago

Hi, I am having the same issue. I tried to run the above but was given the error

vamb/workflow_avamb/src/write_clusters_from_dereplicated_and_ripped_bins.sh: line 29: avamb_outdir/: Is a directory

Is there a possible solution?