ruppinlab / CSI-Microbes-identification

Code for the identification step for CSI-Microbes

Pipeline Requires a Workload Manager #14

Closed by DarioS 3 months ago

DarioS commented 9 months ago

The pipeline can't be run without a workload manager, but the Mathematics and Statistics department has four Debian servers with 48 CPUs and 768 GB RAM each.

wir963 commented 9 months ago

Hey @DarioS ,

What do you mean by a workload manager? Do you mean a cluster job scheduler like SLURM? Our assumption was that most people would use CSI-Microbes with SLURM, but you can run Snakemake in local mode by removing the --cluster-config and --cluster options. Let me know if that works.
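As a rough illustration of what removing those two options means, the sketch below filters them (and their values) out of an argument list. The function name `strip_cluster_opts` and the example arguments (`cluster.json`, `sbatch --mem=4g`, `--latency-wait 60`) are hypothetical and not taken from the repository; the real arguments live in scripts/run-snakemake.sh. The output is a space-joined string meant for inspection, so arguments that themselves contain spaces are not preserved as separate words.

```shell
#!/bin/sh
# Sketch: drop --cluster and --cluster-config (plus their values) from a
# list of Snakemake arguments, leaving what a local-mode run would use.
# strip_cluster_opts is a hypothetical helper, not part of CSI-Microbes.
strip_cluster_opts() {
    out=""
    while [ "$#" -gt 0 ]; do
        case "$1" in
            --cluster|--cluster-config)
                # Flag followed by a separate value: drop both.
                shift
                if [ "$#" -gt 0 ]; then shift; fi
                ;;
            --cluster=*|--cluster-config=*)
                # Flag and value joined with "=": drop the single word.
                shift
                ;;
            *)
                # Anything else is kept, joined by single spaces.
                out="${out:+$out }$1"
                shift
                ;;
        esac
    done
    printf '%s\n' "$out"
}

# Hypothetical cluster-mode arguments, for illustration only.
strip_cluster_opts --use-conda --cluster-config cluster.json \
    --cluster "sbatch --mem=4g" --latency-wait 60
# prints: --use-conda --latency-wait 60
```

In local mode Snakemake also needs a core count, for example `--cores 4`, in addition to whatever arguments remain.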

Best, Welles

DarioS commented 9 months ago

It is more ingrained than that.

$ ./scripts/run-build-PathSeq-microbes-files.sh
./scripts/run-build-PathSeq-microbes-files.sh: line 3: sbatch: command not found

After removing sbatch --time=4-00:00:00 --cpus-per-task=4 --mem=4g --partition=norm,ccr from that script:

scripts/run-snakemake.sh: line 3: conda: command not found
scripts/run-snakemake.sh: line 5: snakemake: command not found

So, there are quite a lot of dependencies to manually install.
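The errors above name three commands the scripts expect on the PATH: sbatch, conda, and snakemake. A small pre-flight check along the following lines could report every missing command at once instead of failing one script line at a time. This is only a sketch: `missing_commands` is a hypothetical helper, and the list of commands comes from the error messages in this thread, not from the full dependency list in the README.

```shell
#!/bin/sh
# Sketch: print the subset of the given commands that are not on PATH.
# missing_commands is a hypothetical helper, not part of CSI-Microbes.
missing_commands() {
    missing=""
    for cmd in "$@"; do
        # command -v succeeds only if the shell can resolve the name.
        if ! command -v "$cmd" >/dev/null 2>&1; then
            missing="${missing:+$missing }$cmd"
        fi
    done
    printf '%s\n' "$missing"
}

# Commands taken from the "command not found" errors in this thread.
not_found=$(missing_commands sbatch conda snakemake)
if [ -n "$not_found" ]; then
    echo "Missing commands: $not_found"
else
    echo "All checked commands are available."
fi
```

On a machine without a scheduler, sbatch would be listed as missing even though a local-mode Snakemake run does not need it.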

$ lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux 12 (bookworm)
Release:        12
Codename:       bookworm

The only other server that I have access to is a cluster using PBS Professional. Have you evaluated the implementation of Vulture?

wir963 commented 9 months ago

Hey @DarioS,

Yep, you are correct that there are several dependencies that have to be installed manually. You can find the entire list of dependencies in the README.

I hadn't seen Vulture before (it looks like it was just published last month), but it looks really cool and I'll look into it. If we were starting this project today, we would definitely use more Singularity containers and possibly Nextflow instead of Snakemake, but unfortunately I don't anticipate those features being officially supported in the near future.

Best, Welles