Snakemake-Profiles / lsf

Snakemake profile for running jobs on an LSF cluster
MIT License

job was done but no log was created and no bjobs show up when the job is running #26

Closed liutiming closed 4 years ago

liutiming commented 4 years ago

I think it could be because not all the params required by bsub are in the profile, e.g. -G. How can I provide those?

mbhall88 commented 4 years ago

Hi @liutiming I'll need a little more info in order to try and figure out what the issue is.

How did you submit your pipeline? Were there any errors/warnings in the main snakemake job's log file?

When you set this profile up, one of the options you configure is the location for the cluster logs, which is logs/cluster by default. Quoting from the docs:

This sets the directory under which cluster log files are written. The path is relative to the working directory of the pipeline. If it does not exist, it will be created.

Additionally, -G is not an option that bsub requires (unless your admin has made it so), but any parameter that is not configured directly during the profile setup can be added in the cluster configuration file for the pipeline.

liutiming commented 4 years ago

is there a way for me to check where my customized log folder is? sorry didn't find it in the documentation!

mbhall88 commented 4 years ago

The location you gave to default_cluster_logdir can be found in ~/.config/snakemake/lsf/CookieCutter.py (provided you named your profile lsf) as the return value of the get_log_dir() method.
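
If you would rather not open the file, the same value can be read programmatically. The following is a minimal sketch, assuming the profile is named lsf and lives under ~/.config/snakemake; the helper name resolve_cluster_log_dir is hypothetical and not part of the profile.

```python
import importlib.util
from pathlib import Path


def resolve_cluster_log_dir(profile_dir: Path, workdir: Path) -> Path:
    """Return the cluster log directory configured in a profile.

    Hypothetical helper (not part of the lsf profile): it loads
    CookieCutter.py from profile_dir and joins the relative log
    directory onto workdir, the directory the pipeline is run from.
    """
    cookie_cutter_path = profile_dir / "CookieCutter.py"
    spec = importlib.util.spec_from_file_location("CookieCutter", cookie_cutter_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    # get_log_dir() is a static method, so no instance is needed.
    return workdir / module.CookieCutter.get_log_dir()


if __name__ == "__main__":
    # Assumed default location for a profile named "lsf".
    profile = Path.home() / ".config" / "snakemake" / "lsf"
    print(resolve_cluster_log_dir(profile, Path.cwd()))
```

Run it from the directory you launch snakemake in, since the log path is relative to the pipeline's working directory.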

liutiming commented 4 years ago

hi @mbhall88 sorry for the late reply!

I submitted the job with snakemake --profile lsf -s extraction.smk, but I cannot find logs/cluster under the directory I ran it from, nor in the home directory...

the CookieCutter.py looks like the following:

class CookieCutter:
    """
    Cookie Cutter wrapper
    """

    @staticmethod
    def get_default_threads() -> int:
        return int("1")

    @staticmethod
    def get_default_mem_mb() -> int:
        return int("30000")

    @staticmethod
    def get_log_dir() -> str:
        return "logs/cluster"

    @staticmethod
    def get_default_queue() -> str:
        return "normal"

    @staticmethod
    def get_lsf_unit_for_limits() -> str:
        return "MB"

mbhall88 commented 4 years ago

That's very strange. Can you please send me the log for your snakemake master process? And the output from ls -l on the directory you ran it from (and where the Snakefile is if it's in a different directory).

liutiming commented 4 years ago

Hi @mbhall88 I have replied via email. Thanks a lot!

liutiming commented 4 years ago

Hi @mbhall88 the gist for lsf.yaml is here :)

liutiming commented 4 years ago

I think snakemake 5.3 was installed because of this issue

I do not have admin rights to install mamba as the current documentation describes. Will that cause any bugs? I am trying to get my admin to install it for me.

mbhall88 commented 4 years ago

Hi @mbhall88 the gist for lsf.yaml is here :)

There is an issue on line 3 of that YAML: double quotes are nested inside a double-quoted string. Replace the internal quotes with single quotes ' instead, so

- "-R "select[mem>30000] rusage[mem=30000]" -M 30000"
# becomes
- "-R 'select[mem>30000] rusage[mem=30000]' -M 30000"
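
To see why the single-quoted form is safe, it can help to look at how a POSIX-style shell would split the value once YAML has parsed it. This is a minimal sketch using Python's shlex module; whether the profile tokenizes its arguments in exactly this way is an assumption.

```python
import shlex

# The corrected YAML value, i.e. the string a YAML parser would return.
fixed = "-R 'select[mem>30000] rusage[mem=30000]' -M 30000"

# POSIX-style splitting keeps the single-quoted resource string together,
# so bsub would receive it as one argument to -R.
tokens = shlex.split(fixed)
print(tokens)
# ['-R', 'select[mem>30000] rusage[mem=30000]', '-M', '30000']
```
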

liutiming commented 4 years ago

yup thanks I think it is working now! Job submitted and no error was shown in the log.

mbhall88 commented 4 years ago

Fantastic! Glad it's sorted now. Reopen if you do run into any more problems related to this issue.

liutiming commented 3 years ago

Hi @mbhall88 I went away for my exam preparation, so I did not thoroughly test the command, but it has now run into a strange issue. Basically, a very small task is taking forever to run, and the log is here. I would really appreciate it if you could take a look. Should I create a new issue instead?

liutiming commented 3 years ago

I've checked that submitting a job with the following script works fine, so it shouldn't be an LSF issue?

# $1: script to run, $2: job name; log_dir is assumed to be set in the environment
script=$1
timestamp=$(date +"%Y%m%d_%H%M")

bsub -J "$2" \
-o "$log_dir/${timestamp}_LSF_job_output.%J.log" \
-e "$log_dir/${timestamp}_LSF_job_errorfile.%J.log" \
-q normal \
-G team281 \
-R "select[mem>30000] rusage[mem=30000]" -M 30000 \
"sh $script"
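
For comparison, the same submission can be written as an argument list, which sidesteps shell quoting entirely. This is only a sketch of the script above in Python: the function name build_bsub_command and its parameters are hypothetical, and log_dir has to be supplied by the caller because the script does not define it.

```python
import shlex
from datetime import datetime
from typing import List, Optional


def build_bsub_command(script: str, job_name: str, log_dir: str,
                       group: str = "team281", queue: str = "normal",
                       mem_mb: int = 30000,
                       now: Optional[datetime] = None) -> List[str]:
    """Build the bsub argument list used by the shell script (hypothetical helper)."""
    timestamp = (now or datetime.now()).strftime("%Y%m%d_%H%M")
    return [
        "bsub",
        "-J", job_name,
        "-o", f"{log_dir}/{timestamp}_LSF_job_output.%J.log",
        "-e", f"{log_dir}/{timestamp}_LSF_job_errorfile.%J.log",
        "-q", queue,
        "-G", group,
        # The whole resource string is one list element, so no inner quotes are needed.
        "-R", f"select[mem>{mem_mb}] rusage[mem={mem_mb}]",
        "-M", str(mem_mb),
        f"sh {script}",
    ]


if __name__ == "__main__":
    # Example values only; print the command instead of submitting it.
    print(shlex.join(build_bsub_command("job.sh", "test_job", "logs")))
```
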

There is also this error file, but I am not sure what it is. It wasn't created at the same time as the snakemake log, though (about 10 min before). It could have been created by some other attempt to use snakemake.

liutiming commented 3 years ago

Somehow it is working again this morning 😆 . Will update again if it is not working. Thanks!

mbhall88 commented 3 years ago

I think this has something to do with snakemake trying to resume incomplete jobs. If you see this again, try running snakemake --rerun-incomplete