simonlabcode / bam2bakR

Error while run: snakedeploy deploy-workflow https://github.com/simonlabcode/bam2bakR.git . --branch main #4

Closed WangJingwen21 closed 1 year ago

WangJingwen21 commented 1 year ago

Thank you for your useful tools. I ran into an error while running: snakedeploy deploy-workflow https://github.com/simonlabcode/bam2bakR.git . --branch main

Obtaining source repository...
Traceback (most recent call last):
  File "/chenfeilab/Avocado/Softwares/miniconda3/envs/deploy_snakemake/bin/snakedeploy", line 10, in <module>
    sys.exit(main())
             ^^^^^^
  File "/chenfeilab/Avocado/Softwares/miniconda3/envs/deploy_snakemake/lib/python3.11/site-packages/snakedeploy/client.py", line 237, in main
    deploy(
  File "/chenfeilab/Avocado/Softwares/miniconda3/envs/deploy_snakemake/lib/python3.11/site-packages/snakedeploy/deploy.py", line 136, in deploy
    sd.deploy(name=name, tag=tag, branch=branch)
  File "/chenfeilab/Avocado/Softwares/miniconda3/envs/deploy_snakemake/lib/python3.11/site-packages/snakedeploy/deploy.py", line 66, in deploy
    self.provider.clone(tmpdir)
  File "/chenfeilab/Avocado/Softwares/miniconda3/envs/deploy_snakemake/lib/python3.11/site-packages/snakedeploy/providers.py", line 85, in clone
    sp.run(["git", "clone", self.source_url, "."], cwd=tmpdir, check=True)
  File "/chenfeilab/Avocado/Softwares/miniconda3/envs/deploy_snakemake/lib/python3.11/subprocess.py", line 548, in run
    with Popen(*popenargs, **kwargs) as process:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/chenfeilab/Avocado/Softwares/miniconda3/envs/deploy_snakemake/lib/python3.11/subprocess.py", line 1026, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/chenfeilab/Avocado/Softwares/miniconda3/envs/deploy_snakemake/lib/python3.11/subprocess.py", line 1950, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'git'

isaacvock commented 1 year ago

I apologize for the delay in getting back to you; I didn't see the notification about this issue report.

Do you have Git installed on the system that you are deploying the workflow to? If you don't, that could explain the problem.
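
One quick way to check is from the same Python environment that runs snakedeploy, since the traceback shows it calling `git` through `subprocess`. The snippet below is only a minimal sketch; the helper name `find_executable` is made up for illustration and is not part of snakedeploy.

```python
import shutil


def find_executable(name="git"):
    """Return the full path of `name` if it is on PATH, otherwise None."""
    # shutil.which searches PATH much as the shell would when resolving a command.
    return shutil.which(name)


if __name__ == "__main__":
    path = find_executable("git")
    if path is None:
        print("git was not found on PATH; install Git and retry the deployment")
    else:
        print(f"git found at {path}")
```

If this reports that git is missing, the `FileNotFoundError` above is expected, because `subprocess.run` cannot start a program that is not on PATH.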

WangJingwen21 commented 1 year ago

Thank you for your reply. Problem solved. However, another error was raised:

[Thu Jun 15 13:55:39 2023]
rule htseq_cnt:
    input: results/sf_reads/dTag3_ctl.s.sam
    output: results/htseq/dTag3_ctl_tl.bam, results/htseq/dTag3_ctl_check.txt
    log: logs/htseq_cnt/dTag3_ctl.log
    jobid: 11
    reason: Missing output files: results/htseq/dTag3_ctl_tl.bam; Input files updated by another job: results/sf_reads/dTag3_ctl.s.sam
    wildcards: sample=dTag3_ctl
    threads: 160
    resources: tmpdir=/tmp

Activating conda environment: .snakemake/conda/324d014bbc3294e9d150599b4c4bd479_
Waiting at most 5 seconds for missing files.
MissingOutputException in rule htseq_cnt in file https://raw.githubusercontent.com/simonlabcode/bam2bakR/main/workflow/rules/bam2bakr.smk, line 86:
Job 11 completed successfully, but some output files are missing. Missing files after 5 seconds. This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait:
results/htseq/dTag3_ctl_tl.bam
Removing output files of failed job htseq_cnt since they might be corrupted:
results/htseq/dTag3_ctl_check.txt
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2023-06-15T133326.564062.snakemake.log

WangJingwen21 commented 1 year ago

Messages before this error:

[Thu Jun 15 13:52:15 2023]
Finished job 4.
11 of 34 steps (32%) done
[Thu Jun 15 13:55:39 2023]
Finished job 25.
12 of 34 steps (35%) done

WangJingwen21 commented 1 year ago

I got some files in the results folder:

isaacvock commented 1 year ago

I'm sorry you ran into another problem running the pipeline. Can you share the content of one of the .log files in the logs/htseq_cnt/ directory?

WangJingwen21 commented 1 year ago

Thank you, Isaac. There is one file in the 'logs/htseq_cnt' directory: dTag3_ctl.log

isaacvock commented 1 year ago

Thank you for sending the log file. It looks like the problem has to do with installation of dependencies. These particular errors are not ones I've run into before, so I apologize for not being able to quickly diagnose the source of the problem. If you could also provide the following information, that would be very helpful:

1) A log file from the logs/sort_filter/ directory, to make sure that step didn't have equivalent dependency problems
2) Your config.yaml file
3) The code you ran to start the pipeline (I imagine it's something similar to snakemake --cores all --use-conda)
4) Any details about the system you are running the pipeline on (particularly the operating system and whether it's an HPC cluster or cloud computing resource)
5) If possible, a text file with all of the output that was printed to the terminal when you ran snakemake. I'm particularly interested in seeing if anything went wrong during dependency installation, i.e. the very first messages that were output when conda environments were being built.

I apologize for burdening you with providing all of this information, but it will all be immensely helpful in debugging the problem.

WangJingwen21 commented 1 year ago

Thank you so much. I hope this helps.

1) All the log files in the logs/sort_filter directory are the same. One of them is logs/sort_filter/dTag3_ctl.log: dTag3_ctl.log
2) I zipped my config.yaml file because GitHub does not accept .yaml files: config.yaml.zip
3) My command: snakemake --cores all --use-conda. I used the command from your documentation since I am not familiar with Snakemake.
4) I was running the pipeline on our HPC cluster. Our system version is Linux version 3.10.0-1160.49.1.el7.x86_64 (mockbuild@kbuilder.bsys.centos.org) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-44) (GCC) ) #1 SMP Tue Nov 30 15:51:32 UTC 2021. We didn't install the Slurm system, so for my purposes it behaves like a standalone computer. If you need any particular information, I can ask our Server Engineer.
5) Messages printed after I ran 'snakemake --cores all --use-conda': printed.log

isaacvock commented 1 year ago

Thank you for getting me all of this information; it was incredibly helpful! It seems there are previous reports of the same error that you ran into (see here for example). I just pushed a small change to the GitHub repository that I think will fix your problem. You can run Snakemake exactly as you did before. The patch splits the provided cores across multiple separate jobs (i.e., separate runs of the HTSeq script for each provided bam file), which will hopefully not only fix the problem but also speed up the pipeline for you.

The problem seems to be that you have access to an impressive number of cores, but Python can run into problems when you try to parallelize it across too many cores. If the patch that I implemented does not fix the problem you ran into, I would suggest passing fewer cores to Snakemake, for example by running it with snakemake --cores 20 --use-conda rather than snakemake --cores all --use-conda.
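
As a rough illustration of that fallback (the failing htseq_cnt job in the log above was given 160 threads), the sketch below caps the core count before building the Snakemake command. The cap of 20 simply mirrors the example above, and the helper names `capped_cores` and `snakemake_command` are hypothetical; they are not part of bam2bakR or Snakemake, and this is not the patch itself.

```python
import os


def capped_cores(cap=20, available=None):
    """Return the number of cores to request, never more than `cap`."""
    if available is None:
        # os.cpu_count() can return None, so fall back to a single core.
        available = os.cpu_count() or 1
    return max(1, min(cap, available))


def snakemake_command(cores):
    """Build the command line used in this thread with an explicit core count."""
    return ["snakemake", "--cores", str(cores), "--use-conda"]


if __name__ == "__main__":
    # Print the command rather than running it, so it can be reviewed first.
    print(" ".join(snakemake_command(capped_cores())))
```

On a machine with 160 cores this would print the same command as the example above, while on a smaller machine it would request only the cores that are actually available.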

Let me know if none of the proposed solutions work for you and we can try something else.

isaacvock commented 1 year ago

Sorry, I initially pushed the change to the wrong branch, but the patch is now pushed to the main branch.

WangJingwen21 commented 1 year ago

Hi, Isaac. Thank you so much for your help! It worked: 34 of 34 steps (100%) done.

isaacvock commented 1 year ago

Fantastic!