Closed WangJingwen21 closed 1 year ago
Thank you for your useful tools. I ran into an error while running: snakedeploy deploy-workflow https://github.com/simonlabcode/bam2bakR.git . --branch main
Obtaining source repository...
Traceback (most recent call last):
File "/chenfeilab/Avocado/Softwares/miniconda3/envs/deploy_snakemake/bin/snakedeploy", line 10, in
I apologize for the delay in getting back to you; I didn't see the notification about this issue report.
Do you have Git installed on the system that you are deploying the workflow to? If you don't, that could explain the problem.
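A quick way to check is sketched below. The helper name have_cmd is hypothetical and is only there for illustration; it tests whether a command can be found on the PATH of the environment you run snakedeploy from.

```shell
# Hypothetical helper: succeeds if the named command is found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Report whether git is visible in the current environment.
if have_cmd git; then
    git --version
else
    echo "git not found on PATH"
fi
```

If git is reported as missing, installing it into the same environment that snakedeploy runs in (for example through your system package manager or conda) should be enough, assuming the missing executable is indeed the cause.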
Thank you for your reply. Problem solved. However, another error was raised:
[Thu Jun 15 13:55:39 2023]
rule htseq_cnt:
    input: results/sf_reads/dTag3_ctl.s.sam
    output: results/htseq/dTag3_ctl_tl.bam, results/htseq/dTag3_ctl_check.txt
    log: logs/htseq_cnt/dTag3_ctl.log
    jobid: 11
    reason: Missing output files: results/htseq/dTag3_ctl_tl.bam; Input files updated by another job: results/sf_reads/dTag3_ctl.s.sam
    wildcards: sample=dTag3_ctl
    threads: 160
    resources: tmpdir=/tmp
Activating conda environment: .snakemake/conda/324d014bbc3294e9d150599b4c4bd479_
Waiting at most 5 seconds for missing files.
MissingOutputException in rule htseq_cnt in file https://raw.githubusercontent.com/simonlabcode/bam2bakR/main/workflow/rules/bam2bakr.smk, line 86:
Job 11 completed successfully, but some output files are missing. Missing files after 5 seconds. This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait:
results/htseq/dTag3_ctl_tl.bam
Removing output files of failed job htseq_cnt since they might be corrupted:
results/htseq/dTag3_ctl_check.txt
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2023-06-15T133326.564062.snakemake.log
Messages before this error:
[Thu Jun 15 13:52:15 2023]
Finished job 4.
11 of 34 steps (32%) done
[Thu Jun 15 13:55:39 2023]
Finished job 25.
12 of 34 steps (35%) done
I got some files in the results folder:
I'm sorry you ran into another problem running the pipeline. Can you share the content of one of the .log files in the logs/htseq_cnt/ directory?
Thank you, Isaac. There is one file in the ‘logs/htseq_cnt’ directory, dTag3_ctl.log: dTag3_ctl.log
Thank you for sending the log file. It looks like the problem has to do with installation of dependencies, though these particular errors are not ones I've ever run into so I apologize for not being able to quickly diagnose the source of the problem. If you could also provide the following information, that would be very helpful:
1) A log file from the logs/sort_filter/ directory to make sure that step didn't have equivalent dependency problems
2) Your config.yaml file
3) The code you ran to start the pipeline (I imagine it's something similar to snakemake --cores all --use-conda)
4) Any details about the system you are running the pipeline on (particularly the operating system and whether it's an HPC cluster or cloud computing resource)
5) If possible, a text file with all of the output that was printed to the terminal when you ran snakemake. I'm particularly interested in seeing if anything went wrong during dependency installation, i.e. the very first messages that were output when conda environments were being built.
I apologize for burdening you with providing all of this information, but it will all be immensely helpful in debugging the problem.
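For item 5, one way to capture everything printed to the terminal is sketched below. The helper name run_and_log and the file name snakemake_output.txt are hypothetical; the helper runs whatever command it is given, shows the output as usual, and saves a copy to a file.

```shell
# Hypothetical helper: run a command, show its output, and save a copy.
# $1 is the log file; the remaining arguments are the command to run.
run_and_log() {
    logfile="$1"
    shift
    # 2>&1 merges stderr into stdout so that error messages are captured too.
    "$@" 2>&1 | tee "$logfile"
}

# Usage with the invocation assumed above:
# run_and_log snakemake_output.txt snakemake --cores all --use-conda
```

The resulting text file can then be attached to the issue directly.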
Thank you so much. I hope this would help.
2. I zipped my config.yaml file because GitHub does not accept .yaml files. config.yaml.zip
3. My code: snakemake --cores all --use-conda. I used the code from your documentation since I am not familiar with Snakemake.
4. I was running the pipeline on our HPC cluster. Our system version is Linux version 3.10.0-1160.49.1.el7.x86_64 (mockbuild@kbuilder.bsys.centos.org) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-44) (GCC) ) #1 SMP Tue Nov 30 15:51:32 UTC 2021. We haven't installed Slurm. From my point of view, it is essentially a standalone computer. If you need any particular information, I can ask our Server Engineer.
5. Messages printed after I ran 'snakemake --cores all --use-conda': printed.log
Thank you for getting me all of this information; it was incredibly helpful! It seems like there are previous reports of the same error that you have run into (see here for example). I just pushed a small change to the GitHub repository that I think will fix your problems. You can run Snakemake exactly as you did before. The patch will split up the provided cores across multiple separate jobs (i.e., separate runs of the HTSeq script for each provided bam file), which will hopefully not only fix the problem but also speed up the pipeline for you.
The problem seems to be that you have access to an impressive number of cores, but Python can run into problems when you try to parallelize it across too many cores. If the patch that I implemented does not fix the problem you ran into, I would suggest passing fewer cores to Snakemake, for example by running it with snakemake --cores 20 --use-conda rather than snakemake --cores all --use-conda
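If you would rather not hard-code the number, a small sketch like the following caps the core count at a chosen limit. The function name capped_cores is hypothetical, and the limit of 20 is just the example value from above, not a tested threshold.

```shell
# Hypothetical helper: print the smaller of the available core count ($1)
# and a chosen cap ($2).
capped_cores() {
    available="$1"
    cap="$2"
    if [ "$available" -lt "$cap" ]; then
        echo "$available"
    else
        echo "$cap"
    fi
}

# Usage: nproc (GNU coreutils) reports the number of available cores.
# snakemake --cores "$(capped_cores "$(nproc)" 20)" --use-conda
```

On a machine with 160 cores this passes 20 to Snakemake, while on a smaller machine it passes whatever is available.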
Let me know if none of the proposed solutions work for you and we can try something else.
Sorry, I initially pushed the change to the wrong branch, but the patch is now pushed to the main branch.
Hi Isaac, thank you so much for your help! It worked. 34 of 34 steps (100%) done.
Fantastic!