moiexpositoalonsolab / grenepipe

A flexible, scalable, and reproducible pipeline to automate variant calling from raw sequence reads, with lots of bells and whistles.
http://grene-net.org
GNU General Public License v3.0

MissingRuleException #31

Closed mrese001 closed 1 year ago

mrese001 commented 1 year ago

Hello Lucas,

I am a bioinformatics beginner so I apologize in advance if my question is a trivial one: I am getting this error message once I try to run the pipeline:

```
Building DAG of jobs...
MissingRuleException: No rule to produce example (if you use input functions make sure that they don't raise unexpected exceptions).
```

What does this mean exactly and how can I possibly fix it?

Thank you! Mariano

lczech commented 1 year ago

Hi Mariano,

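what is the exact command that you ran there? It looks like you are missing something there, such as specifying the --directory option? See https://github.com/moiexpositoalonsolab/grenepipe/wiki/Quick-Start-and-Full-Example#running-the-pipeline and https://github.com/moiexpositoalonsolab/grenepipe/wiki/Advanced-Usage#working-directory for more information on that :-)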

Cheers Lucas

mrese001 commented 1 year ago

Hey Lucas!

Apologies for the late reply but I wanted to make sure that I come to you with a more informed question. With regard to my last question I had the support team here at the computing cluster department help me and I just needed some more packages to upload so I could initialize the environment correctly. Once we initialized the environment we ran the Arabidopsis example and it completed without any issues - good news there!

Now that I have the gist of how the files are organized I am attempting to run two fastq.gz files against a reference. I attempted to run your script to create the samples.tsv file but I get this error message: [image: image.png]

Is there anything inside of the generate-table.py script that I should change to properly point grenepipe to the proper directory so it can properly develop the table? Is there some way to generate the table manually?

Below is how I setup my table but what I need to change in order to run grenepipe is the 5th column where it's pointing to the example directory. Since I am running these two citrus trees through the pipeline would I even need the 5th column since they are single-end reads? @A01535:218:HLYGKDSX5:1:1101:24758:1000 1:N:0:GACGAGATTA+AGGATAATGT [image: image.png]

Many thanks for taking the time to read this and for your advice,

Mariano

On Fri, Apr 7, 2023 at 9:49 AM Lucas Czech @.***> wrote:

Hi Mariana,

what is the exact command that you ran there? It looks like you are missing something there, such as specifying the --directory option? See here https://github.com/moiexpositoalonsolab/grenepipe/wiki/Quick-Start-and-Full-Example#running-the-pipeline and here https://github.com/moiexpositoalonsolab/grenepipe/wiki/Advanced-Usage#working-directory for more information on that :-)

Cheers Lucas


lczech commented 1 year ago

Hey Mariano,

the image that you tried to attach to your message did not get posted here, see above. Can you please post it again here on GitHub directly, instead of answering the thread via email?

I've also seen the email that you sent me, which seems to be about a related issue?! Please use the GitHub issues here for asking these questions, as that will help others with similar issues to find solutions as well.

Assuming that both your issue here and your email are about the same problem, I'll answer here.

> Is there anything inside of the generate-table.py script that I should change to properly point grenepipe to the proper directory so it can properly develop the table? Is there some way to generate the table manually?

All of this is explained in the wiki: https://github.com/moiexpositoalonsolab/grenepipe/wiki/Setup-and-Usage#samples-table The script accepts the path to where it should look for fastq files, so you don't need to edit it to point to the directory. You can of course also create the table manually, as explained in the wiki as well.

> Below is how I setup my table but what I need to change in order to run grenepipe is the 5th column where it's pointing to the example directory. Since I am running these two citrus trees through the pipeline would I even need the 5th column since they are single-end reads?

The column fq2 should always be present in the table (I might relax that in the future, but it does not seem to be an urgent fix), but its fields can be left empty for single-end reads. If you can share your table here, I can also have a look at what's wrong with it.
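As a minimal sketch of what a hand-written table could look like: the snippet below writes a tab-separated table for two single-end samples and leaves fq2 empty. Only the fq2 column is named in this thread; the other column names (sample, unit, platform, fq1), the sample names, and the file paths are assumptions for illustration, so please check the wiki page linked above for the exact layout that your grenepipe version expects.

```python
import csv
import io

# Assumed column layout; only `fq2` is named in this thread.
COLUMNS = ["sample", "unit", "platform", "fq1", "fq2"]


def write_samples_table(rows, handle):
    """Write rows as a tab-separated table; fq2 stays present but may be empty."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(COLUMNS)
    for row in rows:
        # A missing fq2 (single-end) becomes an empty field, not a missing column.
        writer.writerow([row.get(col, "") for col in COLUMNS])


# Hypothetical single-end samples; replace with your own names and paths.
rows = [
    {"sample": "S1", "unit": "1", "platform": "ILLUMINA", "fq1": "/path/to/tree1.fastq.gz"},
    {"sample": "S2", "unit": "1", "platform": "ILLUMINA", "fq1": "/path/to/tree2.fastq.gz"},
]

buffer = io.StringIO()
write_samples_table(rows, buffer)
table_text = buffer.getvalue()
print(table_text)
```

Writing the table with a tab delimiter from code, rather than typing it in an editor, avoids accidentally mixing spaces and tabs.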

And from your email:

> .. but now that I am doing my own samples, it gives me errors, which I will share here. The errors, I think, stem from the way I am organizing my directory to run the grenepipe but I would appreciate some clarity.

The error message of which you sent me a picture in that email contains the hint that you are looking for:

Expected 3 fields in line 3, saw 4

Without having access to your samples table, I can't tell for sure, but my guess is that the number of items per row is not consistent, for example because you forgot a column header, or have tab characters somewhere where they should not be. Please check that your table follows the format described in the wiki.
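To narrow down which row is off, a small check along these lines might help. It is only a sketch, not part of grenepipe: the function name `check_table` and the broken example table are made up for illustration, and the check simply compares the number of tab-separated fields in each row against the header.

```python
def check_table(text, delimiter="\t"):
    """Return (line_number, problem) pairs for rows that do not match the header."""
    problems = []
    lines = text.splitlines()
    if not lines:
        return [(0, "table is empty")]
    expected = len(lines[0].split(delimiter))
    for number, line in enumerate(lines[1:], start=2):
        if not line.strip():
            continue  # ignore blank lines
        found = len(line.split(delimiter))
        if found != expected:
            problems.append((number, f"expected {expected} fields, saw {found}"))
    return problems


# Hypothetical broken table: spaces were typed in some places where tabs belong,
# so the header and the first row only have 3 tab-separated fields.
broken = (
    "sample\tunit platform\tfq1 fq2\n"
    "S1\t1 ILLUMINA\ta.fastq.gz \n"
    "S2\t1\tILLUMINA\tb.fastq.gz\n"
)

problems = check_table(broken)
print(problems)
```

In this made-up example, the report points at line 3, in the same way as the error message above, although the actual mistake (spaces instead of tabs) is already in the header line.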

Hope that helps, and so long Lucas

mrese001 commented 1 year ago

Hey Lucas!

Again, apologies for the late reply - I've been troubleshooting but overlooked an important comment that you made. You were correct in your evaluation of the last error message, and that has been my main issue so far. For some reason I had spaces instead of the required tabs in the header and file fields. The workflow is processing now, and I will update here if I run into any issues.

Many thanks for your patience! Mariano

mrese001 commented 1 year ago

Hey Lucas,

Although the workflow did initialize and create some files (trimmed, logs, qc, benchmarks) at some point the job failed when it was attempting to map reads. Here is the error of the workflow:

```
Error in rule map_reads:
    jobid: 22
    output: mapped/S1-1.sorted.bam, mapped/S1-1.sorted.done
    log: logs/bwa-mem/S1-1.log (check log file(s) for error message)

RuleException:
CalledProcessError in line 58 of /bigdata/seymourlab/mrese001/grenepipe-0.12.0/rules/mapping-bwa-mem.smk:
Command 'set -euo pipefail; /rhome/mrese001/.conda/envs/grenepipe/bin/python3.7 /bigdata/seymourlab/mrese001/grenepipe-0.12.0/my_citrus_analysis2/.snakemake/scripts/tmpwgqzrpbu.wrapper.py' returned non-zero exit status 1.
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2293, in run_wrapper
  File "/bigdata/seymourlab/mrese001/grenepipe-0.12.0/rules/mapping-bwa-mem.smk", line 58, in __rule_map_reads
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 568, in _callback
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/concurrent/futures/thread.py", line 57, in run
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 554, in cached_or_run
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2359, in run_wrapper
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /bigdata/seymourlab/mrese001/grenepipe-0.12.0/my_citrus_analysis2/.snakemake/log/2023-05-10T113431.107190.snakemake.log
```

Here are the details of the error message inside log/bwa:

```
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 274716 sequences (40000193 bp)...
[W::sam_hdr_create] Ignored @SQ SN:scaffold_2 : bad or missing LN tag
[E::sam_hrecs_update_hashes] Header includes @SQ line "scaffold_2" with no LN: tag
[E::sam_hrecs_update_hashes] Header includes @SQ line "scaffold_2" with no LN: tag
samtools sort: failed to change sort order header to 'SO:coordinate'
```

Many thanks as always, Mariano

lczech commented 1 year ago

Hi Mariano,

ha! Progress! Going from error to error until it's fixed :-)

So, this is a new problem that I have not seen before. It's happening in an internal step that I did not even code myself :-D It might be a bit tricky to figure out where this is coming from. Let's see:

If none of that helps or fixes it: How large is your dataset? If you could send me the ref genome, and one sample for which the error occurs, along with the config file that you are using, I can try to re-create the issue for further debugging.
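The log above complains about an @SQ header line for scaffold_2 that has no LN tag. As a rough sketch for checking a header for such lines (not something grenepipe does itself, and it will not tell you why the tag is missing): the function below scans SAM header text and lists the sequences whose @SQ line lacks the LN tag. The function name and the example header are made up for illustration.

```python
def sq_lines_missing_ln(header_text):
    """Return the SN names of @SQ header lines that lack an LN (length) tag."""
    missing = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # After the record type, header fields are tab-separated TAG:VALUE pairs.
        tags = dict(
            field.split(":", 1) for field in line.split("\t")[1:] if ":" in field
        )
        if "LN" not in tags:
            missing.append(tags.get("SN", "<no SN tag>"))
    return missing


# Hypothetical header; the scaffold_2 record mimics the one from the log above.
header = "\n".join([
    "@HD\tVN:1.6\tSO:unsorted",
    "@SQ\tSN:scaffold_1\tLN:1000",
    "@SQ\tSN:scaffold_2",
])

missing = sq_lines_missing_ln(header)
print(missing)
```

If sequences show up here, the reference or its index files are a likely place to look next.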

Cheers and so long Lucas

mrese001 commented 1 year ago

Hi Lucas,

Thanks for your response, you are correct this error was generated with my own dataset. In previous runs of the workflow I always added --use-conda but was advised by our cluster admin to omit this. So I have been running the workflow like this: snakemake --cores 4 --verbose --directory . When I do add --use-conda I get this error:

```
CreateCondaEnvironmentException: Could not create conda environment from /bigdata/seymourlab/mrese001/grenepipe-0.12.0/rules/../envs/picard.yaml:
Collecting package metadata (repodata.json): ...working...
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/deployment/conda.py", line 389, in create
```

I am currently module loading samtools 1.17. As for the index files ... I do have .fai files in the same directory that the .yaml file is pulling the references from. Do I need to do anything more apart from keeping them in the same directory? Apologies if I missed this from the Wiki. Since I'm running this on a MacBook I'm looking for the Activity Monitor GUI correct? Or did you mean within the terminal? I am running this through a cluster environment.

I will send you the files you've asked for via email. Thanks again!

Mariano

lczech commented 1 year ago

Hi Mariano,

> Thanks for your response, you are correct this error was generated with my own dataset

Ah okay, does that mean you managed to run the example now?

> In previous runs of the workflow I always added --use-conda but was advised by our cluster admin to omit this.

Well, in that case, you'd have to set up every tool that is used in grenepipe on your own, ensuring that all of them are in the correct versions that we need... That seems like a rather large overhead, and I don't understand why your cluster admin advises that. Using conda is a bit more complicated on a cluster, because you will likely have to set it up locally on your own (as explained in the grenepipe wiki) - but that is likely still way easier than trying to get everything to run with the cluster module system. Up to you though. However, in that case, I won't be able to help you very much, as that would entail a lot of debugging on the cluster, I assume... There is a reason that grenepipe uses conda: dependencies are difficult, and conda makes that at least a bit easier (once you get past the initial trouble of getting conda to work). So, without conda, you'd have to figure this out with your cluster admin on your own. I'd hence strongly advise against this, and recommend trying to get conda/mamba to work instead.

> So I have been running the workflow like this: snakemake --cores 4 --verbose --directory

I'd recommend using mamba for the package management, by adding --conda-frontend mamba to the call. That requires that you have mamba set up, as also explained in the grenepipe wiki.

> When I do add --use-conda I get this error: CreateCondaEnvironmentException

Yes, that happened with conda to me before, for that particular environment. This was hopefully solved in grenepipe v0.12.0 though. Are you using this or a later version? If not, please download a more recent version of grenepipe, v0.12.0 or later! If this still does not work, that would be strange, but could happen. Let me know. Also, using mamba has solved the issue before, so if you are going to use mamba anyway, I guess that this will be solved as well.

> I am currently module loading samtools 1.17.

That's different from what grenepipe uses, but might still work. You'd need to test if this is the cause of the issue. If so, it will also be solved when you use conda/mamba to get the versions that grenepipe uses internally.

> As for the index files ... I do have .fai files in the same directory that the .yaml file is pulling the references from. Do I need to do anything more apart from keeping them in the same directory?

Nope, that should be it. I'd recommend letting grenepipe create them though, in case they were created before with some incompatible tool. So, to not delete any of your existing files, you could copy the ref genome fasta file (and just that one) to a new directory, and use that in the config.yaml instead. grenepipe will then create all index files it needs automatically in that directory as well.
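The copy step could be sketched like this, in case it is easier to script than to do by hand. The function name, directory names, and file names below are placeholders, and the demo runs in a temporary directory; afterwards, the reference genome entry in your config.yaml would need to point to the copied file.

```python
import shutil
import tempfile
from pathlib import Path


def copy_reference_only(fasta, target_dir):
    """Copy just the reference fasta into a fresh directory, leaving index files behind."""
    fasta = Path(fasta)
    target_dir = Path(target_dir)
    # Refuse to reuse an existing directory, so that no old index files sneak in.
    target_dir.mkdir(parents=True, exist_ok=False)
    return Path(shutil.copy2(fasta, target_dir / fasta.name))


# Demo with made-up files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    old = Path(tmp) / "old_ref"
    old.mkdir()
    (old / "genome.fa").write_text(">scaffold_1\nACGT\n")
    (old / "genome.fa.fai").write_text("scaffold_1\t4\t12\t4\t5\n")

    new_fasta = copy_reference_only(old / "genome.fa", Path(tmp) / "fresh_ref")
    copied = sorted(p.name for p in new_fasta.parent.iterdir())
    print(copied)
```

The existing directory stays untouched, so you can always go back to the old index files.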

> Since I'm running this on a MacBook I'm looking for the Activity Monitor GUI correct? Or did you mean within the terminal? I am running this through a cluster environment.

I am not sure that I understand. You are running grenepipe on a MacBook, but that is within a cluster environment? So, your cluster nodes are Macs? With the command that you gave above (snakemake --cores 4 --verbose --directory ...), it does not look like this is a cluster - you are missing the --profile. Please see the cluster page for details on that.

If you intend to run grenepipe locally on your own MacBook, you can use conda/mamba more easily, but it won't scale to large datasets. If you are using a cluster, I'd be surprised to see that you have a cluster of Macs. But in that case, you'd typically want to follow the wiki on how to use --profile, or at least run grenepipe in a tmux or screen session. Or did you mean to say that you use your Mac to connect to the cluster, and run things there? Again in that case you want to use --profile to actually make use of the cluster (otherwise, you'd be running the whole pipeline on the login node), as well as follow the other steps as explained in the cluster wiki. Please clarify ;-)
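To keep track of which options end up in the call, here is a small sketch that assembles the command from the flags mentioned in this thread (--cores, --use-conda, --conda-frontend, --profile, --verbose, --directory). The helper function and the example path are hypothetical, and which profile to use depends on your cluster, as described in the cluster wiki.

```python
import shlex


def build_snakemake_command(directory, cores=4, use_conda=True,
                            conda_frontend="mamba", profile=None, verbose=False):
    """Assemble a snakemake call from the options discussed in this thread."""
    cmd = ["snakemake"]
    if cores is not None:
        cmd += ["--cores", str(cores)]
    if use_conda:
        # --conda-frontend only matters when --use-conda is given as well.
        cmd += ["--use-conda", "--conda-frontend", conda_frontend]
    if profile is not None:
        cmd += ["--profile", profile]
    if verbose:
        cmd.append("--verbose")
    cmd += ["--directory", directory]
    return cmd


# Placeholder analysis directory; replace with your own.
cmd = build_snakemake_command("/path/to/my-analysis", cores=4)
command_line = " ".join(shlex.quote(arg) for arg in cmd)
print(command_line)
```

Printing the command before running it makes it easy to spot a missing --use-conda or --profile.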

Either way, as for the system monitor: That needs to be run on the computer that is actually executing the step that is causing trouble. If this is locally on your Mac, then yes, the Activity Monitor GUI seems to be the right tool. If you are running this on a cluster, it's more difficult, as you usually will have to ssh into the node where things are running, which you'd need to figure out first while it's running. I recommend asking your cluster admin for help on that. Alternatively, you can use the cluster log files to debug afterwards (once the crash has happened), as explained in the cluster wiki as well - there is a Troubleshooting example that specifically addresses the out-of-memory problem. Following this should give you an idea where the error is coming from.

Hope that helps, so long Lucas

mrese001 commented 1 year ago

Hi Lucas!

Yes, I was indeed able to run the Arabidopsis example, and since the last time we discussed this I solved the above issue with my cluster admin and put some citrus genomes through grenepipe! However, now that I am running the process for paired-end reads, I am receiving a different error. But before I get too far ahead of myself, let me clear up some points discussed before: I am now using --use-conda and --conda-frontend mamba in my troubleshooting, and am also using grenepipe v0.12.0. I did indeed mean to say that I use my Mac to connect to the cluster and run things there; sorry for being unclear.

Thank you for your suggestions and for pointing out that in order to use a bash job submit script I need a --profile. I did not have this at all as you've caught and this might be why I'm running into errors.

At the moment I am troubleshooting using tmux in an interactive node with your suggestions while I also troubleshoot with the bash script now with a profile appended. Will I need to specify a --profile when running the snakemake interactively?

When I ran it this time I opened a tmux session, got into an interactive cluster using srun:

```
srun -p seymourlab --mem 50G -N 1 -n 16 --time 10-00:00:00 --pty bash -l
```

then I conda activate grenepipe and module load these tools:

```
module load samtools
module load trimmomatic
module load fastqc
module load bwa
module load picard
module load gatk/4.3.0.0
module load bcftools
module load seqkit
```

and lastly I run:

```
snakemake --cores 16 --conda-frontend mamba --verbose --directory /rhome/mrese001/bigdata/GRENEPIPE/parents_draft
```

I have my entire error output here:

Errors

```
Date: 2023-07-11 14:40:40
Platform: Linux-4.18.0-348.12.2.el8_5.x86_64-x86_64-with-centos-8.5-Green_Obsidian #1 SMP Wed Jan 19 17:53:40 UTC 2022
Host: i55
User: mrese001
Conda: 22.9.0
Python: 3.7.10
Snakemake: 6.0.5
Grenepipe: 0.12.0
Conda env: grenepipe (/rhome/mrese001/.conda/envs/grenepipe)
Command: /rhome/mrese001/.conda/envs/grenepipe/bin/snakemake --cores 16 --conda-frontend mamba --verbose --directory /rhome/mrese001/bigdata/GRENEPIPE/parents_draft
Base directory: /bigdata/seymourlab/mrese001/GRENEPIPE
Working directory: /bigdata/seymourlab/mrese001/GRENEPIPE/parents_draft
Config file(s): /bigdata/seymourlab/mrese001/GRENEPIPE/parents_draft/config.yaml
Samples: 5

Building DAG of jobs...
Updating job merge_variants.
Replace merge_variants with dynamic branch merge_variants
updating depending job select_calls
updating depending job select_calls
updating depending job all
Updating job map_reads.
Replace map_reads with dynamic branch map_reads
updating depending job merge_sample_unit_bams
Updating job map_reads.
Replace map_reads with dynamic branch map_reads
updating depending job merge_sample_unit_bams
Updating job map_reads.
Replace map_reads with dynamic branch map_reads
updating depending job merge_sample_unit_bams
Updating job map_reads.
Replace map_reads with dynamic branch map_reads
updating depending job merge_sample_unit_bams
Updating job map_reads.
Replace map_reads with dynamic branch map_reads
updating depending job merge_sample_unit_bams
Using shell: /usr/bin/bash
Provided cores: 16
Rules claiming more threads will be scaled down.
Conda environments: ignored
Job counts:
    count  jobs
        1  all
        5  bam_index
        1  bcftools_stats
        1  bcftools_stats_plot
     6990  call_variants
     1398  combine_calls
        1  dedup_reports_collect
        1  fastqc
        1  fastqc_collect
        2  gatk_hard_filter_calls
     1398  genotype_variants
        5  map_reads
        5  mark_duplicates
        1  merge_calls
        5  merge_sample_unit_bams
        1  merge_variants
        1  multiqc
        5  picard_collectmultiplemetrics
        1  picard_collectmultiplemetrics_collect
        1  qualimap_collect
        5  qualimap_sample
        5  samtools_flagstat
        1  samtools_flagstat_collect
        5  samtools_stats
        1  samtools_stats_collect
        2  select_calls
        5  trim_reads_pe
        1  trimming_reports_collect
        5  trimmomatic_multiqc_log
     9854
Resources before job selection: {'_cores': 16, '_nodes': 9223372036854775807}
Ready jobs (6):
    trim_reads_pe
    trim_reads_pe
    trim_reads_pe
    fastqc
    trim_reads_pe
    trim_reads_pe
Select jobs to execute...
Selected jobs (3):
    fastqc
    trim_reads_pe
    trim_reads_pe
Resources after job selection: {'_cores': 3, '_nodes': 9223372036854775804}

[Tue Jul 11 14:40:54 2023]
rule fastqc:
    input: /rhome/mrese001/bigdata/GRENEPIPE/Parents_fastq/2023-03-21_FL_parents/Data/e23vf658w3/Un_DTSA720/Project_DSRP_NIFA_PARENTS/NIFA_35_S39_L004_R1_001.fastq.gz
    output: qc/fastqc/S1-1-R1_fastqc.html, qc/fastqc/S1-1-R1_fastqc.zip
    log: logs/fastqc/S1-1-R1.log
    jobid: 11
    benchmark: benchmarks/fastqc/S1-1-R1.bench.log
    wildcards: sample=S1, unit=1, id=R1

[Tue Jul 11 14:40:54 2023]
rule trim_reads_pe:
    input: /rhome/mrese001/bigdata/GRENEPIPE/Parents_fastq/2023-03-21_FL_parents/Data/e23vf658w3/Un_DTSA720/Project_DSRP_NIFA_PARENTS/NIFA_P01_S1_L004_R1_001.fastq.gz, /rhome/mrese001/b$
    output: trimmed/S3-3.1.fastq.gz, trimmed/S3-3.2.fastq.gz, trimmed/S3-3.1.unpaired.fastq.gz, trimmed/S3-3.2.unpaired.fastq.gz, trimmed/S3-3-pe.done
    log: logs/trimmomatic/S3-3.log
    jobid: 27
    benchmark: benchmarks/trimmomatic/S3-3.bench.log
    wildcards: sample=S3, unit=3
    threads: 6

[Tue Jul 11 14:40:54 2023]
rule trim_reads_pe:
    input: /rhome/mrese001/bigdata/GRENEPIPE/Parents_fastq/2023-03-21_FL_parents/Data/e23vf658w3/Un_DTSA720/Project_DSRP_NIFA_PARENTS/NIFA_35_S39_L004_R1_001.fastq.gz, /rhome/mrese001/b$
    output: trimmed/S1-1.1.fastq.gz, trimmed/S1-1.2.fastq.gz, trimmed/S1-1.1.unpaired.fastq.gz, trimmed/S1-1.2.unpaired.fastq.gz, trimmed/S1-1-pe.done
    log: logs/trimmomatic/S1-1.log
    jobid: 23
    benchmark: benchmarks/trimmomatic/S1-1.bench.log
    wildcards: sample=S1, unit=1
    threads: 6

Full Traceback (most recent call last):
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2293, in run_wrapper
    edit_notebook,
  File "/bigdata/seymourlab/mrese001/GRENEPIPE/rules/trimming-trimmomatic.smk", line 117, in __rule_trim_reads_pe
    raise Exception(
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/wrapper.py", line 144, in wrapper
    shadow_dir,
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/script.py", line 938, in script
    executor.evaluate()
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/script.py", line 313, in evaluate
    self.execute_script(fd.name, edit=edit)
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/script.py", line 506, in execute_script
    self._execute_cmd("{py_exec} {fname:q}", py_exec=py_exec, fname=fname)
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/script.py", line 354, in _execute_cmd
    **kwargs
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/shell.py", line 231, in __new__
    raise sp.CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'set -euo pipefail; /rhome/mrese001/.conda/envs/grenepipe/bin/python3.7 /bigdata/seymourlab/mrese001/GRENEPIPE/parents_draft/.snakemake/scripts/t$

[Tue Jul 11 14:40:56 2023]
Error in rule trim_reads_pe:
    jobid: 23
    output: trimmed/S1-1.1.fastq.gz, trimmed/S1-1.2.fastq.gz, trimmed/S1-1.1.unpaired.fastq.gz, trimmed/S1-1.2.unpaired.fastq.gz, trimmed/S1-1-pe.done
    log: logs/trimmomatic/S1-1.log (check log file(s) for error message)

Full Traceback (most recent call last):
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2293, in run_wrapper
    edit_notebook,
  File "/bigdata/seymourlab/mrese001/GRENEPIPE/rules/trimming-trimmomatic.smk", line 117, in __rule_trim_reads_pe
    raise Exception(
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/wrapper.py", line 144, in wrapper
    shadow_dir,
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/script.py", line 938, in script
    executor.evaluate()
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/script.py", line 313, in evaluate
    self.execute_script(fd.name, edit=edit)
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/script.py", line 506, in execute_script
    self._execute_cmd("{py_exec} {fname:q}", py_exec=py_exec, fname=fname)
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/script.py", line 354, in _execute_cmd
    **kwargs
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/shell.py", line 231, in __new__
    raise sp.CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'set -euo pipefail; /rhome/mrese001/.conda/envs/grenepipe/bin/python3.7 /bigdata/seymourlab/mrese001/GRENEPIPE/parents_draft/.snakemake/scripts/t$

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 568, in _callback
    raise ex
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 554, in cached_or_run
    run_func(*args)
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2359, in run_wrapper
    ex, lineno, linemaps=linemaps, snakefile=file, show_traceback=True
snakemake.exceptions.RuleException: CalledProcessError in line 72 of /bigdata/seymourlab/mrese001/GRENEPIPE/rules/trimming-trimmomatic.smk:
Command 'set -euo pipefail; /rhome/mrese001/.conda/envs/grenepipe/bin/python3.7 /bigdata/seymourlab/mrese001/GRENEPIPE/parents_draft/.snakemake/scripts/tmpuw1eg9jj.wrapper.py' returned$
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2293, in run_wrapper
  File "/bigdata/seymourlab/mrese001/GRENEPIPE/rules/trimming-trimmomatic.smk", line 72, in __rule_trim_reads_pe

RuleException:
CalledProcessError in line 72 of /bigdata/seymourlab/mrese001/GRENEPIPE/rules/trimming-trimmomatic.smk:
Command 'set -euo pipefail; /rhome/mrese001/.conda/envs/grenepipe/bin/python3.7 /bigdata/seymourlab/mrese001/GRENEPIPE/parents_draft/.snakemake/scripts/tmpuw1eg9jj.wrapper.py' returned$
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2293, in run_wrapper
  File "/bigdata/seymourlab/mrese001/GRENEPIPE/rules/trimming-trimmomatic.smk", line 72, in __rule_trim_reads_pe
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 568, in _callback
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/concurrent/futures/thread.py", line 57, in run
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 554, in cached_or_run
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2359, in run_wrapper

Full Traceback (most recent call last):
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2293, in run_wrapper
    edit_notebook,
  File "/bigdata/seymourlab/mrese001/GRENEPIPE/rules/trimming-trimmomatic.smk", line 117, in __rule_trim_reads_pe
    raise Exception(
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/wrapper.py", line 144, in wrapper
    shadow_dir,
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/script.py", line 938, in script
    executor.evaluate()
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/script.py", line 313, in evaluate
    self.execute_script(fd.name, edit=edit)
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/script.py", line 506, in execute_script
    self._execute_cmd("{py_exec} {fname:q}", py_exec=py_exec, fname=fname)
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/script.py", line 354, in _execute_cmd
    **kwargs
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/shell.py", line 231, in __new__
    raise sp.CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'set -euo pipefail; /rhome/mrese001/.conda/envs/grenepipe/bin/python3.7 /bigdata/seymourlab/mrese001/GRENEPIPE/parents_draft/.snakemake/scripts/t$

[Tue Jul 11 14:40:56 2023]
Error in rule trim_reads_pe:
    jobid: 27
    output: trimmed/S3-3.1.fastq.gz, trimmed/S3-3.2.fastq.gz, trimmed/S3-3.1.unpaired.fastq.gz, trimmed/S3-3.2.unpaired.fastq.gz, trimmed/S3-3-pe.done
    log: logs/trimmomatic/S3-3.log (check log file(s) for error message)

Full Traceback (most recent call last):
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2293, in run_wrapper
    edit_notebook,
  File "/bigdata/seymourlab/mrese001/GRENEPIPE/rules/trimming-trimmomatic.smk", line 117, in __rule_trim_reads_pe
    raise Exception(
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/wrapper.py", line 144, in wrapper
    shadow_dir,
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/script.py", line 938, in script
    executor.evaluate()
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/script.py", line 313, in evaluate
    self.execute_script(fd.name, edit=edit)
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/script.py", line 506, in execute_script
    self._execute_cmd("{py_exec} {fname:q}", py_exec=py_exec, fname=fname)
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/script.py", line 354, in _execute_cmd
    **kwargs
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/shell.py", line 231, in __new__
    raise sp.CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'set -euo pipefail; /rhome/mrese001/.conda/envs/grenepipe/bin/python3.7 /bigdata/seymourlab/mrese001/GRENEPIPE/parents_draft/.snakemake/scripts/t$

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 568, in _callback
    raise ex
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 554, in cached_or_run
    run_func(*args)
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2359, in run_wrapper
    ex, lineno, linemaps=linemaps, snakefile=file, show_traceback=True
snakemake.exceptions.RuleException: CalledProcessError in line 72 of /bigdata/seymourlab/mrese001/GRENEPIPE/rules/trimming-trimmomatic.smk:
Command 'set -euo pipefail; /rhome/mrese001/.conda/envs/grenepipe/bin/python3.7 /bigdata/seymourlab/mrese001/GRENEPIPE/parents_draft/.snakemake/scripts/tmppa668ctn.wrapper.py' returned$
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2293, in run_wrapper
  File "/bigdata/seymourlab/mrese001/GRENEPIPE/rules/trimming-trimmomatic.smk", line 72, in __rule_trim_reads_pe

RuleException:
CalledProcessError in line 72 of /bigdata/seymourlab/mrese001/GRENEPIPE/rules/trimming-trimmomatic.smk:
Command 'set -euo pipefail; /rhome/mrese001/.conda/envs/grenepipe/bin/python3.7 /bigdata/seymourlab/mrese001/GRENEPIPE/parents_draft/.snakemake/scripts/tmppa668ctn.wrapper.py' returned$
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2293, in run_wrapper
  File "/bigdata/seymourlab/mrese001/GRENEPIPE/rules/trimming-trimmomatic.smk", line 72, in __rule_trim_reads_pe
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 568, in _callback
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/concurrent/futures/thread.py", line 57, in run
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 554, in cached_or_run
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2359, in run_wrapper
Terminating processes on user request, this might take some time.

Full Traceback (most recent call last):
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2293, in run_wrapper
    edit_notebook,
  File "/bigdata/seymourlab/mrese001/GRENEPIPE/rules/qc-fastq.smk", line 112, in __rule_fastqc
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/script.py", line 938, in script
    executor.evaluate()
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/script.py", line 313, in evaluate
    self.execute_script(fd.name, edit=edit)
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/script.py", line 506, in execute_script
    self._execute_cmd("{py_exec} {fname:q}", py_exec=py_exec, fname=fname)
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/script.py", line 354, in _execute_cmd
    **kwargs
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/shell.py", line 231, in __new__
    raise sp.CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'set -euo pipefail; /rhome/mrese001/.conda/envs/grenepipe/bin/python3.7 /bigdata/seymourlab/mrese001/GRENEPIPE/parents_draft/.snakemake/scripts/t$

[Tue Jul 11 14:42:25 2023]
Error in rule fastqc:
    jobid: 11
    output: qc/fastqc/S1-1-R1_fastqc.html, qc/fastqc/S1-1-R1_fastqc.zip
    log: logs/fastqc/S1-1-R1.log (check log file(s) for error message)

Full Traceback (most recent call last):
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2293, in run_wrapper
    edit_notebook,
  File "/bigdata/seymourlab/mrese001/GRENEPIPE/rules/qc-fastq.smk", line 112, in __rule_fastqc
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/script.py", line 938, in script
    executor.evaluate()
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/script.py", line 313, in evaluate
    self.execute_script(fd.name, edit=edit)
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/script.py", line 506, in execute_script
    self._execute_cmd("{py_exec} {fname:q}", py_exec=py_exec, fname=fname)
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/script.py", line 354, in _execute_cmd
    **kwargs
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/shell.py", line 231, in __new__
    raise sp.CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'set -euo pipefail; /rhome/mrese001/.conda/envs/grenepipe/bin/python3.7 /bigdata/seymourlab/mrese001/GRENEPIPE/parents_draft/.snakemake/scripts/t$

During handling of the above exception, another exception occurred:

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 568, in _callback
    raise ex
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 554, in cached_or_run
    run_func(*args)
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2359, in run_wrapper
    ex, lineno, linemaps=linemaps, snakefile=file, show_traceback=True
snakemake.exceptions.RuleException: CalledProcessError in line 90 of /bigdata/seymourlab/mrese001/GRENEPIPE/rules/qc-fastq.smk:
Command 'set -euo pipefail; /rhome/mrese001/.conda/envs/grenepipe/bin/python3.7 /bigdata/seymourlab/mrese001/GRENEPIPE/parents_draft/.snakemake/scripts/tmpb_94f3g4.fastqc.py' returned $
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2293, in run_wrapper
  File "/bigdata/seymourlab/mrese001/GRENEPIPE/rules/qc-fastq.smk", line 90, in __rule_fastqc

RuleException:
CalledProcessError in line 90 of /bigdata/seymourlab/mrese001/GRENEPIPE/rules/qc-fastq.smk:
Command 'set -euo pipefail; /rhome/mrese001/.conda/envs/grenepipe/bin/python3.7 /bigdata/seymourlab/mrese001/GRENEPIPE/parents_draft/.snakemake/scripts/tmpb_94f3g4.fastqc.py' returned $
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2293, in run_wrapper
  File "/bigdata/seymourlab/mrese001/GRENEPIPE/rules/qc-fastq.smk", line 90, in __rule_fastqc
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 568, in _callback
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/concurrent/futures/thread.py", line 57, in run
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 554, in cached_or_run
  File "/rhome/mrese001/.conda/envs/grenepipe/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2359, in run_wrapper
Complete log: /bigdata/seymourlab/mrese001/GRENEPIPE/parents_draft/.snakemake/log/2023-07-11T144039.150457.snakemake.log
```
lczech commented 1 year ago

Hi @mrese001,

thanks for the clarifications!

> Will I need to specify a --profile when running the snakemake interactively?

That depends on what you want to achieve here. With --profile, snakemake will submit the steps of the grenepipe analysis as jobs to the cluster. Without it, snakemake runs everything "locally", which in the case of an interactive session means on the node where that session lives. If you just want to test the general setup with a small example or test dataset, running locally is a good way to do so. For big runs, however, where you actually want to leverage the power of the cluster, you want to submit the steps as jobs, so that more compute nodes can parallelize the work.
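
To make the difference concrete, here is a rough sketch of how the two invocations differ. The helper name, the run directory, and the profile path are made up for illustration, and I am assuming that the profile itself defines how jobs get submitted; only the snakemake flags (`--use-conda`, `--directory`, `--cores`, `--profile`) are real.

```python
import shlex
from typing import List, Optional


def build_snakemake_command(
    run_dir: str, profile: Optional[str] = None, cores: int = 4
) -> List[str]:
    """Assemble a snakemake call for a local run or a cluster (profile) run.

    `run_dir` is the directory holding the run's config (hypothetical here).
    """
    cmd = ["snakemake", "--use-conda", "--directory", run_dir]
    if profile is None:
        # Local mode: all steps run on the current node, capped at `cores`.
        cmd += ["--cores", str(cores)]
    else:
        # Cluster mode: the profile directory is assumed to describe how
        # steps are submitted as jobs (e.g. to slurm).
        cmd += ["--profile", profile]
    return cmd


if __name__ == "__main__":
    # Hypothetical paths, adjust to your own setup.
    for cmd in (
        build_snakemake_command("parents_draft"),
        build_snakemake_command("parents_draft", profile="profiles/slurm"),
    ):
        print(" ".join(shlex.quote(part) for part in cmd))
```

The printed commands are only meant to show which flag changes between the two modes; everything else about the call stays the same.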

As for the issues you are observing now: from what you posted, it is unclear what exactly is happening. It seems you are not using --profile here, meaning that there are no slurm log files to check. It could be that your interactive session simply does not have enough memory for certain jobs. I recommend checking the troubleshooting page for some more info on that. However, as you are running it "locally" on your interactive node, you will have to check in some other way whether there is enough memory, for example by opening a second terminal on that exact same node and running htop there. Please also ask your admins for support there. Alternatively, run it again with a slurm profile, and check the slurm error and out files for out-of-memory errors, as described in the link above.
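
If you want to search the log files for memory problems instead of reading them one by one, something along these lines might help. This is only a sketch: the list of phrases is my guess at what out-of-memory messages typically look like (slurm's `oom-kill` notice, Java's `OutOfMemoryError`, Python's `MemoryError`), and the function names and file suffixes are hypothetical.

```python
import re
from pathlib import Path
from typing import Dict, Iterable, List

# Phrases that often accompany out-of-memory failures. This list is an
# assumption and is certainly not exhaustive.
OOM_PATTERN = re.compile(
    r"oom[-_ ]?kill|out[ -]?of[ -]?memory|MemoryError|cannot allocate memory",
    re.IGNORECASE,
)


def find_oom_lines(lines: Iterable[str]) -> List[str]:
    """Return the lines that look like out-of-memory messages."""
    return [line.rstrip("\n") for line in lines if OOM_PATTERN.search(line)]


def scan_logs(log_dir: str, suffixes=(".log", ".out", ".err")) -> Dict[str, List[str]]:
    """Scan all log-like files below `log_dir` and collect suspicious lines."""
    hits: Dict[str, List[str]] = {}
    for path in sorted(Path(log_dir).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        with path.open(errors="replace") as handle:
            matched = find_oom_lines(handle)
        if matched:
            hits[str(path)] = matched
    return hits
```

For example, `scan_logs("logs")` could be run from within the run directory. An empty result does not prove that memory was fine, as a killed process may not leave any message in its own log.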

Furthermore, another potential source of errors is your module load commands prior to running grenepipe. As described here, this can be difficult. Those modules might overwrite the paths set by conda, leading to conflicting versions, which can cause errors such as the ones shown above. Is there a reason you are doing that? Try leaving them out, and just rely on conda/mamba for the package management.
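
One way to see whether the modules shadow the conda paths is to check where the relevant programs are actually picked up from. A minimal sketch, assuming the grenepipe environment is active (so that `CONDA_PREFIX` is set); the function name and the list of tools are just examples:

```python
import os
import shutil
from typing import Dict, Iterable, Optional


def tools_outside_env(
    tools: Iterable[str], env_prefix: Optional[str] = None
) -> Dict[str, Optional[str]]:
    """Return tools whose first match on PATH is missing or outside the conda env.

    Maps each suspicious tool name to the path found on PATH (None if not found).
    """
    prefix = env_prefix or os.environ.get("CONDA_PREFIX")
    if not prefix:
        raise RuntimeError("No conda environment is active (CONDA_PREFIX is not set).")
    prefix = os.path.realpath(prefix)
    suspicious: Dict[str, Optional[str]] = {}
    for tool in tools:
        found = shutil.which(tool)  # first match on PATH, like `which`
        resolved = os.path.realpath(found) if found else None
        if resolved is None or not resolved.startswith(prefix + os.sep):
            suspicious[tool] = found
    return suspicious


if __name__ == "__main__":
    # Example tool names; extend as needed.
    for name, location in tools_outside_env(["python", "snakemake"]).items():
        print(f"{name}: {location or 'not found'} is not inside the conda environment")
```

Note that this only looks at the outer environment from which snakemake is started, not at the per-rule environments that snakemake creates with `--use-conda`.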

Cheers Lucas

lczech commented 1 year ago

Hi @mrese001,

any updates? If things are working now, shall we close this issue?

Cheers Lucas

mrese001 commented 1 year ago

Hey @lczech !

Thank you for checking in. I still have yet to figure it out and will need time to establish a consistently functional pipeline. I would like to use the cluster resources and need to read up on the proper way to set up a profile. Since I am under some time constraints I had to put a pin on Grenepipe for now but I hope to return to it later. Thank you for your continued help and support!

Mariano

lczech commented 1 year ago

Okay, take your time :-)

Let's close this issue then for now though, as I feel we have answered its original question. If you encounter more issues down the line, feel free to re-open or start a new one!

Cheers Lucas