akcorut / kGWASflow

kGWASflow is a Snakemake workflow for performing k-mers-based GWAS.
https://github.com/akcorut/kGWASflow/wiki
MIT License

Error in rule merge_kmers #5

Closed jkitony closed 1 year ago

jkitony commented 1 year ago

I get the error below while running your workflow. Is it related to computing resources, my samples, or a bug in the scripts? Kindly let me know how to handle it. Thanks.

    output: results/kmers_count/SbicPI554650_PH449/kmers_with_strand, results/kmers_count/SbicPI554650_PH449/kmers_add_strand_information.done

    Error in rule merge_kmers:
        jobid: 234
        input: results/kmers_count/SbicPI554650_PH449/output_kmc_canon.kmc_suf, results/kmers_count/SbicPI554650_PH449/output_kmc_canon.kmc_pre, results/kmers_count/SbicPI554650_PH449/output_kmc_all.kmc_suf, results/kmers_count/SbicPI554650_PH449/output_kmc_all.kmc_pre, results/kmers_count/SbicPI554650_PH449/kmc_canonical.done, results/kmers_count/SbicPI554650_PH449/kmc_non-canonical.done, scripts/external/kmers_gwas/bin
        output: results/kmers_count/SbicPI554650_PH449/kmers_with_strand, results/kmers_count/SbicPI554650_PH449/kmers_add_strand_information.done
        log: logs/count_kmers/kmc/SbicPI554650_PH449/addstrand.log.out (check log file(s) for error message)
        conda-env: /mnt/dev/Sbic/kGWASflow/.snakemake/conda/b8b1d34c68a758bf1dd7d09a808ce517
        shell:

            export LD_LIBRARY_PATH=$CONDA_PREFIX/lib

            scripts/external/kmers_gwas/bin/kmers_add_strand_information -c results/kmers_count/SbicPI554650_PH449/output_kmc_canon -n results/kmers_count/SbicPI554650_PH449/output_kmc_all -k 31 -o results/kmers_count/SbicPI554650_PH449/kmers_with_strand > logs/count_kmers/kmc/SbicPI554650_PH449/add_strand.log.out

            (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

    Shutting down, this might take some time.
    Exiting because a job execution failed. Look above for error message
    Error! The Snakemake workflow aborted.

akcorut commented 1 year ago

Hey @jkitony,

This is most likely a memory issue, since this is the most memory-intensive step of the pipeline. Could you increase the available memory or change the number of threads you use and try again?
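One way to check whether memory really is the culprit is to rerun the failing command by hand and record its peak memory. The snippet below is only a rough sketch, assuming a Linux or macOS machine; `peak_child_memory_mb` is a hypothetical helper (not part of kGWASflow), and the command in the example is a stand-in that you would replace with the shell command printed in the Snakemake error.

```python
import resource
import subprocess
import sys


def peak_child_memory_mb(cmd):
    """Run cmd and return (exit_code, peak_mb).

    peak_mb is the largest resident set size, in MB, of any child process
    this Python process has waited for so far (hypothetical helper).
    """
    proc = subprocess.run(cmd)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return proc.returncode, usage.ru_maxrss / divisor


if __name__ == "__main__":
    # Stand-in command: replace with the failing command from the error
    # message, split into a list of arguments.
    code, peak = peak_child_memory_mb(
        [sys.executable, "-c", "x = bytearray(50_000_000)"]
    )
    # On POSIX, a negative exit code -N means the process was terminated by
    # signal N; being killed by signal 9 is often, though not always, a sign
    # that the system ran out of memory.
    print(f"exit code {code}, peak memory ~{peak:.0f} MB")
```

If the reported peak is close to the memory available on the machine or in the job allocation, that supports the memory explanation.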

Let me know if that fixes the issue.

Best, Kivanc

jkitony commented 1 year ago

Yes, it worked with increased memory, thanks. However, the Manhattan plots were not generated, so I am running it again with more datasets. By the way, please point me to the documentation if it is available. I want to know: 1. how missing phenos can be presented (i.e., NA), 2. whether categorical phenos can be analyzed, and 3. how to adjust the significance threshold in the workflow.

akcorut commented 1 year ago

Dear @jkitony,

Sorry for missing this comment. I have been working on the further development of the pipeline. The latest version of kGWASflow (v1.2.3, https://github.com/akcorut/kGWASflow/releases/tag/v1.2.3) has been released and is available on Bioconda (https://anaconda.org/bioconda/kgwasflow). Please let me know if you are still having issues with Manhattan plots.

  1. How missing phenos can be presented (i.e., NA)

You can remove the samples with missing phenotypes from the phenotype file (pheno_name.pheno) and then run the analysis.

  2. whether categorical phenos can be analyzed

Both categorical and quantitative phenotypes can be analyzed with this pipeline.

  3. adjusting significance threshold in the workflow.

There are only two significance thresholds in the kmersGWAS step: the 5% and 10% family-wise error rate thresholds. If you would like a results table based on the 10% family-wise error rate threshold, you can activate it in the config file: https://github.com/akcorut/kGWASflow/blob/39cbb1afff7f08e89ae6d824bd605d583ef7fb7b/config/config.yaml#L261-L268
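For the first point (dropping samples with missing phenotypes), a minimal sketch of how the phenotype file could be filtered is shown here. It assumes the file is tab-separated with a header row and the phenotype value in the second column (for example `accession_id` and `phenotype_value`), and that missing values are written as `NA`, `NaN`, `.` or left empty. The function name and the set of missing markers are illustrative assumptions, not part of kGWASflow.

```python
import csv
import io

# Assumed markers for a missing phenotype value; adjust to your own data.
MISSING = {"", "NA", "NaN", "nan", "."}


def drop_missing_phenotypes(src, dst):
    """Copy tab-separated rows from src to dst, skipping samples whose
    second column is missing. Returns the number of dropped samples."""
    reader = csv.reader(src, delimiter="\t")
    writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
    writer.writerow(next(reader))  # keep the header row unchanged
    dropped = 0
    for row in reader:
        if not row:
            continue  # ignore blank lines
        if len(row) < 2 or row[1].strip() in MISSING:
            dropped += 1
            continue
        writer.writerow(row)
    return dropped


if __name__ == "__main__":
    # Made-up example data, for illustration only.
    example = "accession_id\tphenotype_value\nS1\t12.5\nS2\tNA\nS3\t9.1\n"
    out = io.StringIO()
    n = drop_missing_phenotypes(io.StringIO(example), out)
    print(f"dropped {n} sample(s)")
    print(out.getvalue(), end="")
```

With real files, the same function can be called on two handles opened with `open(path, newline="")`, writing the filtered table to a new .pheno file rather than overwriting the original.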

Let me know if you have any other questions.

Kivanc

jkitony commented 1 year ago

Thanks; I will give it another shot.


akcorut commented 1 year ago

Great to hear. I'm closing this issue for now but feel free to open a new one anytime you have any questions or problems running the pipeline.

Thanks.