harvardinformatics / snpArcher

Snakemake workflow for highly parallel variant calling designed for ease-of-use in non-model organisms.
MIT License

Error in rule create_cov_bed: ModuleNotFoundError: No module named 'pyd4' #144

Closed Erythroxylum closed 4 months ago

Erythroxylum commented 6 months ago

Hello, I just set up snakemake and ran the ecoli test command on a login node and then again on the test partition of the FASRC cluster. The error is:

Traceback (most recent call last):
  File "/n/holyscratch01/davis_lab/dwhite/snpArcher/.test/ecoli/.snakemake/scripts/tmpb53k8wei.create_coverage_bed.py", line 5, in <module>
    from pyd4 import D4File,D4Builder
ModuleNotFoundError: No module named 'pyd4'
[Wed Dec 13 10:30:55 2023]
Error in rule create_cov_bed:
    jobid: 14
    input: results/GCA_000008865.2/summary_stats/all_cov_sumstats.txt, results/GCA_000008865.2/callable_sites/all_samples.d4
    output: results/GCA_000008865.2/callable_sites/ecoli_test_callable_sites_cov.bed
    conda-env: /n/holyscratch01/davislab/dwhite/snpArcher/.test/ecoli/.snakemake/conda/e6e860e6c3364a4f25a50109432a6090

Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2023-12-13T091938.314670.snakemake.log

The .snakemake directory that should contain the complete log does not exist. Please let me know if there are any logs you would like to see. Thanks for your attention.

cademirch commented 6 months ago

What was your snakemake command? Did you include --use-conda?

Erythroxylum commented 6 months ago

Hi Cade, Yes I did:

snakemake -d .test/ecoli --cores 1 --use-conda

cademirch commented 6 months ago

Unfortunately I wasn't able to recreate this by pulling a fresh copy of the repo and running the test. Can you try that?

Erythroxylum commented 6 months ago

I had cloned the repo again and got the same error before posting this issue, but I will try again.

cademirch commented 6 months ago

Ah okay, sorry. Can you post the snakemake output when it tries to create the conda env?

Erythroxylum commented 6 months ago

Here is the main output. I will have to wait to rerun before I can get any more of the log files.

Erythroxylum commented 6 months ago

I am rebuilding the conda environment, and I see that I was activating the snakemake environment instead of the 'snparcher' environment, which could be an issue. In either case, the 'pyd4' package is not listed among the packages installed by the following command:

mamba create -c conda-forge -c bioconda -n snparcher snakemake

But I will try again in the snparcher env instead of the snakemake one.

tsackton commented 6 months ago

snakemake will build its own conda environment for each rule; all you need in the snakemake/snparcher env is snakemake itself.
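
Since the rule runs in its own environment, one way to narrow this down is to ask that environment's interpreter, rather than the outer snakemake/snparcher one, whether it can import pyd4. The following is a minimal Python sketch, assuming a Linux-style conda layout where the interpreter sits at `<prefix>/bin/python`. The helper names `env_python` and `can_import` are made up for illustration and are not part of Snakemake or snpArcher; the prefix would be the `conda-env:` path that Snakemake prints with the error.

```python
import subprocess
import sys
from pathlib import Path


def env_python(prefix: str) -> Path:
    """Path to the interpreter inside a conda env (assumes <prefix>/bin/python)."""
    return Path(prefix) / "bin" / "python"


def can_import(python_exe: str, module: str) -> bool:
    """Return True if running `python_exe -c "import <module>"` exits with status 0."""
    result = subprocess.run(
        [str(python_exe), "-c", f"import {module}"],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


if __name__ == "__main__":
    # Demo against the interpreter running this script. For the failing rule,
    # pass env_python("<conda-env path from the Snakemake log>") instead.
    print("pyd4 importable:", can_import(sys.executable, "pyd4"))
```

If the rule's interpreter reports that pyd4 is not importable, the problem is in how that per-rule environment was built, not in the environment snakemake was launched from.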

Erythroxylum commented 6 months ago

Hello again, The same error appeared. Here is the snakemake log:

tsackton commented 6 months ago

I think this has something to do with the mamba/conda setup. I cannot replicate this error either, even on the same cluster. Can you perhaps list all the commands you are running exactly as you typed them? Not just the snakemake command, but basically everything you type from when you log on to the cluster to when you get the error message?

Erythroxylum commented 6 months ago

Well, thank you both for your attention. I am not sure what changed, but it has now completed the job two times. You can close the issue. Here is the code:

srun --pty -p test --mem 10000 -c 1 -t 120 /bin/bash

module load python

mamba activate snparcher

snakemake -d .test/ecoli --cores 1 --use-conda

tsackton commented 6 months ago

Reopening this issue because I suspect this has to do with a conflict between the python version that is specified in https://github.com/harvardinformatics/snpArcher/blob/main/workflow/envs/cov_filter.yml and the python version the user builds snpArcher against.

I suspect we need to remove the python version specification from cov_filter.yml.

Dictionary2b commented 5 months ago

I had the same error on a previous attempt. It was solved by manually installing the pyd4 package into the env. The alternative is pinning the Python version to 3.10, since the default on my system was 3.11, where snparcher would have difficulty installing pyd4. I haven't tested the second solution.
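
To tell which Python version an environment actually resolved to, and whether pyd4 is visible to it, a short script along these lines can be run with that environment's interpreter. It is only a diagnostic sketch: `module_available` and `interpreter_report` are illustrative names, not part of snpArcher, and it does not install or pin anything.

```python
import importlib.util
import sys


def module_available(name: str) -> bool:
    """Return True if the running interpreter can locate top-level module `name`."""
    return importlib.util.find_spec(name) is not None


def interpreter_report(module: str = "pyd4") -> dict:
    """Collect the interpreter path, its major.minor version, and module visibility."""
    return {
        "executable": sys.executable,
        "python_version": f"{sys.version_info[0]}.{sys.version_info[1]}",
        "module": module,
        "available": module_available(module),
    }


if __name__ == "__main__":
    # Print one "key: value" line per field so the output is easy to paste into an issue.
    for key, value in interpreter_report().items():
        print(f"{key}: {value}")
```

Running it once in the outer snparcher environment and once in the cov_filter environment would show whether the two differ in Python version or in pyd4 availability.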

simonharnqvist commented 4 months ago

After a lot of swearing at the cluster, I finally got this to work - the trick (on my system) is to install pyd4 into the main snparcher environment, not just into the cov_filter one.

This might have something to do with the fact that I've had to install each of the snparcher environments manually - Snakemake insisted on trying (and failing) to re-create the environments when running offline, even after pre-installing with --use-conda --conda-create-envs-only.

Either way, it's an easy workaround and might hopefully help someone else with this issue.