zreitz / multismash

A workflow and scripts for large-scale antiSMASH analyses
GNU Affero General Public License v3.0

config.schema.yaml missing, possibly related to snakemake schema #19

Open intikhab opened 2 weeks ago

intikhab commented 2 weeks ago

Hi There,

multiSMASH is a welcome development. We installed it as a module, and while testing it complains about a FileNotFoundError, as shown below. Is this related to the standard snakemake schema YAML file?

Please see if you can help us understand the problem and possible fix.

Best Wishes, Intikhab

multismash config-example.yaml -n
Traceback (most recent call last):
  File "/ibex/sw/rl9c/multismash/0.4.0/rl9.1_conda/Miniconda3/envs/antismash/bin/multismash", line 8, in <module>
    sys.exit(main())
  File "/ibex/sw/rl9c/multismash/0.4.0/rl9.1_conda/Miniconda3/envs/antismash/lib/python3.10/site-packages/multismash.py", line 55, in main
    validate(configs, schema=str(schema))
  File "/ibex/sw/rl9c/multismash/0.4.0/rl9.1_conda/Miniconda3/envs/antismash/lib/python3.10/site-packages/snakemake/utils.py", line 68, in validate
    schema = _load_configfile(source, filetype="Schema")
  File "/ibex/sw/rl9c/multismash/0.4.0/rl9.1_conda/Miniconda3/envs/antismash/lib/python3.10/site-packages/snakemake/io.py", line 1694, in _load_configfile
    obj = open(configpath_or_obj, encoding="utf-8")
FileNotFoundError: [Errno 2] No such file or directory: '/ibex/sw/rl9c/multismash/0.4.0/rl9.1_conda/Miniconda3/envs/antismash/lib/python3.10/workflow/schema/config.schema.yaml'

zreitz commented 2 weeks ago

Hi Intikhab, I just fixed a similar error over the weekend. Can you please re-download the repository to get the latest version, 0.5.0, and try again? (there's no release yet, but the master branch is up to date)
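
Editor's note: the traceback shows the schema being looked up at lib/python3.10/workflow/schema/config.schema.yaml, outside site-packages, which suggests the path was built relative to a source checkout rather than the installed package. The thread does not show how 0.5.0 fixed this. A minimal sketch of one defensive approach, assuming a hypothetical layout in which the schema ships inside the package directory (the function name `locate_schema` is illustrative, not part of multiSMASH):

```python
from pathlib import Path


def locate_schema(package_dir: Path) -> Path:
    """Return the schema path under package_dir, or fail with a clear message.

    Assumption: the schema lives at <package_dir>/workflow/schema/config.schema.yaml.
    The real multiSMASH layout may differ.
    """
    schema = Path(package_dir) / "workflow" / "schema" / "config.schema.yaml"
    if not schema.is_file():
        raise FileNotFoundError(
            f"Schema not found at {schema}. "
            "The package may have been installed without its workflow files."
        )
    return schema
```

Inside a package, the caller would typically pass `Path(__file__).resolve().parent`, so the lookup follows the installed module rather than the current working directory.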

IBEXCluster commented 2 weeks ago

Thanks @zreitz for your help. We updated multiSMASH-0.5.0.

Successfully installed multiSMASH-0.5.0

However, the --dry-run failed because of the following error.

$ multismash -n example/config-example.yaml 
Traceback (most recent call last):
  File "/ibex/sw/rl9c/multismash/0.4.0/rl9.1_conda/Miniconda3/envs/antismash/bin/multismash", line 8, in <module>
    sys.exit(main())
  File "/ibex/sw/rl9c/multismash/0.4.0/rl9.1_conda/Miniconda3/envs/antismash/lib/python3.10/site-packages/multismash/multismash.py", line 61, in main
    with Path.open(args.configfile) as yml:
  File "/ibex/sw/rl9c/multismash/0.4.0/rl9.1_conda/Miniconda3/envs/antismash/lib/python3.10/pathlib.py", line 1119, in open
    return self._accessor.open(self, mode, buffering, encoding, errors,
AttributeError: 'str' object has no attribute '_accessor'

Any advice? Thanks in advance.

zreitz commented 2 weeks ago

Sorry about that, I forgot to test some recent changes in earlier versions of python. Does the example run successfully in the latest version, 0.5.1?
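
Editor's note: the traceback above comes from calling `Path.open` on the class with a plain string. `Path.open` is an instance method, so the string is passed as `self`, and on Python 3.10 the method then looks up `self._accessor`, which a `str` does not have. The actual change made in 0.5.1 is not shown in this thread. A minimal sketch of a version-independent form, with `read_config_text` as a hypothetical helper name:

```python
from pathlib import Path


def read_config_text(configfile) -> str:
    """Read a config file given either a str or a Path.

    Wrapping the argument in Path(...) first means .open() is always
    called on a real Path instance, whatever the Python version.
    """
    with Path(configfile).open(encoding="utf-8") as handle:
        return handle.read()
```

The built-in `open(configfile)` would work equally well here, since it accepts both strings and path objects.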

IBEXCluster commented 2 weeks ago

Dear @zreitz Many thanks for helping. I updated multiSMASH-0.5.1

Successfully installed multiSMASH-0.5.1

and it fixes all the issues. 👍

$ multismash -n example/config-example.yaml 
Running multiSMASH with 3 cores
3 gbff.gz files found
Building DAG of jobs...
Job stats:
job                 count
----------------  -------
all                     1
count_regions           1
run_antismash           3
tabulate_regions        1
total                   6

Thanks a lot for all your effort and support. 🙏 Maybe it's good to release this version for other users!!

intikhab commented 2 weeks ago

Hi Naga, and zreitz,

Thanks for updating the version. The dry run goes well. However, when I test a few FASTA files of test genomes, it fails as below:

$ multismash test.config.yaml
Running multiSMASH with 8 cores
9 fa files found
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 8
Rules claiming more threads will be scaled down.
Job stats:
job                 count
----------------  -------
all                     1
count_regions           1
run_antismash           9
tabulate_regions        1
total                  12

Select jobs to execute...

[Tue Nov 5 23:28:21 2024] ...

/usr/bin/bash: line 1: conda: command not found
Traceback (most recent call last):
...
subprocess.CalledProcessError: Command 'conda info --json' returned non-zero exit status 127.

It seems it is looking for conda, that is missing from this installation?

Additionally I note the following scripts are not available in the path:

tabulate_regions.py, count_regions.py

Please help.

Thanks, IA

zreitz commented 2 weeks ago

Maybe it's good to release this version for other users!!

Seems like I'd better wait until you all are happy ;) I appreciate the bug reports; your installation is slightly different from mine and it's revealing some of my oversights.

It seems it is looking for conda, that is missing from this installation?

It was looking for conda even when it's unnecessary. I've changed that now!
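
Editor's note: the change itself is not shown in the thread. As a rough sketch of the idea (only require conda when the configuration actually asks for a separate environment), assuming the config keys `antismash_conda_env_name` and `run_bigscape` quoted elsewhere in this thread and hypothetical helper names `needs_conda` and `check_conda`:

```python
import shutil


def needs_conda(config: dict) -> bool:
    """Return True only if some step must run in a separate conda environment."""
    # antiSMASH lives in another environment only when a name is given.
    if config.get("antismash_conda_env_name"):
        return True
    # Per the config comments, BiG-SCAPE uses a conda environment when enabled.
    return bool(config.get("run_bigscape"))


def check_conda(config: dict, which=shutil.which) -> None:
    """Raise a readable error if conda is required but not on PATH."""
    if needs_conda(config) and which("conda") is None:
        raise RuntimeError(
            "This configuration needs conda, but 'conda' was not found on PATH."
        )
```

The `which` parameter exists only so the check can be exercised without a real conda installation.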

Additionally I note the following scripts are not available in the path: tabulate_regions.py, count_regions.py

That's correct, but I suppose there's no reason not to add them. I will reply here when I've done so. In the meantime, please just use python path/to/tabulate_regions.py <args>

Thanks for using multiSMASH, and let me know if there are any remaining problems!

intikhab commented 2 weeks ago

Hi All,

Multismash is now working but without bigscape.

We have big-scape/1.1.5, which is loaded via module load big-scape/1.1.5

When the Pfam dir path is provided in the config file, the bigscape flag is set to true, and bigscape is available in the environment, multismash throws the following error about mamba.

Running multiSMASH with 32 cores
9 fa files found
Building DAG of jobs...
CreateCondaEnvironmentException: The 'mamba' command is not available in the shell /usr/bin/bash that will be used by Snakemake. You have to ensure that it is in your PATH, e.g., first activating the conda base environment with conda activate base. The mamba package manager (https://github.com/mamba-org/mamba) is a fast and robust conda replacement. It is the recommended way of using Snakemake's conda integration. It can be installed with conda install -n base -c conda-forge mamba. If you still prefer to use conda, you can enforce that by setting --conda-frontend conda.

Any suggestions to fix this error?

Best, IA
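
Editor's note: the error message names two ways out, putting mamba on PATH or setting --conda-frontend conda. A small sketch of how one might check which frontend is available before launching; `pick_conda_frontend` is a hypothetical helper, and how multiSMASH forwards flags to Snakemake is not shown in this thread:

```python
import shutil


def pick_conda_frontend(which=shutil.which):
    """Return 'mamba' if available, else 'conda', else None."""
    for frontend in ("mamba", "conda"):
        if which(frontend) is not None:
            return frontend
    return None
```

If this returns "conda", the --conda-frontend conda setting quoted in the error message is the relevant one; if it returns None, neither tool is visible to the shell Snakemake will use.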

zreitz commented 2 weeks ago

There's still something about your installation that I haven't accounted for yet. Do you have antismash, bigscape, and multismash all installed within the same environment?

intikhab commented 2 weeks ago

Hi Intikhab, mamba has also been added to the multismash module. Please try again.

Thanks and Regards, Naga

IBEXCluster commented 2 weeks ago

@intikhab I have also added mamba to the multismash module. Please try to rerun.

intikhab commented 2 weeks ago

Hi Naga,

At the bigscape step it fails, asking for the environment location for big-scape:

EnvironmentLocationNotFound: Not a conda environment: /....big-scape/1.1.5

/usr/bin/bash: line 1: bigscape.py: command not found
[Thu Nov 7 11:05:05 2024]
Error in rule run_bigscape:
    jobid: 13
    input: ...
    output: ..multismash/bigscape/index.html
    conda-env: big-scape/1.1.5
    shell:
        bigscape.py -i ../multismash/antismash -o ../multismash/bigscape -c 32 --pfam_dir /sw/csi/big-scape/nov26/el7_python3/Pfam --mibig --mix --no_classify --include_singletons --clans-off --cutoffs 0.5
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Exiting because a job execution failed. Look above for error message

Perhaps big-scape needs to be in the same environment as multismash?

Intikhab

zreitz commented 2 weeks ago

No, it doesn't. You still haven't told me your installation set up. I'm probably going to have to add an option in the configuration file, but I can't do that unless I understand your system.

What kind of environment are you using? Do you have antismash, bigscape, and multismash all installed within the same environment or do you have to call a command to switch between environments?
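
Editor's note: one quick way to answer this kind of question is to report which of the relevant commands are visible on PATH in the shell where multiSMASH is launched. A minimal sketch; the command names are taken from this thread and `report_tools` is a hypothetical helper:

```python
import shutil


def report_tools(names, which=shutil.which):
    """Map each command name to its resolved path, or None if not on PATH."""
    return {name: which(name) for name in names}


if __name__ == "__main__":
    tools = ["multismash", "antismash", "bigscape.py", "conda", "mamba"]
    for name, path in report_tools(tools).items():
        print(f"{name}: {path or 'not found on PATH'}")
```

Running it once with only the multismash module loaded, and once after also loading the big-scape module, would show whether the tools share one environment or need a switch between them.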

intikhab commented 1 week ago

Hi Naga,

The last step in this multismash installation is to add big-slice to the same module.

Apart from the big-slice step, this multismash module is now working.

Are you able to add big-slice to the same module so that it works independently?

Here are the last lines in the config that require information about big-scape:

# -----------< Change these if you have a non-standard installation >-----------
# Only set this if antiSMASH is in a different environment from multiSMASH
antismash_conda_env_name:
antismash_command: antismash  # Or maybe python /path/to/run_antismash.py

# By default, a new BiG-SCAPE conda environment is automatically installed
# the first time multiSMASH is run with the flag [run_bigscape: True].
# If you already have a BiG-SCAPE environment that you want to use,
# put the environment name here.
bigscape_conda_env_name: big-scape/1.1.5
bigscape_command: bigscape.py  # Maybe "bigscape.py" for some versions

# BiG-SCAPE also requires a hmmpress'd Pfam database (Pfam-A.hmm plus .h3* files).
# By default, multiSMASH uses antiSMASH's Pfam directory. If antiSMASH isn't installed,
# or multiSMASH instructs you to do so, set this to the directory containing Pfam-A.hmm.
pfam_dir: /sw/csi/big-scape/nov26/el7_python3/Pfam  # Relative paths are relative to THIS file!

Many Thanks,

Intikhab

IBEXCluster commented 1 week ago

What kind of environment are you using? Do you have antismash, bigscape, and multismash all installed within the same environment or do you have to call a command to switch between environments?

Dear @zreitz, many thanks for all your suggestions. We do have antismash and multismash in the same environment. For @intikhab's testing, we are trying to add bigslice to the same environment.

Are you able to add big-slice to the same module so that it works independently?

Dear @intikhab, I have added bigslice to the same environment. Please give it a try now.