genotoul-bioinfo / Binette

A fast and accurate binning refinement tool to construct high quality MAGs from the output of multiple binning tools.
https://binette.readthedocs.io
MIT License

Provide Precomputed Protein Sequences #31

Closed JeanMainguy closed 3 days ago

JeanMainguy commented 3 days ago

This PR adds the ability to provide precomputed protein sequences as input, eliminating the need for Binette to perform gene prediction. This feature addresses the request in issue #30.

Usage

A new argument, --proteins, has been introduced. This should point to a FASTA file containing all predicted proteins for the assembly.

Requirements

The provided protein sequences must follow Pyrodigal’s naming convention: <contigID>_<GeneID>.
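To illustrate the naming convention, here is a minimal sketch of how a contig ID could be recovered from a protein ID. It is not Binette's actual code: the function names `contig_of` and `group_proteins_by_contig` are hypothetical, and the sketch assumes the gene ID is whatever follows the last underscore, so contig IDs that themselves contain underscores still parse correctly.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def contig_of(protein_id: str) -> str:
    """Return the contig part of a '<contigID>_<GeneID>' identifier.

    Assumption: the gene ID is the text after the LAST underscore.
    """
    contig_id, sep, gene_id = protein_id.rpartition("_")
    if not sep or not contig_id or not gene_id:
        raise ValueError(f"'{protein_id}' does not match <contigID>_<GeneID>")
    return contig_id


def group_proteins_by_contig(fasta_lines: Iterable[str]) -> Dict[str, List[str]]:
    """Map each contig ID to the protein IDs found in FASTA header lines."""
    proteins = defaultdict(list)
    for line in fasta_lines:
        if not line.startswith(">"):
            continue  # sequence line, not a header
        fields = line[1:].split()
        if not fields:
            raise ValueError("FASTA header without an identifier")
        protein_id = fields[0]  # drop any description after the ID
        proteins[contig_of(protein_id)].append(protein_id)
    return dict(proteins)
```

For example, the headers `>contig_1_1` and `>contig_1_2` would both be assigned to the contig `contig_1`.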

If a contig extracted from the protein file is not present in the provided contig file, the program will raise an error to ensure consistency between inputs.
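The consistency check described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from this PR: the function name `check_protein_contigs` is hypothetical, and the exception type and message that Binette actually uses may differ.

```python
from typing import Iterable


def check_protein_contigs(
    protein_contigs: Iterable[str], assembly_contigs: Iterable[str]
) -> None:
    """Raise if a contig seen in the protein file is absent from the contig file.

    Assumption: ValueError stands in for whatever error Binette really raises.
    """
    missing = set(protein_contigs) - set(assembly_contigs)
    if missing:
        # Show only a few names so the message stays readable on large inputs.
        preview = ", ".join(sorted(missing)[:5])
        raise ValueError(
            f"{len(missing)} contig(s) from the protein file are absent "
            f"from the contig file: {preview}"
        )
```

Note that the check is one-directional: contigs with no predicted protein are allowed, but a protein pointing to an unknown contig is not.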