protfasta - a robust parser for protein-based FASTA files.
For all documentation see https://protfasta.readthedocs.io/en/latest/.
For code see https://github.com/holehouse-lab/protfasta.
protfasta has been tested on Linux and macOS. It should also work on Windows, but we have not tested it there yet.
protfasta can be downloaded and installed directly from PyPI using pip:
pip install protfasta
If this has worked, the pfasta command-line tool should now be available:
pfasta --help
And you're done. This also means you can now import and use protfasta in your Python workflow:
import protfasta
# sequences is now a dictionary where keys are FASTA headers and values are sequences.
sequences = protfasta.read_fasta('inputfile.fasta')
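As a small illustration of working with the returned dictionary, the sketch below computes sequence lengths and picks out the longest record. To keep it runnable without an input file, a hand-written dictionary stands in for the output of read_fasta; the helper names (sequence_lengths, longest_record) and the example headers and sequences are hypothetical and not part of protfasta.

```python
# Stand-in for the dictionary returned by protfasta.read_fasta();
# the headers and sequences here are made up for illustration.
sequences = {
    "sp|EXAMPLE1|first hypothetical protein": "MKTAYIAKQR",
    "sp|EXAMPLE2|second hypothetical protein": "MSDNE",
}


def sequence_lengths(seqs):
    """Map each FASTA header to the length of its sequence."""
    return {header: len(seq) for header, seq in seqs.items()}


def longest_record(seqs):
    """Return the (header, sequence) pair with the longest sequence."""
    return max(seqs.items(), key=lambda item: len(item[1]))


lengths = sequence_lengths(sequences)
header, seq = longest_record(sequences)
print(f"{len(sequences)} sequences; longest is '{header}' ({len(seq)} residues)")
```

Because the keys are full FASTA headers, any per-record result built this way can be traced straight back to the original file.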
For bug reports or errors, please raise an issue on this GitHub repository (see the Issues tab at the top).
0.1.14 and 0.1.15 (October 2024) - Re-wrote build chain and versioning to use pyproject.toml and versioningit. protfasta should now support Python beyond 3.12. About bloody time. Added --version flag to pfasta.
0.1.13 (January 2023) - Added upper limit of Python 3.11 to accommodate clash between versioneer and Python 3.12. Ultimately we'll move to versioningit for release versioning (as we have done internally) but need to make sure we have a robust protocol for this switch and then do this for ALL tools.
0.1.12 (March 2023) - Integrated the check_header_parser flag via pull request from the amazing Friedlab! Added the append_to_fasta flag so you can append to an existing FASTA file (thanks Ryan!).
0.1.11 (Sept 17th 2022) - Re-wrote the code for checking duplicate sequences so that each check is O(1) instead of O(n) in the number of sequences (:-/), and added the convert-remove option for invalid_sequences.
0.1.9 (Sept 12th 2021) - Added robustness to whitespace in sequence files, which, bizarrely, was not previously present (i.e. whitespace is treated as an invalid residue type but can now be converted).
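To illustrate the kind of change described in the 0.1.11 entry, here is a minimal sketch of hash-based duplicate detection. Keeping already-seen sequences in a set makes each membership check O(1) on average, whereas scanning a list of previous sequences costs O(n) per check. This is not protfasta's actual implementation; the function name find_duplicate_sequences and the input format (a list of header/sequence pairs) are assumptions made for the example.

```python
def find_duplicate_sequences(records):
    """Return headers of records whose sequence already appeared earlier.

    records: iterable of (header, sequence) pairs (assumed format).
    """
    seen = set()       # set membership is O(1) on average
    duplicates = []
    for header, seq in records:
        if seq in seen:
            duplicates.append(header)
        else:
            seen.add(seq)
    return duplicates


# Made-up records for illustration only.
records = [
    ("protein_a", "MKTAYIAKQR"),
    ("protein_b", "MSDNE"),
    ("protein_c", "MKTAYIAKQR"),  # same sequence as protein_a
]
print(find_duplicate_sequences(records))
```

The first occurrence of each sequence is kept and only later repeats are reported, which is one reasonable policy among several.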
Copyright (c) 2020-2021, Alex Holehouse - Holehouse lab. protfasta is released under the MIT license. The codebase is well structured and relatively simple, lending itself to feature addition. We welcome pull requests, assuming contributed code maintains an appropriate level of clarity and robustness.
Many of the software-engineering tools and approaches used in the development of protfasta come from resources developed by the Molecular Sciences Software Institute.