scoring-matrices
Dependency-free, Cython-compatible scoring matrices to use with biological sequences.
Scoring matrices are matrices used to score the matches and mismatches between two characters at the same position in a sequence alignment. Some of these matrices are derived from substitution matrices, which use evolutionary modeling.
The scoring-matrices package is a dependency-free, batteries-included library to handle and distribute common substitution matrices. ScoringMatrix is a Cython class that can be inherited, and the matrix data can be accessed as either a raw pointer or a typed memoryview.
scoring-matrices can be installed directly from PyPI, which hosts pre-built wheels for the x86-64 architecture (Linux/OSX/Windows) and the Aarch64 architecture (Linux/OSX), as well as the code required to compile from source with Cython:
$ pip install scoring-matrices
Alternatively, scoring-matrices is available as a Bioconda package:
$ conda install bioconda::scoring-matrices
Import the ScoringMatrix class from the installed module:
from scoring_matrices import ScoringMatrix
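# Load one of the built-in matrices
blosum62 = ScoringMatrix.from_name("BLOSUM62")
# Get individual weights either by index or by alphabet letter
x = blosum62[0, 0]
y = blosum62['A', 'A']
# Get a row of the matrix either by index or by alphabet letter
row_x = blosum62[0]
row_y = blosum62['A']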
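As a minimal sketch of how these lookups might be used, the hypothetical helper below sums the pairwise weights of two equal-length, already aligned sequences (an ungapped alignment score). The names `ungapped_score` and `ToyMatrix` are illustrative and not part of the library. The helper only relies on the `matrix[a, b]` letter indexing shown above, so the toy stand-in keeps the example runnable without the package installed.

```python
class ToyMatrix:
    """Hypothetical stand-in exposing the same `matrix[a, b]` lookup."""

    def __init__(self, alphabet, rows):
        # Map each alphabet letter to its row/column index.
        self._index = {letter: i for i, letter in enumerate(alphabet)}
        self._rows = rows

    def __getitem__(self, key):
        a, b = key
        return self._rows[self._index[a]][self._index[b]]


def ungapped_score(matrix, seq1, seq2):
    """Sum the weights of the characters aligned at each position."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must have the same length")
    return sum(matrix[a, b] for a, b in zip(seq1, seq2))


# Toy match/mismatch weights (+1 / -1), not a real substitution matrix.
toy = ToyMatrix("ACGT", [
    [1.0, -1.0, -1.0, -1.0],
    [-1.0, 1.0, -1.0, -1.0],
    [-1.0, -1.0, 1.0, -1.0],
    [-1.0, -1.0, -1.0, 1.0],
])
score = ungapped_score(toy, "ACGT", "ACCT")  # three matches, one mismatch
```

With the library installed, passing `ScoringMatrix.from_name("BLOSUM62")` instead of `toy` should work the same way, provided both sequences only use letters of the matrix alphabet.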
Access the matrix weights as raw pointers to constant data:
from scoring_matrices cimport ScoringMatrix
cdef ScoringMatrix blosum = ScoringMatrix.from_name("BLOSUM62")
cdef const float* data = blosum.data_ptr() # dense array
cdef const float** matrix = blosum.matrix_ptr() # array of pointers
Access the ScoringMatrix weights as a typed memoryview using the buffer protocol in more recent versions of Python:
from scoring_matrices cimport ScoringMatrix
cdef ScoringMatrix blosum = ScoringMatrix.from_name("BLOSUM62")
cdef const float[:, :] weights = blosum
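The "dense array" and "array of pointers" views differ only in how the same weights are addressed. The pure-Python sketch below mimics them with lists, assuming the dense array stores the square matrix row by row (row-major). That layout is an assumption made for illustration and is not stated above, and the names `rows`, `flat`, and `weight_from_flat` are hypothetical.

```python
# Toy 3x3 weights standing in for a square scoring matrix.
size = 3
rows = [
    [2.0, -1.0, -1.0],
    [-1.0, 2.0, -1.0],
    [-1.0, -1.0, 2.0],
]

# "Array of pointers" view: one entry per row, indexed as rows[i][j].
# "Dense array" view: all rows concatenated (assumed row-major order).
flat = [weight for row in rows for weight in row]


def weight_from_flat(flat, size, i, j):
    """Look up the (i, j) weight in the flattened, row-major array."""
    return flat[i * size + j]


# Both views address the same weights.
assert all(
    weight_from_flat(flat, size, i, j) == rows[i][j]
    for i in range(size)
    for j in range(size)
)
```

Under this assumption, the dense pointer suits tight loops that compute offsets directly, while the array of pointers keeps the more readable `matrix[i][j]` form.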
Found a bug? Have an enhancement request? Head over to the GitHub issue tracker to report it or ask a question. If you are filing a bug report, please include as much information as you can about the issue, and try to recreate the same bug in a simple, easily reproducible situation.
Contributions are more than welcome! See CONTRIBUTING.md for more details.
This project adheres to Semantic Versioning and provides a changelog in the Keep a Changelog format.
This library is provided under the MIT License. Matrices were collected from the MMseqs2, Biopython and NCBI BLAST+ sources and are believed to be in the public domain.
This project was developed by Martin Larralde during his PhD project at the Leiden University Medical Center in the Zeller team.