An Efficient and Ergonomic Python Binding Library for BLAT
When conducting extensive queries, using the blat program of the BLAT suite can prove quite inefficient, especially if the queries aren't grouped but issued sporadically, interspersed among other tasks.
In general, the choice narrows down to either using blat or combining gfServer with gfClient.
Indeed, blat is a program that launches gfServer, conducts the sequence query via gfClient, and then terminates the server.
This approach is far from ideal when performing numerous ungrouped queries, since blat initializes and shuts down gfServer for each query, resulting in substantial overhead.
This overhead consists of the time the server needs to index the reference, which depends on the reference's size.
Indexing the human genome (hg38), for example, takes approximately five minutes.
A more efficient solution is to initialize gfServer once and invoke gfClient multiple times for the queries.
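The trade-off can be put in rough numbers. The sketch below is a back-of-the-envelope model, not a benchmark: the five-minute indexing time is the hg38 figure quoted above, while the two-second per-query time and the function names are assumptions chosen purely for illustration.

```python
INDEX_SECONDS = 5 * 60  # approximate hg38 indexing time quoted in the text
QUERY_SECONDS = 2       # ASSUMED per-query search time, for illustration only


def restart_per_query(n_queries: int) -> int:
    """Total seconds when the reference is re-indexed for every query."""
    return n_queries * (INDEX_SECONDS + QUERY_SECONDS)


def start_once(n_queries: int) -> int:
    """Total seconds when the reference is indexed once and then reused."""
    return INDEX_SECONDS + n_queries * QUERY_SECONDS


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(f"{n} queries: restart each time={restart_per_query(n)}s, "
              f"start once={start_once(n)}s")
```

Under these assumptions, 100 ungrouped queries cost 30,200 seconds when the server is restarted each time, versus 500 seconds with a single long-lived server.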
However, gfServer and gfClient are only accessible via the command line.
This necessitates managing system calls (for instance, subprocess or os.system), intermediate temporary files, and format conversion, further diminishing performance.
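For comparison, driving the command-line tools from Python looks roughly like the sketch below. It assumes gfServer and gfClient are installed and on PATH, and uses their positional forms (gfServer start host port file, and gfClient host port seqDir in.fa out.psl). The helper names, host, port, and file paths are hypothetical. Note that each query has to write a FASTA file, spawn a process, and read a PSL file back.

```python
import subprocess
import tempfile
from pathlib import Path


def server_start_cmd(host: str, port: int, two_bit: str) -> list[str]:
    """Build the argument list that starts gfServer on a .2bit reference."""
    return ["gfServer", "start", host, str(port), two_bit]


def client_cmd(host: str, port: int, seq_dir: str,
               query_fa: str, out_psl: str) -> list[str]:
    """Build the argument list for a single gfClient query."""
    return ["gfClient", host, str(port), seq_dir, query_fa, out_psl]


def query_via_cli(host: str, port: int, seq_dir: str, sequence: str) -> str:
    """Run one query through gfClient; requires an already running gfServer."""
    with tempfile.TemporaryDirectory() as tmp:
        query_fa = Path(tmp) / "query.fa"  # intermediate input file
        out_psl = Path(tmp) / "out.psl"    # intermediate output file
        query_fa.write_text(f">query\n{sequence}\n")
        subprocess.run(
            client_cmd(host, port, seq_dir, str(query_fa), str(out_psl)),
            check=True,
        )
        return out_psl.read_text()  # raw PSL text; parsing is a further step


if __name__ == "__main__":
    # Only prints the commands; nothing is executed here.
    print(server_start_cmd("localhost", 65000, "ref.2bit"))
    print(client_cmd("localhost", 65000, ".", "query.fa", "out.psl"))
```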
This is the gap PxBLAT fills.
It resolves the issues mentioned above while introducing handy features such as port retry, reuse of an already running server, and more.
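To illustrate what a port-retry feature generally involves, here is a minimal sketch using only the Python standard library. It is not PxBLAT's implementation; the function name find_free_port and its parameters are hypothetical.

```python
import socket


def find_free_port(host: str, start: int, retries: int = 10) -> int:
    """Return the first port in [start, start + retries] that can be bound."""
    last = min(start + retries, 65535)
    for port in range(start, last + 1):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
            try:
                probe.bind((host, port))
            except OSError:
                continue  # port is taken; try the next one
            return port  # the probe socket is closed on leaving the with block
    raise RuntimeError(f"no free port between {start} and {last}")


if __name__ == "__main__":
    print(find_free_port("127.0.0.1", 65000))
```

A probe like this is inherently racy, since another process can claim the port between the check and its actual use, so a robust server would retry on the real bind as well.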
PxBLAT aims for a seamless user experience. It operates independently without any external dependencies, monitors its status internally, and doesn't require you to worry about file formats. It also discards the need for intermediate files by doing all its operations in memory, ensuring speed and efficiency.
PxBLAT is scientific software, with a paper posted on bioRxiv and published in BMC Bioinformatics (2024). See the published version to read the paper:
@article{li2024pxblat,
author = {Li, Yangyang and Yang, Rendong},
title = {{PxBLAT: an efficient python binding library for BLAT}},
journal = {BMC Bioinf.},
volume = {25},
number = {1},
pages = {1--8},
year = {2024},
month = {12},
issn = {1471-2105},
publisher = {BioMed Central},
doi = {10.1186/s12859-024-05844-0}
}
Welcome to PxBLAT! To kickstart your journey and get the most out of this tool, we have prepared comprehensive documentation. Inside, you'll find detailed guides, examples, and all the necessary information to help you navigate and use PxBLAT effectively.
If you encounter any issues or if something is not clear in the documentation, do not hesitate to open an issue. We are here to help and appreciate your feedback for improving PxBLAT.
If PxBLAT has been beneficial to your projects or you appreciate the work put into it, consider leaving a ⭐️ Star on our GitHub repository. Your support means the world to us and motivates us to continue enhancing PxBLAT.
Let's embark on this journey together and make the most out of PxBLAT! Please see the documentation for details and more examples.
Contributions are always welcome! Please follow these steps:
1. Create a new branch with a descriptive name (e.g., new-feature-branch or bugfix-issue-123):
       git checkout -b new-feature-branch
2. Install the dependencies:
       poetry install
3. Run the tests:
       pytest -vlsx tests
4. Commit your changes with a clear message:
       git commit -m 'Implemented new feature.'
5. Push the branch to the remote:
       git push origin new-feature-branch
6. Open a pull request against the original project repository. In the pull request, describe the changes you've made and why they're necessary. The project maintainers will review your changes and provide feedback or merge them into the main branch.
PxBLAT is modified from blat and is distributed under the same license as blat. The source code and executables are freely available for academic, nonprofit, and personal use. Commercial licensing information is available on the Kent Informatics website (https://kentinformatics.com/).
Contributors: yangliz5, Joshua Zhuang