Open Mitmischer opened 4 months ago
Hello @Mitmischer,
Thank you for working with seqlogo.
I understand that some of the design choices can be frustrating for some users. Can you show me the workflow that triggers the error for you?
As this is an open-source project, I wholeheartedly welcome PRs, so long as the code follows the Google Python Style Guide. As stated in the README.md, this project does not support the MEME format, but we welcome any additions.
That said, seqlogo was written to support BIOINF 529: Bioinformatics Concepts and Algorithms at the University of Michigan in the Department of Computational Medicine & Bioinformatics. It was intended as a purely educational tool for a very specific module of that course. That is why the tolerance was set so rigidly. We expected students to generate their input using the quickstart or similar means during their coursework. However, the tolerance could be exposed as an optional argument with a default value, which would give unexpected end-users more flexibility.
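As a minimal sketch of that idea (the function name `check_row_sums`, its signature, and the list-of-rows input are assumptions for illustration, not seqlogo's actual API), the tolerance could become a keyword argument whose default keeps the current strict behavior:

```python
import math


def check_row_sums(matrix, tol=1e-9):
    """Return True if every row of `matrix` sums to 1 within `tol`.

    Hypothetical helper: `matrix` is a list of rows, each row holding the
    per-letter probabilities for one position. The default of 1e-9 mirrors
    the strict threshold discussed in this issue.
    """
    return all(
        math.isclose(sum(row), 1.0, rel_tol=0.0, abs_tol=tol)
        for row in matrix
    )


# Probabilities rounded to three decimals: the first row sums to 0.999.
rounded = [[0.333, 0.333, 0.333, 0.0], [0.25, 0.25, 0.25, 0.25]]

strict = check_row_sums(rounded)             # default tolerance rejects it
lenient = check_row_sums(rounded, tol=1e-2)  # relaxed tolerance accepts it
print(strict, lenient)
```

Coursework input would still get the strict check by default, while users with rounded real-world data could opt in to a looser one.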
If this does not necessarily fit your needs, I can recommend BioPython's Bio.motifs package for parsing motif data and Logomaker for a more robust plotting platform of sequence motifs. Both of these are better suited for production-level analysis. The tradeoff is "ease of use" for the end-user.
Describe the bug
The program will fail on imperfect input data.
To Reproduce
Steps to reproduce the behavior:
1) Read the following probability matrices: consensus_pwms_stripped.txt
For example, consider adapting the code here (the CLI should be easy to remove).
The data itself is real(!) data, obtained from https://resources.altius.org/~jvierstra/projects/motif-clustering-v2.1beta/.
Expected behavior
The tool should be more lenient. The problem is in core.py:71. Why would you ever check against 1e-9 with floating-point numbers and the resulting imprecision? 1e-2 or something similar is far more reasonable. As I understand it, the check only makes sure that the matrix is not in a completely wrong format (such as transposed). A threshold of 1e-2 still catches that case easily.
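To illustrate the point, here is a small sketch with made-up numbers (not taken from consensus_pwms_stripped.txt; the helper `max_row_deviation` is hypothetical). It compares how far the row sums stray from 1 for a matrix rounded to three decimals and for a transposed matrix:

```python
def max_row_deviation(matrix):
    """Largest absolute difference between a row sum and 1."""
    return max(abs(sum(row) - 1.0) for row in matrix)


# One position whose probabilities were rounded to three decimals
# (illustrative values); the row sums to 0.999 instead of 1.
rounded = [[0.257, 0.248, 0.249, 0.245]]

# A well-formed 3-position matrix (rows = positions, columns = A, C, G, T) ...
pwm = [
    [0.7, 0.1, 0.1, 0.1],
    [0.6, 0.2, 0.1, 0.1],
    [0.1, 0.1, 0.7, 0.1],
]
# ... and the same data in the wrong orientation (rows = letters).
transposed = [list(column) for column in zip(*pwm)]

dev_rounded = max_row_deviation(rounded)
dev_transposed = max_row_deviation(transposed)
print(dev_rounded, dev_transposed)
```

The rounded matrix misses 1 by about 1e-3, so it fails a 1e-9 check but passes a 1e-2 check, while the transposed matrix is off by tenths and fails either way. In general, a transposed matrix with L positions has four rows whose sums total L, so unless L is 4 at least one row sum must differ from 1.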