Closed snystrom closed 4 years ago
Wow, thanks a lot. Will merge and test, see what I can do about custom alphabets.
This patch is definitely a non-trivial amount of work. Please let me know if you have any objection to being included in the DESCRIPTION as a contributor, otherwise I will add you once I'm ready to push the patch to Bioconductor.
Today is the feature freeze for the next Bioconductor release, so I will leave custom alphabet stuff be for now. I will revisit this once the current release process is finished. Also realized it would only be sensical to add custom alphabet support to write_meme() as well at that time.
Thanks for including me in the DESCRIPTION!
I think the biggest issue for supporting custom alphabets is that I can't find a clear definition on meme-suite.org for how those sections should be defined, the ALPHABET section in the spec was not informative.
Regardless, I think a solution could include double-checking the alen
entry once the custom alphabet was built (ie alen == alph.len
). Perhaps it might be as simple as using the first alen
number of letters from the alph
string my custom alphabet function currently detects? I am not sure. Could also pose a question to the the QA forum for clarification on this.
Addresses #6.
read_meme
can now parse .meme format files with DNA/RNA/AA-LIKE alphabets and throws an informative error message for custom alphabet types without a *-LIKE definition.Adds two unit tests for both new functions and two example files for testing.
This build passes all unit tests & builds on Ubuntu. See commit messages for thoughts on possible downstream errors as a consequence of changing how alph.len is calculated. I think this is handled by throwing an error with custom alphabets, but there might be something I'm not considering.