bjmt / universalmotif

Motif manipulation functions for R.
GNU General Public License v3.0
25 stars 8 forks source link

Read meme alphabet additions #7

Closed snystrom closed 4 years ago

snystrom commented 4 years ago

Addresses #6. read_meme can now parse .meme format files with DNA/RNA/AA-LIKE alphabets and throws an informative error message for custom alphabet types without a *-LIKE definition.

Adds two unit tests for both new functions and two example files for testing.

This build passes all unit tests & builds on Ubuntu. See commit messages for thoughts on possible downstream errors as a consequence of changing how alph.len is calculated. I think this is handled by throwing an error with custom alphabets, but there might be something I'm not considering.

bjmt commented 4 years ago

Wow, thanks a lot. Will merge and test, see what I can do about custom alphabets.

This patch is definitely a non-trivial amount of work. Please let me know if you have any objection to being included in the DESCRIPTION as a contributor, otherwise I will add you once I'm ready to push the patch to Bioconductor.

bjmt commented 4 years ago

Today is the feature freeze for the next Bioconductor release, so I will leave custom alphabet stuff be for now. I will revisit this once the current release process is finished. Also realized it would only be sensical to add custom alphabet support to write_meme() as well at that time.

snystrom commented 4 years ago

Thanks for including me in the DESCRIPTION!

I think the biggest issue for supporting custom alphabets is that I can't find a clear definition on meme-suite.org for how those sections should be defined, the ALPHABET section in the spec was not informative.

Regardless, I think a solution could include double-checking the alen entry once the custom alphabet was built (ie alen == alph.len). Perhaps it might be as simple as using the first alen number of letters from the alph string my custom alphabet function currently detects? I am not sure. Could also pose a question to the the QA forum for clarification on this.

snystrom commented 4 years ago

Ah, found the custom alphabet definition: here