kinow / kinoshita.eti.br

kinow website
https://kinoshita.eti.br

Blog about IUPAC codes #47

Closed by kinow 6 years ago

kinow commented 7 years ago

Maybe try to write a Java port of Biostrings using Apache Commons Text.

kinow commented 7 years ago

https://github.com/kinow/java-biostrings

kinow commented 6 years ago

http://www.chick.manchester.ac.uk/SiteSeer/IUPAC_codes.html

kinow commented 6 years ago

Not necessary to really blog about it.

kinow commented 6 years ago

In the end we can write the port of the basic classes without Commons Text :-)

kinow commented 6 years ago

The draft that I started...

The R Biostrings library contains several classes for manipulating strings
over different alphabets.

`XString` is the base class, with the following implementations:

* BString - used to store any string (the B stands for "Big")
* DNAString - used to store DNA sequence strings
* RNAString - used to store RNA sequence strings
* AAString - used to store amino acid sequence strings

You can store any string in a BString, but DNAString, RNAString, and
AAString each have a restricted alphabet that defines which characters a
sequence may contain.

The DNAString alphabet, for example, is A, C, G, T, M, R, W, S, Y, K, V, H,
D, B, N, plus - (the gap symbol), + (hard masking), and . (not a letter).

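The restricted-alphabet idea from the draft could be sketched in Java roughly as follows. This is only an illustration: `DnaSequence` is a hypothetical name, not a class from java-biostrings or Biostrings, and it assumes validation happens once in the constructor.

```java
/** Hypothetical sketch of a DNA string with a restricted alphabet. */
final class DnaSequence {
    // The 18 symbols listed in the draft: IUPAC nucleotide codes plus '-', '+' and '.'.
    private static final String ALPHABET = "ACGTMRWSYKVHDBN-+.";

    private final String value;

    DnaSequence(String value) {
        // Reject the whole input on the first character outside the alphabet.
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (ALPHABET.indexOf(c) < 0) {
                throw new IllegalArgumentException(
                        "Invalid DNA character '" + c + "' at index " + i);
            }
        }
        this.value = value;
    }

    int length() {
        return value.length();
    }

    @Override
    public String toString() {
        return value;
    }
}
```

With this sketch, `new DnaSequence("ACGU")` throws, since U belongs to the RNA alphabet rather than the DNA one.
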
kinow commented 6 years ago

Using the AlphabetConverter or similar classes would add unnecessary overhead.
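
For an alphabet this small, a fixed lookup table indexed by character is probably enough. A minimal sketch, assuming only the standard library; `IupacCodes`, `expand` and `matches` are hypothetical names used for illustration, and the table covers only the 15 IUPAC nucleotide codes:

```java
/** Hypothetical sketch: IUPAC nucleotide codes via a plain array lookup. */
final class IupacCodes {
    // Indexed by ASCII code; entries left null are not IUPAC nucleotide codes.
    private static final String[] EXPANSIONS = new String[128];

    static {
        EXPANSIONS['A'] = "A";
        EXPANSIONS['C'] = "C";
        EXPANSIONS['G'] = "G";
        EXPANSIONS['T'] = "T";
        EXPANSIONS['M'] = "AC";
        EXPANSIONS['R'] = "AG";
        EXPANSIONS['W'] = "AT";
        EXPANSIONS['S'] = "CG";
        EXPANSIONS['Y'] = "CT";
        EXPANSIONS['K'] = "GT";
        EXPANSIONS['V'] = "ACG";
        EXPANSIONS['H'] = "ACT";
        EXPANSIONS['D'] = "AGT";
        EXPANSIONS['B'] = "CGT";
        EXPANSIONS['N'] = "ACGT";
    }

    private IupacCodes() {
    }

    /** Returns the bases an IUPAC code stands for, e.g. R gives "AG". */
    static String expand(char code) {
        String bases = code < EXPANSIONS.length ? EXPANSIONS[code] : null;
        if (bases == null) {
            throw new IllegalArgumentException("Not an IUPAC nucleotide code: " + code);
        }
        return bases;
    }

    /** True if the concrete base is one of those covered by the code. */
    static boolean matches(char code, char base) {
        return expand(code).indexOf(base) >= 0;
    }
}
```

Here `IupacCodes.matches('N', 'T')` is true and `IupacCodes.matches('Y', 'A')` is false, with no dependency beyond the JDK.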