Sheffield-iGEM / syn-zeug

A modern toolbox for synthetic biology
https://sheffield-igem.github.io/syn-zeug/
GNU Affero General Public License v3.0

Implement New Tool: "Percent Composition" #26

Open TheLostLambda opened 2 years ago

TheLostLambda commented 2 years ago

What should this tool do? Same functionality as .count_elements(), but returning percentages instead of raw counts and capable of filtering by alphabet.

Is there an existing reference implementation? Somewhat: the DNA Stats tool behaves like this when only looking at single bases.

What are the tool's inputs? Any sequence

What is the tool's output? A HashMap<char, f64> mapping each char to its percentage of the input sequence. The HashMap should contain only chars that are legal in the alphabet defined by SeqKind, and it should include every legal char, even those with 0% prevalence!

Other Implementation Details: Percentages can be calculated using count_elements() and len(). Filtering down to a map with all and only the correct chars for a given alphabet can be done by checking self.kind and calling ByteMap::to_hashmap() with .contains(). This is blocked on #20 for now!
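
A rough, standalone sketch of the calculation described above might look like the following. It does not use the real syn-zeug types: the signatures of count_elements(), ByteMap::to_hashmap() and SeqKind are not shown in this issue, so the function name percent_composition, the DNA_ALPHABET constant and the plain byte-slice alphabet are all hypothetical stand-ins. Returning 0% for every char when the input is empty is also an assumption, chosen to avoid dividing by zero.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the alphabet that would be looked up from `SeqKind`.
pub const DNA_ALPHABET: &[u8] = b"ACGT";

/// Returns, for every symbol in `alphabet`, the percentage of `seq` it makes up.
/// Symbols in `seq` that are not in `alphabet` still count toward the total
/// length but get no entry in the returned map.
pub fn percent_composition(seq: &[u8], alphabet: &[u8]) -> HashMap<char, f64> {
    // Tally every byte of the input; this plays the role of count_elements().
    let mut counts = [0usize; 256];
    for &b in seq {
        counts[b as usize] += 1;
    }

    let total = seq.len() as f64;

    // Build the map from the alphabet rather than from the counts, so that
    // legal symbols which never occur still appear with 0%.
    alphabet
        .iter()
        .map(|&b| {
            let pct = if seq.is_empty() {
                0.0 // assumption: an empty sequence reports 0% everywhere
            } else {
                100.0 * counts[b as usize] as f64 / total
            };
            (b as char, pct)
        })
        .collect()
}
```

Calling percent_composition(b"AACG", DNA_ALPHABET) would give a map with four entries, with T present at 0%. Iterating over the alphabet instead of the observed counts is what guarantees the "all and only the legal chars" property asked for above.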