oxli-bio / oxli

k-mers and the like
BSD 3-Clause "New" or "Revised" License

Add histo for frequency counts #29

Closed: Adamtaranto closed this 1 day ago

Adamtaranto commented 4 days ago

Closes #23

New functions:

Example use:

import oxli
import pandas as pd

# Create new table
kct = oxli.KmerCountTable(ksize=3)

# Count some k-mers
kct.consume('AAAAA') # counts 'AAA' three times
kct.count('TTT') # counted as its reverse complement 'AAA', so 'AAA' is now 4
kct.count('AAC') # 'AAC' is now 1

# histo() returns a list of (frequency, count) tuples; zero=True includes frequencies with a count of 0
histo_output = kct.histo(zero=True) # [(0, 0), (1, 1), (2, 0), (3, 0), (4, 1)]

# Create a Pandas DataFrame from the list of tuples
df = pd.DataFrame(histo_output, columns=['Frequency', 'Count'])

print(df)
# Prints:
"""
   Frequency  Count
0          0      0
1          1      1
2          2      0
3          3      0
4          4      1
"""
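
For anyone reviewing the expected output, here is a rough pure-Python model of what the example computes. It is only a sketch and does not use oxli: `canonical`, `count_kmers` and `histo_sketch` are hypothetical helper names, picking the lexicographically smaller strand as the canonical k-mer is an assumption, and so is the behaviour shown for `zero=False` and for an empty table.

```python
from collections import Counter

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement.

    Assumption: this stands in for whatever canonical form oxli uses internally.
    """
    revcomp = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, revcomp)


def count_kmers(seqs, ksize):
    """Count canonical k-mers across all sequences."""
    counts = Counter()
    for seq in seqs:
        for i in range(len(seq) - ksize + 1):
            counts[canonical(seq[i:i + ksize])] += 1
    return counts


def histo_sketch(counts, zero=True):
    """Return sorted (frequency, number of k-mers with that frequency) tuples.

    Assumption: zero=False simply drops frequencies that no k-mer has,
    and an empty table gives an empty list.
    """
    freq_of_freq = Counter(counts.values())
    if not freq_of_freq:
        return []
    if zero:
        return [(f, freq_of_freq.get(f, 0)) for f in range(max(freq_of_freq) + 1)]
    return sorted(freq_of_freq.items())


# Same input as the example: 'AAAAA' consumed, then 'TTT' and 'AAC' counted once each.
counts = count_kmers(["AAAAA", "TTT", "AAC"], ksize=3)
print(histo_sketch(counts, zero=True))   # [(0, 0), (1, 1), (2, 0), (3, 0), (4, 1)]
print(histo_sketch(counts, zero=False))  # [(1, 1), (4, 1)]
```

With `zero=True` the sketch reproduces the list shown in the example, which is a quick sanity check on the documented output.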
Adamtaranto commented 4 days ago

Min and Max don't take any arguments, so they should probably be attributes, i.e. accessed as .min rather than called as .min()
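
To illustrate the attribute-style access suggested here, a minimal plain-Python sketch follows. `CountTableSketch` is a hypothetical stand-in, not the oxli class; it assumes min and max refer to the smallest and largest k-mer counts in the table, and returning 0 for an empty table is also an assumption.

```python
class CountTableSketch:
    """Hypothetical stand-in for a k-mer count table (not the oxli class)."""

    def __init__(self, counts):
        self._counts = dict(counts)

    @property
    def min(self):
        # Smallest count in the table; 0 for an empty table (assumption).
        return min(self._counts.values(), default=0)

    @property
    def max(self):
        # Largest count in the table; 0 for an empty table (assumption).
        return max(self._counts.values(), default=0)


table = CountTableSketch({"AAA": 4, "AAC": 1})
print(table.min, table.max)  # 1 4
```

The values are still computed on each access, so they stay current as counts change, but callers write `table.min` with no parentheses.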