rotary-genomics / rotary

Assembly/annotation workflow for Nanopore-based microbial genome data containing circular DNA elements
BSD 3-Clause "New" or "Revised" License

QC before/after comparison: allow Kbp and Gbp values in Total Bases column #201

Open LeeBergstrand opened 5 days ago

LeeBergstrand commented 5 days ago

Refactor the functions in `qc.py` for improved readability. Larger datasets showed that the Total Bases column contains Kbp, Gbp, and Mbp strings, not just Mbp (the example below also shows plain bp and mixed letter case). Update the functions in `qc.py` to strip all of these unit strings from the Total Bases column and convert every value to a numeric Mbp column.

| Total Bases | Total Bases (Mbp) |
| --- | --- |
| 125 Kbp | 0.125000 |
| 1.2 Gbp | 1200.000000 |
| 20 mbp | 20.000000 |
| 5 bp | 0.000005 |
| 6 Mbp | 6.000000 |
| 100 BP | 0.000100 |
| 2 GBp | 2000.000000 |

`Total Bases` = value before conversion; `Total Bases (Mbp)` = value after conversion.
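
The conversion described above could be sketched roughly as follows. This is a minimal illustration, not the existing code in `qc.py`: the names `UNIT_TO_MBP`, `TOTAL_BASES_PATTERN`, and `total_bases_to_mbp` are hypothetical. It assumes each value is a plain decimal number followed by a unit, with no thousands separators or scientific notation.

```python
import re

# Hypothetical lookup (not from qc.py): factor that converts each unit to Mbp.
UNIT_TO_MBP = {
    "bp": 1e-6,
    "kbp": 1e-3,
    "mbp": 1.0,
    "gbp": 1e3,
}

# A number, optional whitespace, then bp/kbp/mbp/gbp in any letter case.
TOTAL_BASES_PATTERN = re.compile(
    r"^\s*([0-9]*\.?[0-9]+)\s*([kmg]?bp)\s*$", re.IGNORECASE
)


def total_bases_to_mbp(value: str) -> float:
    """Convert a string such as '125 Kbp' or '2 GBp' to a float in Mbp."""
    match = TOTAL_BASES_PATTERN.match(value)
    if match is None:
        # Fail loudly rather than silently passing through an unknown unit.
        raise ValueError(f"Unrecognized Total Bases value: {value!r}")
    number, unit = match.groups()
    return float(number) * UNIT_TO_MBP[unit.lower()]


if __name__ == "__main__":
    examples = ["125 Kbp", "1.2 Gbp", "20 mbp", "5 bp", "6 Mbp", "100 BP", "2 GBp"]
    for raw in examples:
        # Six decimal places, matching the table in this issue.
        print(f"{raw} -> {total_bases_to_mbp(raw):.6f}")
```

If the Total Bases column lives in a pandas DataFrame, a helper like this could be applied with `Series.map` to build the numeric `Total Bases (Mbp)` column. Raising on unrecognized units is a design choice here, so that a new unit (for example Tbp) surfaces as an error instead of a wrong number.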