Open MikeDacre opened 8 years ago
I'm unsure how much maintainability would be gained by switching the code to pandas. Part of the problem is the A|C|G|T
format for each SNP, which violates the implicit pandas assumption that a single column holds a single entity (one number or one string).
See Pull request #9 for an example of how much the code can be cleaned up. Some quick testing suggests that there's no major speed benefit or penalty to switching the output code to pandas.
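To make the A|C|G|T concern concrete, here is a minimal sketch of one way such a packed column could be handled in pandas. It assumes the field holds four pipe-separated integer counts per SNP; the column names (`snp`, `counts`) and the sample values are hypothetical and are not taken from the project's code or from Pull request #9.

```python
import pandas as pd

# Hypothetical input: one row per SNP, with per-base values packed as "A|C|G|T".
# The column names and numbers are illustrative assumptions only.
df = pd.DataFrame(
    {
        "snp": ["rs1", "rs2"],
        "counts": ["10|0|3|1", "0|7|0|2"],
    }
)

bases = ["A", "C", "G", "T"]

# Split the packed string into four integer columns, one per base,
# so that each column holds a single value per row.
split = df["counts"].str.split("|", expand=True).astype(int)
split.columns = bases
wide = pd.concat([df[["snp"]], split], axis=1)

# With one number per column, ordinary pandas operations apply directly.
wide["total"] = wide[bases].sum(axis=1)

# Re-pack the four columns into the original A|C|G|T string for output.
repacked = wide[bases].astype(str).apply("|".join, axis=1)
```

Whether the extra split and re-pack steps are worth it depends on how often the per-base values are needed individually; if the field is only ever passed through to the output, leaving it as a single string column may be simpler.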
Two questions:
Switching to pandas would make the code easier to maintain and would make it easier to calculate and plot statistics.