Sum02dean / MLG

Machine Learning in Genomics Course ETH
MIT License

#23 add reference genome parsing, downgrade python for pyfasta, fix typo #29

Closed LiineKasak closed 2 years ago

LiineKasak commented 2 years ago

Notable things:

Otherwise, I think this is the best way to load the sequence; afterwards we can add logic for the window and other shifts.
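For illustration only, the window logic mentioned above could look roughly like the sketch below. It assumes the chromosome sequence is already loaded as a plain Python string; `extract_window`, `center`, and `half_width` are hypothetical names and are not taken from this PR.

```python
def extract_window(chrom_seq: str, center: int, half_width: int, pad: str = "N") -> str:
    """Return the 2 * half_width bases around `center` (0-based).

    Positions that fall outside the chromosome are filled with `pad`,
    so the result always has the same length.
    """
    n = len(chrom_seq)
    if not 0 <= center < n:
        raise ValueError("center must lie inside the chromosome")
    if half_width < 0:
        raise ValueError("half_width must be non-negative")

    start = center - half_width  # may be negative near the chromosome start
    end = center + half_width    # may exceed n near the chromosome end

    left_pad = pad * max(0, -start)
    right_pad = pad * max(0, end - n)
    core = chrom_seq[max(0, start):min(n, end)]
    return left_pad + core + right_pad


# Hypothetical usage on a toy sequence
print(extract_window("ACGTACGT", center=4, half_width=2))
print(extract_window("ACGT", center=1, half_width=3))
```

Padding with N keeps every window the same length, which is convenient if the windows are later stacked into a fixed-size model input; whether that matches the intended shift logic here is an assumption.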

LiineKasak commented 2 years ago

I thought a one-liner function doesn't need documentation, but I'll add it for good measure :) There are a lot of Ns in the reference sequence, typically at the start and end of chromosomes, I think. For example, the first 10,000 nucleotides in chr1 are Ns, iirc. But for the given genes I doubt there will be any Ns. I just added an encoding for N in case.
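The encoding used in this PR is not shown in the thread, so the following is only a minimal sketch of one way to handle N when one-hot encoding a sequence. It assumes N gets its own fifth channel; `ALPHABET`, `BASE_INDEX`, and `one_hot_encode` are hypothetical names.

```python
from typing import List

# Assumption: N is treated as a fifth symbol with its own channel.
ALPHABET = "ACGTN"
BASE_INDEX = {base: i for i, base in enumerate(ALPHABET)}


def one_hot_encode(seq: str) -> List[List[int]]:
    """One-hot encode a nucleotide sequence, one row per base.

    Lower-case (soft-masked) bases are upper-cased first, and any
    symbol outside A, C, G, T is encoded as N.
    """
    rows = []
    for base in seq.upper():
        row = [0] * len(ALPHABET)
        row[BASE_INDEX.get(base, BASE_INDEX["N"])] = 1
        rows.append(row)
    return rows


# Hypothetical usage
for row in one_hot_encode("AcGN"):
    print(row)
```

A common alternative is to keep four channels and encode N as an all-zero row; which of the two suits the model is a design choice this thread does not settle.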

LiineKasak commented 2 years ago

@Sum02dean I just use the built-in auto-generation of docstring stubs in PyCharm (JetBrains premium is available to all uni students). This is how it works: https://www.jetbrains.com/help/pycharm/using-docstrings-to-specify-types.html