frankcsquared closed this issue 3 years ago
Phylogeny information is already stored in the GBFF files, which can be read with Biopython (these files also contain a variety of other metadata). No further action is needed to improve the organization at the level of individual files; an example of how to read the files is shown in the notebook linked below.
https://drive.google.com/file/d/1Q_OcI3n-sDB_8RhjUuoSeZOtE5vIYW41/view?usp=sharing
I'm closing the issue, but note that it might be worth reading through all the files and storing the phylogeny data in a single dataframe, so that a particular group of bacteria can be subset easily.
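A rough sketch of what that could look like, not taken from the linked notebook. It assumes the GBFF files sit together in one directory with a `.gbff` extension, and that Biopython and pandas are installed. The function names `taxonomy_rows` and `collect_taxonomy`, and the column names, are made up for illustration. The lineage comes from each record's `annotations["taxonomy"]` list, which GenBank stores without rank labels, so it is kept as a plain list rather than split into phylum/class/etc.

```python
from pathlib import Path


def taxonomy_rows(records, source=""):
    """Turn an iterable of SeqRecord-like objects into plain dict rows."""
    rows = []
    for rec in records:
        # GenBank-format records carry the lineage as a list of names.
        lineage = list(rec.annotations.get("taxonomy", []))
        rows.append(
            {
                "file": source,
                "record_id": rec.id,
                "organism": rec.annotations.get("organism", ""),
                "lineage": lineage,
            }
        )
    return rows


def collect_taxonomy(gbff_dir, pattern="*.gbff"):
    """Read every matching GBFF file in a directory into one DataFrame."""
    # Imported here so taxonomy_rows stays usable without these packages.
    from Bio import SeqIO
    import pandas as pd

    rows = []
    for path in sorted(Path(gbff_dir).glob(pattern)):
        # "genbank" is the Biopython format name used for GBFF files.
        records = SeqIO.parse(str(path), "genbank")
        rows.extend(taxonomy_rows(records, source=path.name))
    return pd.DataFrame(rows, columns=["file", "record_id", "organism", "lineage"])
```

With a dataframe `df = collect_taxonomy("path/to/gbff_dir")` (the path is a placeholder), a group could then be selected with something like `df[df["lineage"].apply(lambda names: "Proteobacteria" in names)]`. A multi-record file (e.g. chromosome plus plasmids) yields one row per record, so dropping duplicates on `file` may be wanted afterwards.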
Tasks: