datasnakes / OrthoEvolution

An easy-to-use and comprehensive Python package that aids in the analysis and visualization of orthologous genes. 🐵
https://orthoevolution.readthedocs.io/en/master/

Re-address the current ways that the various Orthologs modules handle the data/databases. #106

Open grabear opened 6 years ago

grabear commented 6 years ago

Problem:

When using CompGenObjects, the initial runtime for a full file can take up to 5 minutes.

Solution:

To speed this up, it would be good to add the entire CSV file to a database as is. After CompGenObjects has created its pre/post BLAST dictionaries, it would be good to store them in the database too. Of these, the duplicates dictionary takes the longest to create.
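A rough sketch of what this could look like, using only the standard library (`csv`, `json`, `sqlite3`). Everything named here is an assumption for illustration: `load_csv`, `save_dict`, `load_dict`, the `cached_dicts` table, and the example column names are hypothetical and are not part of the current CompGenObjects code.

```python
import csv
import io
import json
import sqlite3


def _quote(identifier):
    """Quote a table or column name so it is safe to embed in SQL."""
    return '"{}"'.format(identifier.replace('"', '""'))


def load_csv(conn, table, csv_file):
    """Copy a CSV file into a table as is, one TEXT column per header field."""
    reader = csv.reader(csv_file)
    header = next(reader)
    columns = ", ".join("{} TEXT".format(_quote(name)) for name in header)
    conn.execute("CREATE TABLE IF NOT EXISTS {} ({})".format(_quote(table), columns))
    marks = ", ".join("?" for _ in header)
    # Assumes every data row has the same number of fields as the header.
    conn.executemany("INSERT INTO {} VALUES ({})".format(_quote(table), marks), reader)
    conn.commit()


def _ensure_cache_table(conn):
    # Hypothetical table holding one JSON payload per cached dictionary.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cached_dicts (name TEXT PRIMARY KEY, payload TEXT)"
    )


def save_dict(conn, name, data):
    """Store a JSON-serializable dictionary under a name, replacing any old copy."""
    _ensure_cache_table(conn)
    conn.execute(
        "INSERT OR REPLACE INTO cached_dicts VALUES (?, ?)", (name, json.dumps(data))
    )
    conn.commit()


def load_dict(conn, name):
    """Return the cached dictionary, or None if it has not been stored yet."""
    _ensure_cache_table(conn)
    row = conn.execute(
        "SELECT payload FROM cached_dicts WHERE name = ?", (name,)
    ).fetchone()
    return json.loads(row[0]) if row else None


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # a file path would persist between runs
    # Made-up two-column file; the real CSV layout may differ.
    sample = io.StringIO("Gene,Homo_sapiens\nHTR1A,NM_000524\n")
    load_csv(conn, "accessions", sample)

    duplicates = load_dict(conn, "duplicates")
    if duplicates is None:
        # Placeholder for the slow step; the real dictionary would be built here.
        duplicates = {"NM_000524": ["HTR1A"]}
        save_dict(conn, "duplicates", duplicates)
```

With a file-backed database, only the first run would pay the cost of building the dictionaries; later runs would read them back with `load_dict`. One caveat with JSON as the storage format: dictionary keys come back as strings and tuples come back as lists, so the real dictionaries may need a different serialization.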

grabear commented 5 years ago

We are already using SQLite databases.