kundajelab / genomelake

Simple and efficient access to genomic data for deep learning models.
BSD 3-Clause "New" or "Revised" License

keep fasta in memory #5

jorenretel closed this 6 years ago

jorenretel commented 6 years ago

Made the FastaFile object an instance variable of the fasta extractor. The file is now opened only once, when the extractor is constructed, instead of on every call. This significantly improves performance when the extractor is called many times. If this behavior is not always the preferred one, we could introduce a flag to select between the two behaviors. However, for simplicity, and because genomelake is tailored towards machine learning, data-retrieval performance is almost always what matters most.
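To illustrate the difference, here is a minimal, self-contained sketch of the two patterns. It is not the genomelake code: `CountingSequenceFile` is a hypothetical stand-in for an indexed FASTA reader (it only counts how often a file is opened), and `PerCallExtractor`, `PersistentExtractor`, and `demo` are names invented for this example.

```python
import os
import tempfile


class CountingSequenceFile:
    """Hypothetical stand-in for an indexed FASTA reader; counts file opens."""

    open_count = 0  # class-level counter, incremented on every open

    def __init__(self, path):
        CountingSequenceFile.open_count += 1
        self._handle = open(path, "rb")

    def fetch(self, start, end):
        # Return the bases in the half-open interval [start, end).
        self._handle.seek(start)
        return self._handle.read(end - start).decode("ascii")

    def close(self):
        self._handle.close()


class PerCallExtractor:
    """Old pattern: the file is opened and closed on every call."""

    def __init__(self, path):
        self.path = path

    def __call__(self, intervals):
        fasta = CountingSequenceFile(self.path)
        try:
            return [fasta.fetch(start, end) for start, end in intervals]
        finally:
            fasta.close()


class PersistentExtractor:
    """New pattern: the file is opened once, at construction."""

    def __init__(self, path):
        self.fasta = CountingSequenceFile(path)

    def __call__(self, intervals):
        return [self.fasta.fetch(start, end) for start, end in intervals]

    def close(self):
        self.fasta.close()


def demo(n_calls=5):
    """Return (opens with per-call pattern, opens with persistent pattern, outputs equal)."""
    with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
        tmp.write(b"ACGTACGTAC")  # toy sequence, not a real FASTA file
        path = tmp.name
    intervals = [(0, 4), (2, 6)]
    try:
        CountingSequenceFile.open_count = 0
        per_call = PerCallExtractor(path)
        per_call_out = [per_call(intervals) for _ in range(n_calls)]
        per_call_opens = CountingSequenceFile.open_count

        CountingSequenceFile.open_count = 0
        persistent = PersistentExtractor(path)
        persistent_out = [persistent(intervals) for _ in range(n_calls)]
        persistent_opens = CountingSequenceFile.open_count
        persistent.close()
    finally:
        os.remove(path)
    return per_call_opens, persistent_opens, per_call_out == persistent_out


if __name__ == "__main__":
    print(demo())
```

Both extractors return the same sequences; only the number of opens differs. The trade-off is that the persistent version holds the file handle open for the lifetime of the extractor, which is the behavior a flag could make optional.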

jisraeli commented 6 years ago

Looks great - thanks!