applied-bioinformatics / An-Introduction-To-Applied-Bioinformatics

Interactive lessons in bioinformatics.
http://readIAB.org

make notebooks run faster #30

Closed jairideout closed 9 years ago

jairideout commented 10 years ago

The tests take ~40 minutes via Travis, which is running through all of the notebooks. Once we hit 50 minutes, Travis will abort the tests. Most of the cells run instantly, or with little delay, but some cells take several minutes to complete. @gregcaporaso thoughts on this?

Travis also requires some output to be printed within every 10-minute window; otherwise the tests will be killed. We're currently okay, but some cells are likely close to this threshold.
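One possible way to stay inside the output window is to print a periodic "still running" message while a slow cell executes. The following is only a minimal sketch under stated assumptions: `run_with_heartbeat` and `slow_cell` are hypothetical names invented for illustration, not part of IAB or of Travis, and the intervals are shortened so the example finishes quickly.

```python
import threading
import time


def run_with_heartbeat(func, interval=60.0, message="still running..."):
    """Run func() while printing `message` every `interval` seconds.

    Hypothetical helper for illustration; not part of IAB or Travis.
    """
    done = threading.Event()

    def beat():
        # Event.wait returns False on timeout and True once the event is set,
        # so the loop prints until the main work signals completion.
        while not done.wait(interval):
            print(message, flush=True)

    worker = threading.Thread(target=beat, daemon=True)
    worker.start()
    try:
        return func()
    finally:
        # Stop the heartbeat even if func() raises.
        done.set()
        worker.join()


def slow_cell():
    # Stand-in for a long-running notebook cell (assumption).
    time.sleep(0.3)
    return "finished"


result = run_with_heartbeat(slow_cell, interval=0.1)
print(result)
```

In a real run the interval would be set to something comfortably under 10 minutes. This only keeps the job alive; it does nothing about the 50-minute total limit.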

gregcaporaso commented 10 years ago

I can think about optimizing some of the things that are going slowly. There is some low-hanging fruit (probably about 3 cells account for most of the runtime, though those cells illustrate some important ideas).

jairideout commented 10 years ago

Great, sounds good!

gregcaporaso commented 10 years ago

Most of the parts that run slowly are places where I'm trying to show how the runtime of an algorithm changes with more sequences, longer sequences, etc. The bottleneck for all of them is that the Smith-Waterman/Needleman-Wunsch implementations that I'm calling are the ones that I implemented in Python for IAB. When biocore's #117 is merged, I can rework the code that uses alignments under the hood (msa functions, clustering functions) to call SSW, and this issue will largely be addressed.
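To illustrate why a pure-Python aligner becomes the bottleneck, here is a rough sketch of Needleman-Wunsch scoring with a timing loop. This is not the IAB implementation: `nw_score` and `random_dna` are hypothetical names, and the scoring scheme (match +1, mismatch -1, linear gap -1) is an assumption chosen for simplicity. The work grows with the product of the two sequence lengths, so doubling both lengths should roughly quadruple the runtime.

```python
import random
import time


def nw_score(seq1, seq2, match=1, mismatch=-1, gap=-1):
    """Global alignment score with a linear gap penalty.

    Illustrative sketch only (not the IAB code). Fills the dynamic
    programming matrix one row at a time: O(len(seq1) * len(seq2)) time.
    """
    # Row 0: aligning a prefix of seq2 against nothing costs one gap each.
    prev = [j * gap for j in range(len(seq2) + 1)]
    for i, a in enumerate(seq1, start=1):
        # Column 0: aligning a prefix of seq1 against nothing.
        curr = [i * gap]
        for j, b in enumerate(seq2, start=1):
            diag = prev[j - 1] + (match if a == b else mismatch)
            up = prev[j] + gap        # gap in seq2
            left = curr[j - 1] + gap  # gap in seq1
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


def random_dna(length, rng):
    """Random DNA string, used only to generate test input."""
    return "".join(rng.choice("ACGT") for _ in range(length))


rng = random.Random(0)
for length in (100, 200, 400):
    s1, s2 = random_dna(length, rng), random_dna(length, rng)
    start = time.perf_counter()
    nw_score(s1, s2)
    elapsed = time.perf_counter() - start
    print(f"length {length}: {elapsed:.3f} s")
```

The actual timings depend on the machine, but the inner loop runs in the interpreter for every matrix cell, which is the part a compiled aligner such as SSW avoids.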

gregcaporaso commented 9 years ago

This has mostly been addressed, so I'm closing this issue, but we desperately need a faster global aligner in scikit-bio (biocore/scikit-bio#254, biocore/scikit-bio#555)!