jeffdaily / parasail-python

Python bindings for the parasail C library.

Loop over list of strings #53

Closed agartland closed 4 years ago

agartland commented 4 years ago

Hi Jeff, I love that you created this library AND built easy APIs for several languages. We are incorporating it into our research and analysis pipeline for immune sequencing analysis. Thank you!

One efficiency issue we've run into is computing a large number of NW alignments for a pairwise-distance matrix. With some benchmarking I've been able to prove to myself that although parasail itself is fast, having to call it inside a Python for-loop really slows the whole computation down (maybe 100x). Using the pre-compiled "profile" option that parasail provides has not solved this problem (though I think it's a helpful step along the way).

Firstly, I'm wondering if there is an option to do this with parasail that I've missed?

If not, do you have recommendations about how one would approach this? If we wanted a "profile_db" function that would align one query sequence against an array of strings, could this be coded once and used for all your exposed alignment functions? If so, could/should it be added to parasail-python? I believe it would have a lot of use cases. Or, if it adds too much complexity to an elegant library, should this be created separately? Cython or ctypes, or does it matter?
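To make the "profile_db" idea concrete, here is a minimal pure-Python sketch of the interface, not something parasail-python provides. The names `profile_db` and `make_nw_scorer` are hypothetical. The sketch assumes the `profile_create_16` and `nw_striped_profile_16` functions and the `blosum62` matrix from parasail-python, and it still loops in Python, so it only shows the shape of the API rather than removing the per-call overhead described above.

```python
def make_nw_scorer(query, gap_open=10, gap_extend=1):
    """Build one parasail profile for `query` and return target -> NW score.

    Assumes parasail-python is installed (pip install parasail).
    """
    import parasail  # third-party; imported lazily so other scorers can be swapped in

    profile = parasail.profile_create_16(query, parasail.blosum62)

    def score(target):
        # Reuses the same profile for every target sequence.
        result = parasail.nw_striped_profile_16(profile, target, gap_open, gap_extend)
        return result.score

    return score


def profile_db(query, targets, make_scorer=make_nw_scorer):
    """Hypothetical helper: align one query against a list of strings.

    `make_scorer` is any factory taking the query and returning a
    function target -> score, so the same helper could wrap other
    alignment functions (sw, sg, ...) without being rewritten.
    """
    score = make_scorer(query)
    return [score(t) for t in targets]
```

A row of a pairwise-distance matrix would then be `profile_db(seqs[i], seqs)`. Moving the inner loop into C (via Cython, ctypes, or the C library itself) is what would actually address the slowdown.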

Grateful for any advice you could offer. Thanks in advance, Andrew

jeffdaily commented 4 years ago

If you don’t need a Python interface for the library calls, the C library comes with a parasail_aligner application that might be what you’re looking for without having to implement it yourself. If you still need to do this from Python, perhaps you could write something that calls the parasail_aligner app. Otherwise, I have nearly zero time to develop new features. Something like what you’re asking for would ideally be done first in the C library and then wrapped by the Python wrapper.
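A rough sketch of calling the app from Python might look like the following. It assumes `parasail_aligner` is on the `PATH` and accepts the flags shown (`-a`, `-f`, `-g`, `-o`, `-e`, `-m`, `-t`, `-x`). Check these against the usage text of your installed version. The helper names `build_aligner_command` and `run_aligner` are made up for this example.

```python
import subprocess


def build_aligner_command(fasta_path, output_path, func="nw_striped_16",
                          gap_open=10, gap_extend=1, matrix="blosum62",
                          threads=None, all_pairs=True):
    """Assemble the parasail_aligner argument list (flags are assumptions
    based on the app's usage text; verify with your build)."""
    cmd = [
        "parasail_aligner",
        "-a", func,               # alignment function name
        "-f", str(fasta_path),    # one file: sequences are compared against themselves
        "-g", str(output_path),   # output file
        "-o", str(gap_open),
        "-e", str(gap_extend),
        "-m", matrix,
    ]
    if threads is not None:
        cmd += ["-t", str(threads)]
    if all_pairs:
        cmd.append("-x")          # skip the suffix-array filter so every pair is aligned
    return cmd


def run_aligner(fasta_path, output_path, **kwargs):
    """Run the app and return the output path; raises if the app fails."""
    subprocess.run(build_aligner_command(fasta_path, output_path, **kwargs),
                   check=True)
    return output_path
```

The output file would then be parsed back into a distance matrix. Its column layout depends on the output format chosen, so inspect a small run before writing the parser.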

agartland commented 4 years ago

No problem, I understand the lack of time. We'll see if we can work with the app. I think it may work well for the cases when we really need to scale. Thanks for the tips.