jeffdaily / parasail-python

Python bindings for the parasail C library.

Cigar string in the profile mode #12

Closed: huxihao closed this issue 6 years ago

huxihao commented 6 years ago

Hi,

I found that the cigar string from the profile alignment mode was incorrect. For example:

profile = parasail.profile_create_sat('AAA', parasail.blosum62)
parasail.nw_trace_scan_profile_sat(profile, 'AAA', 9, 1).cigar.decode

which outputs '3D3I' instead of '3='

Do you have any idea on how to fix this?

Best

jeffdaily commented 6 years ago

Thank you for bringing this to my attention. I will make sure the C library is functioning properly first so I can rule out the underlying implementation.

Curious, does the non-profile function produce the correct cigar?

Jeff

huxihao commented 6 years ago

The non-profile function gives the correct result, at least for this simple example:

parasail.nw_trace_scan("AAA", "AAA", 9, 1, parasail.blosum62).cigar.decode
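For anyone comparing the two strings, here is a minimal sketch of why '3D3I' is wrong for identical sequences even though it is a well-formed cigar. The helper `cigar_summary` is hypothetical (it is not part of parasail), and it assumes the decoded string uses the SAM operation letters. If `cigar.decode` returns bytes on your Python version, convert it to `str` first.

```python
import re

# Which operations consume a query residue, a reference residue, or both.
# This follows the SAM convention; that parasail's decoded cigar uses the
# same letters is an assumption here.
QUERY_OPS = set("MI=XS")
REF_OPS = set("MD=XN")


def cigar_summary(cigar):
    """Return (query_len, ref_len, aligned_columns) for a cigar string."""
    query_len = ref_len = aligned = 0
    for count, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar):
        n = int(count)
        if op in QUERY_OPS:
            query_len += n
        if op in REF_OPS:
            ref_len += n
        if op in "M=X":
            # Only these operations pair a query residue with a reference residue.
            aligned += n
    return query_len, ref_len, aligned


print(cigar_summary("3="))    # expected cigar
print(cigar_summary("3D3I"))  # cigar reported in profile mode
```

Both strings account for three residues on each side, so a length check alone would not catch the problem. The difference is the last value: '3=' pairs all three residues, while '3D3I' pairs none and pays for two gaps instead.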

jeffdaily commented 6 years ago

I have reproduced this error in the C library using the following test. I took your Python sample code and rewrote it in C:

#include <stdio.h>
#include <stdlib.h>

#include <parasail.h>
#include <parasail/matrices/blosum62.h>

int main(int argc, char **argv)
{
    parasail_profile_t *profile = NULL;
    parasail_result_t *result = NULL;
    parasail_cigar_t *cigar = NULL;
    char *cigar_str = NULL;

    /* Profile-based alignment: this is the path that yields the wrong cigar. */
    profile = parasail_profile_create_sat("AAA", 3, &parasail_blosum62);
    result = parasail_nw_trace_scan_profile_sat(profile, "AAA", 3, 9, 1);
    cigar = parasail_result_get_cigar(result, "AAA", 3, "AAA", 3, &parasail_blosum62);
    cigar_str = parasail_cigar_decode(cigar);
    printf("cigar_str='%s'\n", cigar_str);

    free(cigar_str);
    parasail_cigar_free(cigar);
    parasail_result_free(result);
    parasail_profile_free(profile);

    /* Non-profile alignment of the same sequences, for comparison. */
    result = parasail_nw_trace_scan("AAA", 3, "AAA", 3, 9, 1, &parasail_blosum62);
    cigar = parasail_result_get_cigar(result, "AAA", 3, "AAA", 3, &parasail_blosum62);
    cigar_str = parasail_cigar_decode(cigar);
    printf("cigar_str='%s'\n", cigar_str);

    free(cigar_str);
    parasail_cigar_free(cigar);
    parasail_result_free(result);

    return 0;
}

Output:

cigar_str='3D3I'
cigar_str='3='

I am debugging the C library now.

jeffdaily commented 6 years ago

This is fixed in the C library now as of https://github.com/jeffdaily/parasail/commit/8ddeca2412905c7cf1a244d4b6c5a95ad5284533. That commit will become v2.0.2 of the C library as soon as the CI builds finish.

I will push out new builds to PyPI soon, trying out the new manylinux builds.

In the meantime, you can perhaps build your own libparasail.so based on the latest master of the C library and replace the one that is bundled in the parasail-python wheel.
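To find the file to replace, something like the following sketch may help. It only searches the installed package directory for shared libraries with "parasail" in the name. The helper `find_bundled_libs` is hypothetical, and the assumption that the wheel keeps the library inside the package directory should be checked against your own installation.

```python
import importlib.util
from pathlib import Path

# Shared-library suffixes on Linux, macOS, and Windows.
LIB_SUFFIXES = {".so", ".dylib", ".dll"}


def find_bundled_libs(package="parasail"):
    """List shared libraries named *parasail* inside an installed package.

    Returns an empty list if the package is not installed.
    """
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return []
    found = []
    for location in spec.submodule_search_locations:
        for path in Path(location).rglob("*parasail*"):
            if path.suffix in LIB_SUFFIXES:
                found.append(path)
    return sorted(found)


for lib in find_bundled_libs():
    print(lib)
```

Back up the original file before overwriting it with your own build, so you can restore the wheel's copy if the replacement fails to load.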

huxihao commented 6 years ago

Thanks for the great job.