biod / sambamba

Tools for working with SAM/BAM data
http://thebird.nl/blog/D_Dragon.html
GNU General Public License v2.0

Feature request - add option to sort by query name like Picard #369

Closed TimurIs closed 5 years ago

TimurIs commented 5 years ago

Hi,

I think it would be useful to make sambamba's query-name sort compatible with Picard tools. I wrote a quick implementation of this in BioD/bio/bam/reader.d (based on https://github.com/samtools/htsjdk/blob/master/src/main/java/htsjdk/samtools/SAMRecordQueryNameComparator.java):

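/// Returns true if a1 sorts before a2 in Picard (htsjdk) queryname order.
/// Requires std.conv.to to be in scope for the HI tag conversion below.
bool compareReadNames(R1, R2)(const auto ref R1 a1, const auto ref R2 a2)
    if (isBamRead!R1 && isBamRead!R2)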
{
    if(a1.name == a2.name)
    {
        if(a1.is_paired() || a2.is_paired())
        {
            if(!a1.is_paired())
                return false;
            if(!a2.is_paired())
                return true;

            if(a1.is_first_of_pair() && a2.is_second_of_pair())
                return true;

            if(a1.is_second_of_pair() && a2.is_first_of_pair())
                return false;

        }

        if(a1.strand() != a2.strand())
        {
            // forward-strand read sorts first
            return a1.strand() != '-';
        }

        if(a1.is_secondary_alignment() != a2.is_secondary_alignment())
        {
            return a2.is_secondary_alignment();
        }

        if(a1.is_supplementary() != a2.is_supplementary())
        {
            return a2.is_supplementary();
        }

        if(!a1["HI"].is_nothing)
        {
            // Following htsjdk's comparator, a record without an HI tag
            // sorts before one that has it.
            if(a2["HI"].is_nothing)
                return false;

            int i1 = to!int(a1["HI"]);
            int i2 = to!int(a2["HI"]);
            return i1 < i2;
        }
        else if(!a2["HI"].is_nothing)
            return true;
    }
    return a1.name < a2.name;
}

But I'm very new to bioinformatics and I'm using the D language for the first time, so I decided not to open a pull request.

I'm open to any kind of collaboration as long as I can get some guidance with the D language.
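
To show how a "less than" predicate of this kind plugs into a sort, here is a minimal self-contained sketch. `MockRead`, `lessByQueryName` and `sortedLabels` are hypothetical names made up for the example and are not part of BioD. It also simplifies things: the HI and supplementary tie-breakers are left out, and first/second of pair is collapsed into a single flag.

```d
import std.algorithm : map, sort;
import std.array : array;
import std.stdio : writeln;

// Hypothetical stand-in for a BAM read; NOT BioD's BamRead type.
struct MockRead
{
    string name;
    bool paired;
    bool first;      // first of pair (assumed: second of pair otherwise)
    bool reverse;    // true when aligned to the '-' strand
    bool secondary;
    string label;    // only used to show the resulting order
}

// Strict "less than" predicate: read name first, then a subset of the
// tie-breakers used in the comparator above.
bool lessByQueryName(MockRead a, MockRead b)
{
    if (a.name != b.name)
        return a.name < b.name;
    if (a.paired != b.paired)
        return a.paired;        // paired reads before unpaired ones
    if (a.paired && a.first != b.first)
        return a.first;         // first of pair before second of pair
    if (a.reverse != b.reverse)
        return !a.reverse;      // forward strand first
    if (a.secondary != b.secondary)
        return !a.secondary;    // primary alignment first
    return false;               // equivalent reads: neither is "less"
}

// Sorts a copy of the reads and returns their labels in sorted order.
string[] sortedLabels(MockRead[] reads)
{
    auto copy = reads.dup;
    sort!lessByQueryName(copy);
    return copy.map!(r => r.label).array;
}

void main()
{
    auto reads = [
        MockRead("r2", true, false, false, false, "r2/2"),
        MockRead("r1", true, false, true,  false, "r1/2"),
        MockRead("r2", true, true,  true,  false, "r2/1"),
        MockRead("r1", true, true,  false, false, "r1/1"),
        MockRead("r1", true, true,  false, true,  "r1/1 secondary"),
    ];
    // prints: ["r1/1", "r1/1 secondary", "r1/2", "r2/1", "r2/2"]
    writeln(sortedLabels(reads));
}
```

The predicate has to return false when two reads are equivalent, otherwise it is not a valid strict ordering for `sort`.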

Thank you, Timur

pjotrp commented 5 years ago

Hi @TimurIs. I think you can make a PR no problem. I'll take a look at it when you do. You should be able to compile it and do a quick test run.

Thanks.

pjotrp commented 5 years ago

Can we expect a PR? Otherwise I'll close this issue.

TimurIs commented 5 years ago

Yes, I'll do it soon. Sorry for the delay.

TimurIs commented 5 years ago

Hello @pjotrp,

I've created two PRs: one for sambamba and one for BioD. Please check them and let me know if I screwed something up :)

pjotrp commented 5 years ago

Added. Thanks @TimurIs