zhangrengang / TEsorter

TEsorter: an accurate and fast method to classify LTR-retrotransposons in plant genomes
https://doi.org/10.1093/hr/uhac017
GNU General Public License v3.0

python3 #10

Closed Juke34 closed 4 years ago

Juke34 commented 4 years ago

Hi, would it be possible to re-implement the tool in python3? Did you try 2to3 to see if it can be converted to python3 automatically?

Best regards

zhangrengang commented 4 years ago

I have no plans for python3. It is more flexible to install python2.

Juke34 commented 4 years ago

I might give it a try.

Juke34 commented 4 years ago

I now have a working python3 version. Should I open a PR against your repo? If you want to stick with python2, I will fork your repo and share it through my own. But since python2 is no longer supported, I think you should definitely move to python3.

zhangrengang commented 4 years ago

I am still using python2, so it is better to share it through your own repo. I will add a link to your repo. Thanks for your work.

Juke34 commented 4 years ago

I will provide a GitHub release (needed for a conda recipe). How should I version it? v1.2.1.1?

zhangrengang commented 4 years ago

I have drafted a new release of the current repo as v1.2.5. You can use v1.2.5 or v1.2.5.1.

Juke34 commented 4 years ago

By the way, I'm not sure this line in TEsorter.py does what you want: if rc.bitscore > d_best_hit[rc.qseqid]:

I had to change it to: if rc.bitscore > d_best_hit[rc.qseqid].bitscore:

python3 detects that d_best_hit[rc.qseqid] is not a value but an object. I think the problem is that py2 does not complain when a value is compared against an object.
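
The difference can be reproduced with a small standalone sketch. The Hit class and the sample records below are hypothetical stand-ins, not TEsorter's actual record type; only the names qseqid, bitscore and d_best_hit are taken from the line quoted above. In Python 3, ordering a float against an arbitrary object raises TypeError, so the comparison has to be score against score.

```python
class Hit:
    """Hypothetical stand-in for one parsed hit; not TEsorter's real record class."""

    def __init__(self, qseqid, bitscore):
        self.qseqid = qseqid
        self.bitscore = bitscore


def best_hits(records):
    """Keep the highest-bitscore record for each query sequence."""
    d_best_hit = {}
    for rc in records:
        if rc.qseqid not in d_best_hit:
            d_best_hit[rc.qseqid] = rc
        # Compare score against score, not score against the stored object.
        elif rc.bitscore > d_best_hit[rc.qseqid].bitscore:
            d_best_hit[rc.qseqid] = rc
    return d_best_hit


def buggy_comparison_raises(rc, stored):
    """Return True if the score-vs-object comparison raises TypeError."""
    try:
        rc.bitscore > stored  # the form used in the original line
    except TypeError:
        return True
    return False


records = [Hit("seq1", 50.0), Hit("seq1", 80.0), Hit("seq2", 30.0)]
best = best_hits(records)
print(best["seq1"].bitscore)
print(buggy_comparison_raises(records[1], records[0]))
```

As far as I know, CPython 2 orders numbers before other object types in mixed comparisons, so the original condition would simply evaluate to False there and the first hit seen for each query would be kept without any error.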

Juke34 commented 4 years ago

You can find the python3-compliant version 1.2.5.1 here: https://github.com/NBISweden/TEsorter

zhangrengang commented 4 years ago

You are right, it is a bug. I have fixed it, but the results do not change, presumably because the first hit is already the best hit.

Juke34 commented 4 years ago

Is this function useful

def main():
    subcmd = sys.argv[1]
    if subcmd == 'LTRlibAnn':   # hmmscan + HmmBest
        ltrlib = sys.argv[2]    # input is LTR library (fasta)
        try:
            hmmdb = sys.argv[3] # rexdb, gydb, pfam, etc.
            try: seqtype = sys.argv[4]
            except IndexError: seqtype = 'nucl'
            LTRlibAnn(ltrlib, hmmdb=hmmdb, seqtype=seqtype)
        except IndexError:
            LTRlibAnn(ltrlib)
    elif subcmd == 'HmmBest':
        inSeq = sys.argv[2]          # input: LTR library (translated protein)
        prefix = inSeq
        inHmmouts = sys.argv[3:]     # input: hmmscan output (inSeq search against hmmdb)
        hmm2best(inSeq, inHmmouts, prefix)
    elif subcmd == 'Classifier':
        gff = sys.argv[2]       # input: gff3 output by LTRlibAnn or HmmBest
        try: db = sys.argv[3]   # rexdb or gydb
        except IndexError: db = 'rexdb'
        for line in Classifier(gff, db=db):
            continue
    elif subcmd == 'replaceCls':    # LTRlibAnn + Classifier
        ltrlib = sys.argv[2]        # input: LTR library (nucl fasta)
        replaceCls(ltrlib)
    elif subcmd == 'replaceClsLR':
        genome = sys.argv[2]        # input: genome input for LTR_retriever pipeline
        Retriever(genome).re_classify()
    else:
        raise ValueError('Unknown command: {}'.format(subcmd))

in TEsorter.py? It cannot be invoked as it stands, right?

zhangrengang commented 4 years ago

It is not used in this version. It is for testing.
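
For context, a function like main() only runs if something calls it. The sketch below shows one common way to wire up such a sub-command dispatcher in a Python script. The handlers and the script name are hypothetical placeholders and not TEsorter code; only the sub-command names and the error message are borrowed from the snippet above.

```python
import sys


def main(argv=None):
    """Route a sub-command name to a handler; the handlers are placeholders."""
    argv = sys.argv if argv is None else argv
    handlers = {
        "Classifier": lambda args: "classify {}".format(args[0]),
        "HmmBest": lambda args: "best hits for {}".format(args[0]),
    }
    if len(argv) < 3:
        raise ValueError("usage: script.py <subcommand> <input> [...]")
    subcmd = argv[1]
    if subcmd not in handlers:
        raise ValueError("Unknown command: {}".format(subcmd))
    return handlers[subcmd](argv[2:])


if __name__ == "__main__":
    # Demo call with an explicit argument list; a real script would call main()
    # with no arguments so that sys.argv is used.
    print(main(["script.py", "Classifier", "input.gff3"]))
```

Without a guard like the last block (or a call from another function), the dispatcher is defined but never executed, which fits its description as test code.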