sestaton / HMMER2GO

Annotate DNA sequences for Gene Ontology terms
MIT License

Issue with getorf command #15

Closed LorenaDerezanin closed 6 years ago

LorenaDerezanin commented 6 years ago

Hi Evan,

I'm having an issue with the newest version of hmmer2go (HMMER2GO-0.17.7).

I've just run a test on a small FASTA file containing a single unitig (a short nuclear sequence used to build contigs during genome assembly).

hmmer2go getorf -i AIRE_splitted_UTG_reads_ab_clean.fasta -o hmmer2go_test_AIRE_ab_fasta --verbose -n 8 

It throws the following error, although it still produces an output file containing a detected ORF.

========== Searching for ORFs with minimum length of 80.
Can't use string ("1") as a SCALAR ref while "strict refs" in use at /home/lorke/perl5/lib/perl5/HMMER2GO/Command/getorf.pm line 112.

This is the content of the input file:

>utg7180000171672
CTTCCCAGTATAGGAACTGATGAGGCTTCAGAACAAGTGGGTGCCCACTCTAGCCAAGGT
GTAGTGAAAACCCCATCCCCTACGTGCAGGCTCCCTGGTGAGCAGGCAGTGTAGCTTGGG
GTAGCTCTCCATGTTGTAGTCCTTGGTAAGACTGCTCCAGAAGGCCCGGAGGATGGGTCG
ACCCTGAAGCAGGACCCAGGACACCAGGGAGTACATGGCCCTGTGCACCCCGGCCTTCCC
CTGGCTCACCATCGTGTCCTGAGAGACGTAGACAGCAGAGAGCTCTCCTCAAGCGTCGTT
TTCATGATGATGCATTCAACAGTGGTACAAATATTTAATTATTTTGTCAAGTATTTAGAG
TGTTGCCCGTGGATTGATAGATGGATATTTTATTCATCCCCTTGGGGAACTCAGGTTTCC
AGTAGCTCACAAAGGTAGTCGACAGATGGTAACGAAATGGACAACCGGTAAAAGTCAATC
AAAAAAGTAAATAAAAGTGAGCAATATTGCACACAATTGAAATTGCATTTAAAAAACAGC
CAGTGGTAGACATAAATAAATAAATATTACGCAATTCCACACTGTCCAGTGTGTGTGGGG
GCTGGGTTGAGGGAGAGAGAGACAGAAAGAGTCATGATGGGCCACTACCCTTGCCTGAGT
TGTTGTGTAGTCTGATGGCTGTGGGGACAAAGGTTGTTCTGTACTGTACAGGTACAGAAC
ATGAAGATGTACACAACCTATTATTGCTTGTATAATATCCCATCAAACAATGACCTTCAC
AAGAGCACACTGGCAACTAAACGTAAACACAGGTGCTTATGATATCAATCCTTGACTGAC
ATAGTTTCAAATTGGATTCGTTTTACTATTAAGGTCAAATGATGATCTGATTATAGCATG
TCAAATAAATGAAAATATTGTATTGCAACCAAAAGGTTGTAGCATGACCGTCGACGCTAC
GTTAGCCTCTTCTGGTCGGTGTTGTGTGATGGATGAGCTCCTATCCTCTTACCTTGAGCA
GCTGGTCAGTGATGATGTTTCTATCAGCCAGGCCGTGCACCAGGGGGAAGGGGTCGTCTA
CGGCCATGGCGATGTCGGTGCGCAGCTCTCTCAGCCGAGTGCGAAGGTCGGTGGTCCCAT
AAATCTCAACCCTGGACATGTTACTCTAACTCTGTGTGTAGATATCTGAGTCCCCGTAAA
TCTGAAGCATGAACATGTTACCCTGAGTCTTTTTTCGGTGTATGCAGGTCTGTTGGTGCA
TCCATACACTAAGCCTGATGGACGTTTAGTGTTGTGTGGACAAGAAGAACGGGGATCCTG
GTTTTGATGAGTTGGACGTGTCCACCCTTTTACACCCTGAACGTGTGAGGGAGTAACAGG
GAGGTTTGAGATGTGTCAATGTTGCCTGTGAGGAGGTGTGTCTTTTTATAAAAGGAAGGT
GGTGTTTTAGAGAAAAGAGTCAAATATCTGAAATAGTTTGCATCCATCTAGTCGAGTCTT
TGGAATTGCATGCGATCAAGTGTTATAAATGCGCACACAAAGACATACACACACACACAC
ACCCATACACACAGACACAATTCCCTTAACATTTCAGGCTATTAAAACATCGATAGGAAT
AAAAAGTAATATTTT

I then installed an older version of hmmer2go (HMMER2GO-0.17.2) and ran the same test:

hmmer2go getorf -i AIRE_splitted_UTG_reads_ab_clean.fasta -o hmmer2go_older_version_AIRE_ab_fasta --verbose -n 8
========== Searching for ORFs with minimum length of 80.

========== 1 and 1 sequences in AIRE_splitted_UTG_reads_ab_clean.fasta.

========== 1 sequences processed with ORFs above 80.

========== 1.00 percent of sequences contain ORFs above 80.

That run completed without any issues. I compared the output files from both versions with diff and they are identical.
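For anyone who wants to repeat the diff comparison, here is a minimal sketch. The helper name `same_output` is hypothetical and not part of hmmer2go; it only wraps the standard `cmp` and `diff` tools.

```shell
# Sketch: report whether two getorf output files are byte-identical.
# same_output is a hypothetical helper name, not an hmmer2go command.
same_output() {
    if cmp -s "$1" "$2"; then
        echo "identical"
    else
        echo "different"
        # Show the first differing lines to help locate the change.
        diff "$1" "$2" | head -n 20
        return 1
    fi
}
```

With the file names passed to `-o` in the two runs, the call would be `same_output hmmer2go_test_AIRE_ab_fasta hmmer2go_older_version_AIRE_ab_fasta`.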

I've repeated the test on additional FASTA files with the older version of hmmer2go and those runs also completed without errors, so the bug appears to have been introduced somewhere between 0.17.2 and 0.17.7.

All the best,

Lorena

sestaton commented 6 years ago

Hi Lorena,

Thank you for the report and for providing the test. I'll take a look and get back to this as soon as I get a chance.

Evan

sestaton commented 6 years ago

Thanks again for the report. This was just a warning, by the way, and it would not have had any effect on the results. It was caused by a change that was not handled correctly.

This is fixed in the master branch and in the latest release (v0.17.8).