eead-csic-compbio / get_homologues

GET_HOMOLOGUES: a versatile software package for pan-genome analysis

Trouble with only some .gbk files when clustering #65

Closed. TommyH-Tran closed this issue 4 years ago

TommyH-Tran commented 4 years ago

I am trying to run your software on a directory of 28 GenBank files. However, when I run the BDBH clustering algorithm like this:

./get_homologues.pl -d /Users/tommytran/Desktop/Dpig_Pangenomics/genomes_gb -n 12

I get the following error on some of the .gbk files: [screenshot: Screen Shot 2020-08-24 at 12 54 48 PM]

brunocontrerasmoreira commented 4 years ago

Hi Tommy, thanks for sharing your files. After checking your input, I can see that some of those gbk files lack the section with the actual DNA sequence at the end; they list a series of contig names instead, which is of no use to get_homologues. You can solve this by obtaining the corresponding full gbk files, which include the sequence. This is described in the manual at http://eead-csic-compbio.github.io/get_homologues/manual/manual.html#SECTION00042000000000000000 (see "Complete record" in Figure 1). Hope this helps, Bruno
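
For anyone hitting the same problem, here is a minimal sketch of how one might find the affected files before running get_homologues. It is not part of get_homologues. It assumes the standard GenBank flat-file layout, where each record starts with a LOCUS line, the sequence lines follow an ORIGIN line, and the record ends with `//`. The function names `records_without_sequence` and `scan_directory` are made up for this example.

```python
from pathlib import Path


def records_without_sequence(gbk_text):
    """Return the LOCUS names of records that have no sequence lines after ORIGIN.

    Assumption: standard GenBank flat-file layout (LOCUS ... ORIGIN ... //).
    """
    missing = []
    locus = None        # name of the record currently being read
    in_origin = False   # True once the ORIGIN line of this record was seen
    has_seq = False     # True once a non-empty line followed ORIGIN
    for line in gbk_text.splitlines():
        if line.startswith("LOCUS"):
            parts = line.split()
            locus = parts[1] if len(parts) > 1 else "(unnamed)"
            in_origin = False
            has_seq = False
        elif line.startswith("ORIGIN"):
            in_origin = True
        elif line.startswith("//"):
            # End of record: report it if no sequence lines were found.
            if locus is not None and not has_seq:
                missing.append(locus)
            locus = None
            in_origin = False
        elif in_origin and line.strip():
            has_seq = True
    # Handle a final record that is not terminated by "//".
    if locus is not None and not has_seq:
        missing.append(locus)
    return missing


def scan_directory(directory, pattern="*.gbk"):
    """Map each file name in `directory` to its records lacking sequence.

    Files in which every record has sequence are left out of the result.
    """
    report = {}
    for path in sorted(Path(directory).glob(pattern)):
        text = path.read_text(errors="replace")
        missing = records_without_sequence(text)
        if missing:
            report[path.name] = missing
    return report
```

Calling `scan_directory` on the directory passed to `-d` (here `/Users/tommytran/Desktop/Dpig_Pangenomics/genomes_gb`) would return a dictionary of the files that need to be downloaded again as full records. An empty dictionary would suggest that every record carries its sequence.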
