filip-husnik / pseudofinder

Detection of pseudogene candidates in bacterial and archaeal genomes.
GNU General Public License v3.0

[prefix]_map.pdf does not appear to have input genes colored in blue #4

Closed Arkadiy-Garber closed 4 years ago

Arkadiy-Garber commented 5 years ago

Hey Filip,

I have been using this program recently, and it has been running pretty well. Thanks for this useful tool! One issue that I think you should be aware of is that the output "map.pdf" file is not formatted the way your tutorial says it should be. Hopefully the PDF is attached to this message so that you can take a look. There are no blue-colored input genes shown. However, there seem to be two rows of red-colored genes. Are these two types of pseudogenes, or is one of these rows supposed to be colored blue?

Thanks! Arkadiy

trial_pseudofinder_map.pdf

mitchso commented 5 years ago

Hi Arkadiy!

Thanks for opening this issue and #5 - these are actually both bugs that we were not aware of beforehand, so this is very useful to us!

For this particular issue, it's confusing because the inner ring of input genes depends on a simple genbank parsing function. If parsing your file is the issue, then an error should have been initially raised before the annotation began!

Still, I would start by investigating what the parsing actually looks like. You could either test it yourself by checking whether the following piece of code recognizes your entries:

from Bio import SeqIO
for record in SeqIO.parse(handle='your/genbank/file', format='genbank'):
    print(record)

or if you are able to send me a bit of your data, I can play around with it to see what's happening.
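
A minimal sketch of a slightly deeper check, in case `print(record)` only shows contig-level summaries: `SeqIO.parse` yields one record per GenBank entry, and the annotated genes sit in each record's `features` list. The helper name `summarize_features` is hypothetical (it is not part of pseudofinder or Biopython); it only assumes objects with an `.id` and a `.features` list whose items have a `.type`, as Biopython's `SeqRecord` and `SeqFeature` do.

```python
from collections import Counter


def summarize_features(records):
    """Return {record id: Counter of feature types} for an iterable of records.

    Hypothetical helper for illustration only. Each record is expected to
    look like a Biopython SeqRecord: an `.id` string plus a `.features`
    list whose items carry a `.type` string.
    """
    summary = {}
    for record in records:
        # feature.type is a string such as "source", "gene" or "CDS"
        summary[record.id] = Counter(feature.type for feature in record.features)
    return summary
```

With Biopython this could be called as `summarize_features(SeqIO.parse('your/genbank/file', 'genbank'))`. A record whose counter has no `CDS` or `gene` entries would suggest the gene features are not being picked up.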

As for the two different rows of red-colored genes, the outer ring should show pseudogenes located on the positive-sense strand, while the inner track shows those on the negative-sense strand. I will update the documentation to make this clearer.
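
To illustrate the strand split described above, here is a rough sketch rather than pseudofinder's actual plotting code. It assumes Biopython-style features, where `feature.location.strand` is `1` for the positive strand and `-1` for the negative strand; the function name `split_by_strand` is made up for this example.

```python
def split_by_strand(features):
    """Split features into (forward, reverse) lists using location.strand.

    Hypothetical helper: expects objects shaped like Biopython SeqFeature,
    i.e. each has `.location.strand` equal to 1, -1, or another value
    (such as None) when the strand is unknown.
    """
    forward, reverse = [], []
    for feature in features:
        strand = feature.location.strand
        if strand == 1:
            forward.append(feature)
        elif strand == -1:
            reverse.append(feature)
        # Any other strand value is left out of both lists.
    return forward, reverse
```

Under this assumption, the first list would correspond to the outer ring and the second to the inner track.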

Thank you again for raising the issues! This project is still a work in progress, and we intend to do more extensive testing and analysis before we are ready to deploy a stable release, but your usage in the meantime is appreciated.

Cheers, Mitch

Arkadiy-Garber commented 5 years ago

Hey Mitch,

Thanks for the speedy response! Glad to be of some help.

The bit of Biopython code that you posted ran without any errors. However, it does not look like the parser is identifying the gene features. Rather, it seems to be reading in whole contigs. I don't really have much experience using Biopython, so I am not sure if this is what the output should look like.

You are probably better placed to troubleshoot this, so I am attaching the GenBank file here.

Thanks, Arkadiy

prokka1.gbk.txt