linsalrob / PhiSpy

Prediction of prophages from bacterial genomes
MIT License

Output GFF3 #17

Closed JFsanchezherrero closed 5 years ago

JFsanchezherrero commented 5 years ago

Dear @linsalrob,

Thank you very much for the recent updates and modifications.

A few months ago I forked your repo and added some new features, such as Python 3 support and GFF3 output. I have not used it during the last few weeks, but when I come back to PhiSpy in a few weeks I will start using your version, now that it is in Python 3, as it is much more up to date.

But I think it would be great to include a standardized format such as GFF3 among the output results.

I already did it for my forked version and it is available here. https://github.com/JFsanchezherrero/PhiSpy/blob/9c31a60d28f9035f9d1dba8f1bc475e816b288e7/PhiSpy_tools/evaluation.py#L541

Example:

| gff-version 3 | | | | | | | | |
| -- | -- | -- | -- | -- | -- | -- | -- | -- |
| NC_002737 | PhiSpy | prophage_region | 87661 | 119067 | . | . | . | ID=pp1 |
| NC_002737 | PhiSpy | attL | 87232 | 87245 | . | . | . | ID=pp1 |
| NC_002737 | PhiSpy | attR | 117100 | 117113 | . | . | . | ID=pp1 |
| NC_002737 | PhiSpy | prophage_region | 529631 | 605856 | . | . | . | ID=pp2 |
| NC_002737 | PhiSpy | attL | 529660 | 529672 | . | . | . | ID=pp2 |
| NC_002737 | PhiSpy | attR | 604721 | 604733 | . | . | . | ID=pp2 |
| NC_002737 | PhiSpy | prophage_region | 778642 | 840502 | . | . | . | ID=pp3 |
| NC_002737 | PhiSpy | attL | 777439 | 777452 | . | . | . | ID=pp3 |
| NC_002737 | PhiSpy | attR | 840272 | 840285 | . | . | . | ID=pp3 |
| NC_002737 | PhiSpy | prophage_region | 1191309 | 1241894 | . | . | . | ID=pp4 |
| NC_002737 | PhiSpy | attL | 1189734 | 1189747 | . | . | . | ID=pp4 |
| NC_002737 | PhiSpy | attR | 1243163 | 1243176 | . | . | . | ID=pp4 |
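For illustration only, here is a minimal sketch of how rows like these could be written as tab-separated GFF3 text. It is not the code from the fork linked above, nor from PhiSpy itself: the function name `format_gff3` and the dictionary layout of each prophage record are hypothetical, chosen just to show the nine-column layout and the `##gff-version 3` header line.

```python
def format_gff3(seqid, prophages, source="PhiSpy"):
    """Return GFF3 text for a list of prophage records.

    Each record is a dict (hypothetical layout) mapping a feature type
    ("prophage_region", "attL", "attR") to a (start, end) tuple.
    """
    lines = ["##gff-version 3"]
    for number, prophage in enumerate(prophages, start=1):
        # All features of one prophage share the same ID, as in the example rows.
        attributes = f"ID=pp{number}"
        for feature_type in ("prophage_region", "attL", "attR"):
            if feature_type not in prophage:
                continue  # e.g. no att sites were detected for this prophage
            start, end = prophage[feature_type]
            # Columns: seqid, source, type, start, end, score, strand, phase, attributes.
            # Score, strand and phase are left as "." to match the example rows.
            columns = [seqid, source, feature_type, str(start), str(end),
                       ".", ".", ".", attributes]
            lines.append("\t".join(columns))
    return "\n".join(lines) + "\n"


# First prophage from the example above.
example = [
    {
        "prophage_region": (87661, 119067),
        "attL": (87232, 87245),
        "attR": (117100, 117113),
    }
]
gff3_text = format_gff3("NC_002737", example)
print(gff3_text, end="")
```

Returning a string rather than writing straight to a file keeps the sketch easy to test; a caller could pass the result to any open file handle.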

Please take a look and include it in your code, if not as the default then behind a flag. I think it would give PhiSpy a new feature and much more versatility.

Thanks in advance

linsalrob commented 5 years ago

This should be handled in v3.4 prerelease that we have just created. We added a separate file to write GFF3 (part of our continuing modularization of the code). Please let us know if it doesn't work.