linsalrob / PhiSpy

Prediction of prophages from bacterial genomes
MIT License
70 stars 20 forks

prophage fasta file #54

Open mujiezhang opened 3 years ago

mujiezhang commented 3 years ago

Hi, I am using PhiSpy to predict prophages in bacterial genomes, but the prophage FASTA file it writes has been masked with Ns. You say it is trivial to convert this format into separate contigs without the Ns, but more complex to go from separate contigs back to a single joined contig. How can I convert the masked file into separate contigs without the Ns? Or does PhiSpy.py have an option to do that?

I also have another question. I got the following error:

No bases were counted for orf {'start': 1570484, 'stop': 1570485, 'phmm': 0.0, 'peg': 'peg', 'is_phage': 0} from 1570484 to 1570485 This error is usually thrown with an exceptionally short ORF that is only a few bases. You should check this ORF and confirm it is real!

I checked the gbk file, and the gene looks like this:

     gene            join(1570484..1570485,1..994)
                     /locus_tag="SMAR_RS00005"
                     /old_locus_tag="Smar_0001"
                     /db_xref="GeneID:4907656"
     CDS             join(1570484..1570485,1..994)
                     /locus_tag="SMAR_RS00005"
                     /old_locus_tag="Smar_0001"
                     /inference="COORDINATES: similar to AA sequence:RefSeq:WP_013143107.1"
                     /note="Derived by automated computational analysis using gene prediction method: Protein Homology."
                     /codon_start=1
                     /transl_table=11
                     /product="TIGR00269 family protein"
                     /protein_id="WP_052833761.1"
                     /db_xref="GeneID:4907656"
                     /translation="MVNCSICGRPAVYVNRISGQAYCKKHFLEYFDKKVRRTIRKYKM
                     FSSREHIVVAVSGGKDSLSLLHYLYNLSKRVPGWKITALLIDEGIGGYRDITKKDFLR
                     VVNELGVNYKIASFKEYLGYTLDEIVRIGREKGLPYLPCSYCGVFRRYLLNKVARDLG
                     GTVLATAHNLDDVIQTYVMNIINNSWDKILRLAPVTGPLDHPKFVRRAKPFYEILEKE
                     TTLYSILNNLYPKFVECPYARFNIRWMIRRQLNELEEKYPGTKYSLLRSLLRIISILS
                     KHRDEIIQGEIKTCKVCGEPSAHEICRACLYRYELGIMREDERKIVEEVLGKKKK"

How can I solve this problem? Sorry for my ignorance, I am a newbie.
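For reference, here is a minimal sketch of both points, written in plain Python and not taken from PhiSpy itself. It assumes that every run of one or more `N`/`n` characters marks a boundary between prophage regions, and all function names (`read_fasta`, `split_masked_record`, `join_part_lengths`) are hypothetical helpers made up for illustration. The last helper just measures each part of a GenBank `join(...)` location. For the gene above, the first part is only 2 bases long, and it has the same start and stop as the ORF in the error message, which suggests (but does not prove) that the error comes from a gene wrapping around the origin of a circular chromosome.

```python
import re


def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first word of the header as the record name.
            name, chunks = (line[1:].split() or ["unnamed"])[0], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def split_masked_record(name, sequence):
    """Split an N-masked sequence into unmasked pieces.

    Assumption: any run of N/n separates two pieces. Each piece is named
    name_start_end, with 1-based inclusive coordinates in the masked input.
    """
    pieces = []
    for match in re.finditer(r"[^Nn]+", sequence):
        start, end = match.start() + 1, match.end()
        pieces.append((f"{name}_{start}_{end}", match.group()))
    return pieces


def join_part_lengths(location):
    """Return the length in bases of each start..stop part of a location."""
    return [
        int(stop) - int(start) + 1
        for start, stop in re.findall(r"(\d+)\.\.(\d+)", location)
    ]


if __name__ == "__main__":
    # Toy input; replace with open("your_prophage_file.fasta") for real data.
    demo = [">prophage_demo", "ACGTNNNNGG", "TTNNA"]
    for name, seq in read_fasta(demo):
        for piece_name, piece_seq in split_masked_record(name, seq):
            print(f">{piece_name}\n{piece_seq}")

    # The wrap-around gene from the gbk file: parts of 2 and 994 bases.
    print(join_part_lengths("join(1570484..1570485,1..994)"))
```

Writing the coordinates into each new header keeps the way back to the joined contig open, since every piece records where it sat in the masked sequence.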