WangLab-SCSIO / Prophage_Tracer

Prophage Tracer: precisely tracing prophages in prokaryotic genomes using overlapping split-read alignment
GNU General Public License v3.0

Phage Fasta? #3

Open leannmlindsey opened 1 year ago

leannmlindsey commented 1 year ago

Hello! Thank you for your previous help. I now have a set of prophages identified by Prophage_Tracer that I want to investigate further, and I would like to better understand the output files. I was hoping to find the FASTA sequences of the candidate phages somewhere in the output. The only FASTA file I see is *.SR.reads.fasta, and as far as I can tell it simply contains the sequence of each split read. Is that correct? If so, should I use the coordinates in the candidate phage output file to cut the corresponding sequences out of the contigs? Would that be the correct way to get the phage sequences?

WangLab-SCSIO commented 1 year ago

Hi, you can install the tool seqkit and use the following command to extract the candidate prophage sequence:

seqkit subseq -r attL_start:attR_end reference.fasta > prophage.fsa

attL_start and attR_end are the predicted prophage ends given in the output file. If you are going to deal with many prophages, you can print all the commands to a file and run that file in bash:

awk 'BEGIN{FS="\t";OFS="\t"} NR>1{print "seqkit subseq -r "$3":"$6" reference.fasta >"$1".fasta"}' strain1.prophage.out > jobfile
bash jobfile
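To see what the generated job file looks like before running it, the awk step can be tried on a small mock table. This is only a sketch: the two rows below are made-up values, and the column layout (name in column 1, attL_start in column 3, attR_end in column 6) is assumed from the awk command in the reply, not taken from the Prophage Tracer documentation. The other columns are placeholders.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for strain1.prophage.out: one header row plus two
# invented candidates. Only columns 1, 3 and 6 are meaningful here.
printf 'candidate\tc2\tattL_start\tc4\tc5\tattR_end\n'  > strain1.prophage.out
printf 'prophage_1\t.\t1200\t.\t.\t45300\n'            >> strain1.prophage.out
printf 'prophage_2\t.\t98000\t.\t.\t131500\n'          >> strain1.prophage.out

# Same idea as the awk one-liner in the reply: skip the header (NR>1)
# and write one seqkit command per candidate prophage.
awk 'BEGIN{FS="\t"} NR>1 {
    printf "seqkit subseq -r %s:%s reference.fasta > %s.fasta\n", $3, $6, $1
}' strain1.prophage.out > jobfile

# Inspect the commands; run them afterwards with: bash jobfile
cat jobfile
```

Checking the job file first makes it easy to spot swapped or empty coordinates. Note also that seqkit subseq -r applies the region to every record in the input, so if reference.fasta contains several contigs it is probably necessary to select the relevant contig first (for example with seqkit grep).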

leannmlindsey commented 1 year ago

Thank you so much for this quick reply. I appreciate it.