Closed MarinaSci closed 1 year ago
Hi Marina,
thanks for your comment and the suggestion. To propose a specific solution, I need to double-check whether I understand everything correctly.
Are you proposing that, e.g., in the case you had a ref file with sequences chr1 and chr2 and a BAM file from a sample called smp, you would like to rename the seqs from chr1 to smp.1 and chr2 to smp.2, in order to simplify the subsequent analysis?
Dear Karel,
Thank you very much for getting back to me so swiftly and for taking on my recommendation. Probably to rename the seqs from chr1 to smp.1 and chr2 to smp.1.
I work with environmental/faecal samples and can have multiple infections present in a sample. In my references I have multiple genomes (nuclear or mitogenomes); let's say multiple chrs. So for a given sample that has more than one parasite present, it would be fantastic to get chr1 renamed to smp.1 and chr2 to smp.1.
Does it make sense? Again, very grateful for even considering such a tool!
Best regards, Marina
--
Best wishes,
Marina
Marina Papaiakovou, PhD candidate
Harding Distinguished Postgraduate Scholar
Department of Veterinary Medicine
University of Cambridge, Cambridge, UK
(she/her)
--
*People have different working patterns; please don’t feel obliged to act on this email outside of your own normal working hours *
In this case, the most straightforward solution would be to post-process the outputs from Ococo.
Unfortunately, it seems that the -F parameter is unable to redirect the FASTA output to the standard output (stdout) (I have no idea why I didn't implement this; I probably focused mainly on the VCF output).
So the way to go is:
./ococo -i test.bam -f test.fa -x ococo64 -F output.fa
seqtk seq output.fa | perl -pe 's/>chr/>smp./g'
or seqtk seq output.fa | perl -pe 's/>/>smp1./g'
(depends on how exactly you want to name the sequences)
Thank you for the guidance, Karel!! Very helpful. Best wishes, Marina
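The post-processing step suggested above could also be wrapped in a small shell function. This is only a sketch under stated assumptions: the function name `rename_fasta_headers` and the `sample.contig` naming scheme are hypothetical and not part of Ococo, and it uses plain `awk` in place of the `seqtk`/`perl` pipeline.

```shell
# Hypothetical helper (not part of Ococo): prefix every FASTA header
# with a sample name, so that ">chr1" becomes ">smp.chr1".
# Usage: rename_fasta_headers SAMPLE INPUT.fa > OUTPUT.fa
rename_fasta_headers() {
    # $1 = sample name, $2 = FASTA file written by Ococo via -F
    awk -v s="$1" '
        /^>/ { print ">" s "." substr($0, 2); next }  # header line: insert the sample prefix
        { print }                                     # sequence line: pass through unchanged
    ' "$2"
}
```

After storing the consensus on disk with `-F output.fa` as described, one would call `rename_fasta_headers smp output.fa > smp.fa`. Keeping the original contig name after the prefix is a design choice that keeps the headers unique when the reference holds several sequences.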
You are welcome!
I'll close this ticket for now as this won't be implemented as a separate feature.
I've also opened a ticket (#39) about possibly redirecting the consensus to stdout in the future.
Hello - great development and thank you, very useful! Not sure how timely my comment is or how active this section is... However, I will try! One thought I had: when you have multiple BAM files (= multiple samples) from which you want to extract the same consensus reference (for subsequent phylogenetic analysis etc.), it would be best if the Ococo output file had the sample or BAM name on the first line after '>', as opposed to the name of the FASTA reference it came from. I hope that makes sense... Would that be a quick fix, do you think?
Thank you!!
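For the multi-sample case described in this request, a rough sketch of the idea follows. It assumes one consensus FASTA per sample already exists on disk and that each file is named after its sample (e.g. `sampleA.fa`); the function name `merge_consensus_by_sample` is hypothetical and not an Ococo feature.

```shell
# Hypothetical helper (not part of Ococo): merge per-sample consensus FASTA
# files into one stream, naming each record after its sample.
# The sample name is taken from the file name without the ".fa" extension.
# Usage: merge_consensus_by_sample sampleA.fa sampleB.fa > all_samples.fa
merge_consensus_by_sample() {
    for fasta in "$@"; do
        sample=$(basename "$fasta" .fa)
        # Rewrite header lines only; sequence lines pass through unchanged.
        awk -v s="$sample" '
            /^>/ { print ">" s "." substr($0, 2); next }
            { print }
        ' "$fasta"
    done
}
```

The combined file then carries headers such as `>sampleA.chr1` and `>sampleB.chr1`, which keeps records from different samples distinguishable in a downstream alignment or tree.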
featurerequest