PoonLab / OpenRDP

An open-source re-implementation of the RDP4 recombination detection program
GNU General Public License v3.0

Linear Genome Setting #71

Open paolij opened 6 months ago

paolij commented 6 months ago

Hello Art,

I successfully installed OpenRDP. How can I change the settings to analyze a linear genome (for viruses)? Thank you!

ArtPoon commented 5 months ago

Sorry it took me so long to get to this issue, we've been on a holiday break here. I'm not sure what you mean by linear genome - this should not be substantially different from analyzing an alignment of partial genome (i.e., gene) sequences. If you are asking about how to modify parameter settings for longer input sequences, I'd recommend referring to the original RDP documentation.

darrenpmartin commented 5 months ago

I think he is referring to the original RDP setting where sequences are analysed as being circular or linear. For circular sequences the window wraps at the ends of the sequences.
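A minimal sketch of what "the window wraps at the ends" could mean for a sliding-window scan. This is illustrative only and is not taken from the RDP or OpenRDP code; the function names `window_indices` and `window_starts` and the `circular` flag are hypothetical.

```python
def window_indices(start, width, seq_len, circular=False):
    """Return the alignment columns covered by a window beginning at `start`.

    Hypothetical helper: with circular=True the window wraps past the last
    column back to column 0; otherwise it is truncated at the alignment end.
    """
    if width > seq_len:
        raise ValueError("window is wider than the alignment")
    if circular:
        # Modulo arithmetic carries the window across the origin.
        return [(start + i) % seq_len for i in range(width)]
    return list(range(start, min(start + width, seq_len)))


def window_starts(seq_len, width, step=1, circular=False):
    """Return the start positions visited by the scan.

    Linear: stop once a full-width window no longer fits.
    Circular: every position is a valid start because windows may wrap.
    """
    if circular:
        return list(range(0, seq_len, step))
    return list(range(0, max(seq_len - width, 0) + 1, step))


if __name__ == "__main__":
    print(window_indices(8, 4, 10, circular=True))   # wraps across the origin
    print(window_indices(8, 4, 10, circular=False))  # truncated at the end
```

In the linear case the columns near each end are covered by fewer windows, whereas in the circular case every column is covered equally often.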

paolij commented 5 months ago

Hi Art,

No worries, hope you had a joyful holiday season!

Yes, I was wondering how to tell OpenRDP to treat genomes as linear, as opposed to the default circular, when analyzing viral genomes.

I was able to install OpenRDP on UF's supercomputer and write a script that ran successfully. However, I realized that the output options I require for my project are not yet available in OpenRDP (for example, having the program remove sections of sequences with evidence of recombination from my FASTA file). I will stick with the desktop RDP5 version for now.

Thank you for your help!

Best wishes,

Julia Paoli, MSc (she/her)

Mavian Lab

Emerging Pathogens Institute Department of Pathology College of Medicine University of Florida


ArtPoon commented 5 months ago

Yes, OpenRDP does not have the complete functionality of RDP5, such as removal of recombinant sections. My plan is to have one of my lab members work on this problem, but it is not covered by any of my research funding, so I cannot assign much priority to this task.

paolij commented 5 months ago

No worries! OpenRDP is a great initiative. Looking forward to its future capabilities!

ArtPoon commented 5 months ago

We don't currently provide any settings for circular versus linear genomes in the driver script. I'll have to dig into the main routines to check what the defaults really are. This might be an easy fix, or it might not.

ArtPoon commented 5 months ago

@WilliamZekaiWang can you please check which setting we are using by default?

ArtPoon commented 5 months ago

Reassigning to @wqyang421

ArtPoon commented 4 months ago

@wqyang421 can you please update this issue with your findings? Thanks!

wqyang421 commented 4 months ago

Sorry for the delay. I've reviewed the code, and it doesn't differentiate between circular and linear genomes in its current settings. I also searched the RDP methods, which focus on recombination rather than physical structure. According to the RDP4 Manual, if recombination breakpoint distributions are of interest, it would almost always be best to inform the program whether the sequences being analyzed are linear or circular. Thus, we might still need a parameter to handle that in the future.

darrenpmartin commented 4 months ago

Hi Mavis - without that setting, breakpoints will be incorrectly called at the sites in circular genomes where the genome sequence was linearised for deposit in GenBank or wherever. There will be a similar problem even when analysing linear or sub-full-length genomes: breakpoints will be called in pairs even though an actual recombination event in a linear genome or sub-genome fragment could involve just a single breakpoint, i.e. one breakpoint will be incorrectly called at the beginning or end of the alignment. These beginning/end/circularisation breakpoint calls need to be handled differently from the others, depending on whether the sequences are considered linear or circular. Darren
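One way the edge handling described above might look in code, as a rough sketch only. This is not how RDP or OpenRDP handles it: the function `resolve_edge_calls`, the region representation, and the rule of joining a region ending at the last column with one starting at column 0 are all assumptions made for illustration. A real implementation would also need to check that the joined regions share the same parental sequences.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: 0-based inclusive (start, end) columns.
# start > end means the region wraps across the origin of a circular genome.
Region = Tuple[int, int]


def resolve_edge_calls(regions: List[Region], aln_len: int,
                       circular: bool) -> List[Dict]:
    """Separate real breakpoints from calls that only mark the alignment edge."""
    last = aln_len - 1
    regions = list(regions)  # work on a copy

    if circular:
        # Assumption: a region running to the last column and one starting at
        # column 0 are a single region split by the linearisation site.
        tails = [r for r in regions if r[1] == last and r[0] > 0]
        heads = [r for r in regions if r[0] == 0 and r[1] < last]
        if tails and heads:
            tail, head = tails[0], heads[0]
            regions.remove(tail)
            regions.remove(head)
            regions.append((tail[0], head[1]))  # wrapped region

    resolved = []
    for start, end in regions:
        # A coordinate on the first or last column is an edge of the
        # alignment, not evidence of a breakpoint at that site.
        breakpoints = [p for p in (start, end) if p not in (0, last)]
        resolved.append({"region": (start, end), "breakpoints": breakpoints})
    return resolved


if __name__ == "__main__":
    calls = [(0, 120), (900, 999)]
    print(resolve_edge_calls(calls, 1000, circular=False))
    print(resolve_edge_calls(calls, 1000, circular=True))
```

With the same two calls, the linear interpretation keeps one breakpoint per event, while the circular interpretation yields one event with two breakpoints and no call at the linearisation site.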
