AuslanderNoam / virnatrap

MIT License

Problem with processing scripts #3

Open jingquanlim opened 1 year ago

jingquanlim commented 1 year ago

Hi Authors,

I tried to post-process the example output with "run_blast_os_par.py", but it failed. Any advice? I see that the original paper is now online in Nature Comms, so I hope more post-processing scripts will be made available to users.

Btw, I read the code in "align_extract_reads_script.sh" and understood that it takes an aligned BAM from an RNA-seq aligner as input and then realigns the unmapped reads to the reference genome (WGS) using bowtie2. Wouldn't this weed out some of the HERVs that we intend to find with virnatrap in the first place? Additionally, HERVs are included in the reference transcriptome of some RNA-seq analysis pipelines, so some viral reads would be marked as mapped and would never reach virnatrap's assembly step. Am I overthinking this, or do we need to tweak those preprocessing scripts further? Thanks!

-JQ Lim
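
For readers following along, the preprocessing described in this comment can be sketched as two shell commands built from Python. This is only an illustration of the poster's reading of "align_extract_reads_script.sh", not the authors' script: the helper name `build_unmapped_filter_commands`, the file names, and the single-end assumption are all hypothetical. The tools and flags themselves (`samtools fastq -f 4`, `bowtie2 -x/-U/--un/-S`) are standard.

```python
import shlex


def build_unmapped_filter_commands(bam, bt2_index, out_fastq,
                                   tmp_fastq="unmapped.fastq"):
    """Return two shell command strings (hypothetical helper, single-end reads).

    1. Extract reads flagged as unmapped (SAM flag 4) from the RNA-seq BAM.
    2. Realign them to a human genome bowtie2 index and keep only the reads
       that still fail to align (--un); the SAM output is discarded.
    """
    extract = (
        shlex.join(["samtools", "fastq", "-f", "4", bam])
        + " > " + shlex.quote(tmp_fastq)
    )
    realign = shlex.join([
        "bowtie2", "-x", bt2_index,
        "-U", tmp_fastq,          # unpaired input reads
        "--un", out_fastq,        # reads that did not align to the genome
        "-S", "/dev/null",        # alignments themselves are not needed
    ])
    return extract, realign


if __name__ == "__main__":
    # File and index names below are placeholders, not taken from the repository.
    for cmd in build_unmapped_filter_commands(
            "sample.bam", "human_genome_index", "candidates.fastq"):
        print(cmd)
```

Under this reading, any read that aligns in either step (including reads from HERV loci annotated in the reference) never reaches `candidates.fastq`, which is the concern raised above.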

AuslanderNoam commented 1 year ago

Hello,

The pre- and post-processing scripts are not part of the viRNAtrap package. While we provide all the scripts that were used in this study, we do not provide support for the pre/post-processing steps. However, I can try to help you debug: what errors do you see, and do you have all the dependencies installed? Do you have BLAST installed and the paths set for run_blast_os_par.py?
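
A quick way to answer the dependency question is to check whether the BLAST+ executables are visible on `PATH`. The sketch below is not part of viRNAtrap; `missing_tools` is a hypothetical helper, and the tool list (`blastn`, `makeblastdb`) is an assumption about what run_blast_os_par.py needs, so check the script for the commands it actually calls.

```python
import shutil


def missing_tools(names):
    """Return the executables from `names` that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # Assumed tool list; the script may call different BLAST+ programs.
    required = ["blastn", "makeblastdb"]
    missing = missing_tools(required)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All listed tools were found on PATH.")
```

If a tool is reported missing even though BLAST is installed, the install directory is probably not on `PATH`, or the script expects a hard-coded path that needs editing.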

As for the HERVs: many of them are not filtered out by this step, as we saw with viRNAtrap and explained in the manuscript. However, highly conserved HERV sequences may be an issue; please see the acknowledged limitation: "Importantly, the high mutation rate of HERV prohibits most HERV sequences from aligning to the human genome in pre-processing, however, in rare cases, HERV regions that are conserved would not be identified by this approach."

In reply to jingquanlim's message of Tuesday, March 7, 2023 at 3:37 AM, subject "[AuslanderLab/virnatrap] Problem with processing scripts (Issue #3)".
