yyoshiaki / VIRTUS2

A bioinformatics pipeline for viral transcriptome detection and quantification considering splicing.

Question about VIRTUS2 Output #23

Open ppark123 opened 1 year ago

ppark123 commented 1 year ago

Hello, I ran VIRTUS2 on single-end RNA-seq data from this GEO accession: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE155925. The resulting txt file doesn't show the virus that each sample is infected with, and instead shows other viruses. Hepatitis C virus genotype 1 is found in almost all the samples I ran VIRTUS2 on, although none of the samples were infected with this virus. I'm not sure what's going on. Any help would be appreciated!

yyoshiaki commented 1 year ago

HCV is a typical false positive because its genome contains a polyA stretch, so polyA-containing reads can map to it. In addition to the number of mapped reads, the percentage of the viral genome covered by mapped reads is informative for discriminating false positives. GSE155925 consists of blood samples from donors with respiratory infections. Viremia is uncommon, so it is normal to detect no virus in blood samples even when the donor has a viral infection in some other organ.
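To illustrate why coverage helps, here is a minimal Python sketch. It assumes you have already extracted the aligned intervals (1-based, inclusive) for one virus from the alignments. The function names, the example positions, and the thresholds are hypothetical and not part of VIRTUS2; suitable cutoffs depend on your data.

```python
def percent_coverage(intervals, genome_length):
    """Percent of genome positions covered by at least one aligned read.

    intervals: iterable of (start, end) pairs, 1-based and inclusive.
    """
    covered = 0
    current_end = 0  # last genome position that has already been counted
    for start, end in sorted(intervals):
        start = max(start, current_end + 1)  # skip bases counted earlier
        end = min(end, genome_length)        # clip to the genome
        if end >= start:
            covered += end - start + 1
            current_end = end
    return 100.0 * covered / genome_length


def is_plausible_hit(num_reads, coverage_pct, min_reads=10, min_coverage=10.0):
    """Hypothetical filter: require both enough reads and enough breadth."""
    return num_reads >= min_reads and coverage_pct >= min_coverage


# Made-up example: 500 reads piled on one 51-base stretch of a
# hypothetical 9,646-base genome, as a polyA pileup might look.
pileup = [(9400, 9450)] * 500
pileup_cov = percent_coverage(pileup, 9646)
print(round(pileup_cov, 2), is_plausible_hit(len(pileup), pileup_cov))

# Made-up example: 60 reads tiled across the same genome.
spread = [(1 + i * 150, 100 + i * 150) for i in range(60)]
spread_cov = percent_coverage(spread, 9646)
print(round(spread_cov, 2), is_plausible_hit(len(spread), spread_cov))
```

The pileup has many more reads but covers well under 1% of the genome, while the tiled reads cover most of it. That is the pattern to look for when separating a false positive from a genuine hit.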

ppark123 commented 1 year ago

Thank you for the response. As a follow-up question, are there any specific types of samples (e.g., tissues) that VIRTUS2 works best with?

yyoshiaki commented 1 year ago

As positive controls, the following datasets may be useful.

gailrosen commented 8 months ago

Can you explain what virusSJ.out.tab is and what the columns are? What does SJ stand for?

yyoshiaki commented 7 months ago

Sorry for my late reply. The file is an output of STAR and contains splice junction (SJ) information. This information is not used by the VIRTUS pipeline. The columns are described in the STAR manual: https://github.com/alexdobin/STAR/blob/master/doc/STARmanual.pdf
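For reference, here is a small Python sketch for reading the file. The column meanings follow the SJ.out.tab description in the STAR manual; the dictionary keys, the function name, and the sample line are made up for illustration and are not part of STAR or VIRTUS.

```python
import csv
import io

# Column order of STAR's SJ.out.tab (tab-separated, no header line).
# The key names are illustrative labels chosen here, not STAR identifiers.
SJ_COLUMNS = [
    "chrom",            # reference sequence name
    "intron_start",     # first base of the intron (1-based)
    "intron_end",       # last base of the intron (1-based)
    "strand",           # 0 = undefined, 1 = +, 2 = -
    "motif",            # intron motif code, see MOTIFS
    "annotated",        # 0 = unannotated, 1 = annotated
    "unique_reads",     # uniquely mapping reads crossing the junction
    "multi_reads",      # multi-mapping reads crossing the junction
    "max_overhang",     # maximum spliced alignment overhang
]
STRANDS = {0: "undefined", 1: "+", 2: "-"}
MOTIFS = {0: "non-canonical", 1: "GT/AG", 2: "CT/AC", 3: "GC/AG",
          4: "CT/GC", 5: "AT/AC", 6: "GT/AT"}


def read_sj_tab(handle):
    """Parse an open SJ.out.tab file into a list of dicts."""
    records = []
    for fields in csv.reader(handle, delimiter="\t"):
        if not fields:
            continue  # skip blank lines
        record = dict(zip(SJ_COLUMNS, fields))
        for name in SJ_COLUMNS[1:]:
            record[name] = int(record[name])  # all but chrom are integers
        records.append(record)
    return records


# Made-up line in the SJ.out.tab layout, for demonstration only.
example = io.StringIO("NC_007605\t1000\t2000\t1\t1\t0\t12\t3\t45\n")
junctions = read_sj_tab(example)
for sj in junctions:
    print(sj["chrom"], sj["intron_start"], sj["intron_end"],
          STRANDS[sj["strand"]], MOTIFS[sj["motif"]], sj["unique_reads"])
```

To read a real file, pass `open("virusSJ.out.tab")` in place of the `io.StringIO` example.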