xunchen85 / VIcaller

A software to detect virome-wide integrations

Tophat with --no-coverage-search #8

Closed by stsergbg 4 years ago

stsergbg commented 4 years ago

Hi!

At one stage of a VIcaller run, TopHat suggests running it with --no-coverage-search to make the process much faster. The run does complete much faster when this option is added to the TopHat call in VIcaller.pl, so I wanted to ask whether you have tested this option before. In your opinion, does this option (or could it, in principle) prevent VIcaller from finding or validating some viral integrations?
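For reference, the change described above amounts to passing one extra flag to TopHat. Below is a minimal Python sketch of how such a command line could be assembled. The function name, index prefix, and file names are hypothetical, and this is not how VIcaller.pl builds its own call (VIcaller is a Perl script, and its TopHat invocation is not shown in this thread). `--no-coverage-search`, `-p`, and `-o` are standard TopHat options.

```python
import shlex


def build_tophat_command(index_prefix, reads, out_dir, threads=1,
                         no_coverage_search=True):
    """Assemble a TopHat argument list (illustrative helper, not VIcaller code).

    index_prefix: Bowtie index prefix for the human reference (hypothetical name).
    reads: list of FASTQ paths (one for single-end, two for paired-end).
    """
    cmd = ["tophat", "-p", str(threads), "-o", out_dir]
    if no_coverage_search:
        # Skip the coverage-based junction search, which TopHat itself
        # reports as a slow step.
        cmd.append("--no-coverage-search")
    cmd.append(index_prefix)
    cmd.extend(reads)
    return cmd


if __name__ == "__main__":
    cmd = build_tophat_command(
        "hg_index",                              # hypothetical index prefix
        ["sample_1.fastq", "sample_2.fastq"],    # hypothetical read files
        "tophat_out",
        threads=8,
    )
    # Print the command instead of running it, so the sketch works
    # without TopHat installed.
    print(shlex.join(cmd))
```

Passing the returned list to `subprocess.run(cmd, check=True)` would execute it, assuming `tophat` is on the `PATH`.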

Sergei

xunchen85 commented 4 years ago

Thanks for your question.

I haven't tried it yet! But in principle, it should not affect finding or validating viral integrations. If it really speeds things up, I may also update our tool to use --no-coverage-search.

Xun

stsergbg commented 4 years ago

Yes, for the couple of samples I've tested there is a significant improvement in speed. I also wanted to ask whether TopHat is crucial, or whether other aligners, e.g. STAR, might be just as good for the first alignment. As I understand from the code, TopHat is run with default parameters and the main workhorse of the tool is BWA.

By the way, when a BAM file is provided to the tool instead of FASTQ, as I understand it, the BAM is converted to FASTQ only to be aligned again with TopHat afterwards. Can this step be skipped altogether if I have a STAR-aligned BAM file (against the same human reference) right from the start?

xunchen85 commented 4 years ago

TopHat is only used to detect viral fusion transcripts from RNA-seq data.

You are correct: if the aligner can generate a BAM file with a CIGAR column, VIcaller should be able to take that BAM file as input.

If the input is a BAM file, VIcaller converts the discordant reads into FASTQ and then re-aligns those reads using BWA. TopHat is only used to align FASTQ reads against the human reference when the input is in FASTQ format. In other words, if you provide a STAR-aligned BAM file, VIcaller should not re-run TopHat against the human reference.
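The input handling described above can be sketched roughly as follows. This is an illustration in Python, not VIcaller's actual code: the function names and the file-suffix check are hypothetical, and the definition of "discordant" used here (a paired read not flagged as a proper pair) is an assumption, since the thread does not state VIcaller's exact criteria. The flag bits themselves come from the SAM specification.

```python
from pathlib import Path

# Recognized FASTQ suffixes (an assumption for this sketch).
FASTQ_SUFFIXES = (".fastq", ".fq", ".fastq.gz", ".fq.gz")

# SAM flag bits, as defined in the SAM specification.
FLAG_PAIRED = 0x1
FLAG_PROPER_PAIR = 0x2


def first_alignment_step(input_path):
    """Return a description of the first step for a given input file.

    Mirrors the behaviour described in the thread: BAM input skips TopHat,
    FASTQ input is aligned to the human reference with TopHat.
    """
    name = Path(input_path).name.lower()
    if name.endswith(".bam"):
        return "extract discordant reads to FASTQ and re-align them with BWA"
    if name.endswith(FASTQ_SUFFIXES):
        return "align reads to the human reference with TopHat"
    raise ValueError(f"unrecognized input type: {input_path}")


def is_discordant(flag):
    """ASSUMPTION: treat a read as discordant when it is paired but the
    aligner did not mark the pair as properly aligned."""
    return bool(flag & FLAG_PAIRED) and not (flag & FLAG_PROPER_PAIR)


if __name__ == "__main__":
    # Hypothetical file names.
    for path in ("sample.star.bam", "sample_1.fastq.gz"):
        print(path, "->", first_alignment_step(path))
```

With this dispatch, a STAR-aligned BAM never reaches the TopHat branch, which matches the statement that TopHat is not re-run for BAM input.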

Let me know if you find any issues with it.

Best,

stsergbg commented 4 years ago

Thank you for the explanation! I had misunderstood the procedure for BAM files as input. I have now tried running the tool on the already computed STAR BAMs, and it works like a charm.