Two Bigwig files per sample are generated representing each strand

nf-core / rnaseq

RNA sequencing analysis pipeline using STAR, RSEM, HISAT2 or Salmon with gene/isoform counts and extensive quality control.

https://nf-co.re/rnaseq

MIT License


Two Bigwig files per sample are generated representing each strand #1275

Open drimran87 opened 7 months ago

drimran87 commented 7 months ago

Description of feature

The current rnaseq pipeline generates two bigWig files per sample, one for each strand. It would be great if, like the chipseq pipeline, it generated only one bigWig file per sample. That would be helpful when comparing ChIP-seq and RNA-seq data in IGV.

MatthiasZepper commented 7 months ago

Conversely, the strand-specific coverage information is what most people need, so I think there are good reasons to keep it like it is.

Consider using the IGV functionality Overlay Data Tracks to combine your tracks when viewing, using WiggleTools to combine them upfront or creating coverage tracks afterwards from your BAM files. You can also directly view your BAM files in IGV.

drpatelh commented 6 months ago

Yep, I agree. Most users will want to know what the coverage looks like based on the strandedness of the gene itself. We could have an implementation in the pipeline that generates all 3 bigWig files, i.e. 1 for each strand and 1 combined. It would be useful to know what command we would run with WiggleTools to do this with the files currently created by the pipeline, and we could make it an opt-in parameter to run this step.
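As a rough sketch of what such a step might run, the snippet below builds the two commands: a WiggleTools `sum` written out as bedGraph, followed by UCSC `bedGraphToBigWig`. The helper name, the `.forward.bigWig` / `.reverse.bigWig` file names and the chromosome sizes file are assumptions for illustration, not something confirmed in this thread. WiggleTools infers the input format from the file suffix, so the inputs may need a suffix it recognizes.

```python
def combine_strand_commands(forward, reverse, chrom_sizes, out_prefix):
    """Build (not run) the commands that merge two strand bigWigs into one.

    Hypothetical helper: the argument names and output naming scheme
    are illustrative assumptions, not pipeline conventions.
    """
    bedgraph = f"{out_prefix}.combined.bedGraph"
    bigwig = f"{out_prefix}.combined.bigWig"

    # Step 1: per-position sum of both strand tracks, written as bedGraph.
    sum_cmd = ["wiggletools", "write_bg", bedgraph, "sum", str(forward), str(reverse)]

    # Step 2: convert the summed bedGraph back to bigWig (UCSC tool).
    convert_cmd = ["bedGraphToBigWig", bedgraph, str(chrom_sizes), bigwig]

    return [sum_cmd, convert_cmd]


commands = combine_strand_commands(
    "sample1.forward.bigWig",  # assumed file name
    "sample1.reverse.bigWig",  # assumed file name
    "genome.sizes",            # assumed chromosome sizes file
    "sample1",
)
for cmd in commands:
    print(" ".join(cmd))
```

Each command list could then be passed to `subprocess.run(cmd, check=True)`, provided both tools are installed and on the `PATH`.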

tdsone commented 5 months ago

@drimran87 in case you don't need the processing steps downstream of the read mapping, you can set the STAR option --outWigStrand Unstranded to collapse the wig files into one.

Reference: https://github.com/alexdobin/STAR/blob/master/doc/STARmanual.pdf
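For illustration, a minimal sketch of a standalone STAR call that produces a single unstranded signal track from an existing BAM file. It assumes STAR is run directly, outside the pipeline, on a coordinate-sorted BAM. The helper name and file names are hypothetical, while the STAR options are the ones documented in the manual linked above.

```python
def star_unstranded_signal_command(bam, out_prefix):
    """Build (not run) a STAR command that writes unstranded signal tracks.

    Hypothetical helper; assumes `bam` is coordinate-sorted.
    """
    return [
        "STAR",
        "--runMode", "inputAlignmentsFromBAM",  # generate signal from an existing BAM
        "--inputBAMfile", str(bam),
        "--outWigType", "bedGraph",             # signal output format
        "--outWigStrand", "Unstranded",         # collapse both strands into one track
        "--outFileNamePrefix", str(out_prefix),
    ]


# File names below are assumptions for illustration only.
print(" ".join(star_unstranded_signal_command("sample1.sorted.bam", "sample1_")))
```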