nunofonseca / irap

integrated RNA-seq Analysis Pipeline
GNU General Public License v3.0

RSEM strandness #54

Closed ghost closed 6 years ago

ghost commented 6 years ago

Hello, I would like to know if it is possible to set this option found in the manual of RSEM:

--strandedness <none|forward|reverse> This option defines the strandedness of the RNA-Seq reads. It recognizes three values: 'none', 'forward', and 'reverse'. 'none' refers to non-strand-specific protocols. 'forward' means all (upstream) reads are derived from the forward strand. 'reverse' means all (upstream) reads are derived from the reverse strand. If 'forward'/'reverse' is set, the '--norc'/'--nofw' Bowtie/Bowtie 2 option will also be enabled to avoid aligning reads to the opposite strand. For Illumina TruSeq Stranded protocols, please use 'reverse'. (Default: 'none')

using iRAP. Thanks.

Best

nunofonseca commented 6 years ago

Hi! Thank you for the question.

You can pass --strandedness and other parameters to RSEM by setting rsem_params in the configuration file, as follows: rsem_params=--strandedness xxx

I just pushed a commit (currently in the devel branch, coming out in a new release in a few days) that sets the --strandedness parameter automatically based on the _strand information passed to iRAP in the configuration file.

Does the above answer your question? Cheers.

byb121 commented 6 years ago

Hi, a follow-up question: does this apply to other quantification tools, such as htseq? I think it also has a strandedness option. Will it use what's in the configuration file (in the current release)?

Thanks,

nunofonseca commented 6 years ago

Yes, iRAP will invoke htseq using the strandness information provided in the configuration file (or as the -s parameter in irap_single_lib). The strandness is also passed to the aligners and quantification tools. Cheers.
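For readers wondering how one library-level strand setting translates into per-tool options, here is a minimal sketch. It is not taken from iRAP's source; the function name `strand_flags` and the table layout are hypothetical. The flag values themselves are the documented ones: RSEM's `--strandedness none|forward|reverse` and htseq-count's `-s no|yes|reverse`.

```python
# Illustrative mapping only: NOT iRAP's actual implementation.
# Keys are the library strandedness; values are the per-tool arguments.
STRAND_FLAGS = {
    "none": {"rsem": ["--strandedness", "none"], "htseq": ["-s", "no"]},
    "forward": {"rsem": ["--strandedness", "forward"], "htseq": ["-s", "yes"]},
    "reverse": {"rsem": ["--strandedness", "reverse"], "htseq": ["-s", "reverse"]},
}


def strand_flags(tool, strand):
    """Return the command-line arguments a tool needs for a given strandedness."""
    try:
        # Copy the list so callers cannot mutate the shared table.
        return list(STRAND_FLAGS[strand][tool])
    except KeyError:
        raise ValueError(f"unsupported tool/strand: {tool!r}/{strand!r}") from None


if __name__ == "__main__":
    print(strand_flags("rsem", "reverse"))
    print(strand_flags("htseq", "reverse"))
```

Note that htseq-count's `-s yes` corresponds to a forward-stranded library, which is easy to mix up with the RSEM vocabulary.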

freekvh commented 6 years ago

Hi Nuno,

This question got me sweating a little bit because I mainly use the STAR + RSEM combination.

So I re-ran some of our data analysis, for this I added

##################################################################
# Overriding/changing the parameters of the quantification methods:
# _params=options
# Example
# htseq_params= -q

rsem_params=--strandedness reverse

##################################################################

to the config file, where there used to be only the global strand setting (according to the RSEM 1.3 help for rsem-calculate-expression, reverse is the recommended setting for TruSeq stranded kits).

However, the output values did not change (I'm using the original fastq file but writing to another output folder).

Would you expect that? It's very nice that RSEM will take the global parameter into account in later versions. I'm holding off on upgrading from 0.8.5.p2 until version 1 is pronounced stable.

Again my highest regards,

Freek

nunofonseca commented 6 years ago

Hi Freek,

I think that the --strandedness parameter in RSEM was added recently (after the release of iRAP 0.8.5p2), hence if you are using the RSEM bundled with iRAP (version 1.2.x) it will not work.

Furthermore, I suspect that RSEM uses the strandedness information when aligning with Bowtie. Since you are aligning with STAR, adding the strandedness option to RSEM may not have any effect either way. I'm just speculating here...

I should release iRAP 1 beta in the coming days.

Cheers.
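A minimal sketch of the version issue discussed above, for anyone scripting around RSEM directly. It assumes, based on this thread, that `--strandedness` is only available from RSEM 1.3.0 onward and falls back to the older `--forward-prob` option (1 for forward, 0 for reverse, 0.5 for unstranded) on earlier versions. The helper names `parse_version` and `rsem_strand_args` are hypothetical and not part of iRAP or RSEM.

```python
# Hypothetical helper: choose the RSEM strand option by RSEM version.
# Assumption (from the thread): --strandedness exists only in RSEM >= 1.3.0.
FORWARD_PROB = {"forward": "1", "reverse": "0", "none": "0.5"}


def parse_version(text):
    """Turn '1.2.31' or '1.3' into a comparable 3-tuple of ints."""
    parts = [int(p) for p in text.split(".")]
    parts += [0] * (3 - len(parts))  # pad '1.3' to (1, 3, 0)
    return tuple(parts[:3])


def rsem_strand_args(rsem_version, strand):
    """Return rsem-calculate-expression arguments for the given strandedness."""
    if strand not in FORWARD_PROB:
        raise ValueError(f"unknown strandedness: {strand!r}")
    if parse_version(rsem_version) >= (1, 3, 0):
        return ["--strandedness", strand]
    # Older releases: express the same information as a probability.
    return ["--forward-prob", FORWARD_PROB[strand]]


if __name__ == "__main__":
    print(rsem_strand_args("1.2.31", "reverse"))
    print(rsem_strand_args("1.3.0", "reverse"))
```

The version string must be fully numeric; a placeholder such as "1.2.x" would need to be resolved to the actual installed version first.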

freekvh commented 6 years ago

I was speculating the same thing indeed. Thanks for the swift reply. Good luck with the release, I'm looking forward to it!

nunofonseca commented 6 years ago

Hi @rpanero, I'm closing this issue. Please re-open it if the answer is unclear. Cheers.

freekvh commented 6 years ago

Hi Nuno,

My apologies, I realize this is a closed issue, however...

I added several options regarding rsem's strandedness to the config file (iRAP 0.8.5p2), among others:

rsem_params=--forward-prob 0
rsem_params="--forward-prob 0"
rsem-calculate-expression_params= --forward-prob 0

The options did not end up in the final logs for rsem-calculate-expression, which read the same no matter the config file entry:

rsem-calculate-expression --bam --estimate-rspd --calc-ci --no-bam-output --seed 12321 -p 8 --ci-memory 6000 --paired-end 0053_P2017BB13S10R_S14/star//0053_P2017BB13S10R_S14Lib.pe.hits.bam.trans.bam /home/genomics_scratch/iraptest/data/reference/homo_sapiens/Homo_sapiens.GRCh38.87rsem/rsemref 0053_P2017BB13S10R_S14/star/rsem//0053_P2017BB13S10R_S14Lib.pe
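A quick way to confirm whether an option reached the logged command is to tokenize the logged line and look for the flag. This is only a sketch; `has_option` is a hypothetical helper, and the command string in the example is a shortened stand-in for a real log line.

```python
import shlex


def has_option(command_line, option):
    """Return True if `option` appears as its own token in a logged command."""
    # shlex.split follows shell quoting rules, so quoted arguments stay intact.
    return option in shlex.split(command_line)


if __name__ == "__main__":
    # Shortened, illustrative command; paths are placeholders.
    logged = (
        "rsem-calculate-expression --bam --estimate-rspd --calc-ci "
        "--no-bam-output --seed 12321 -p 8 --paired-end in.bam ref out"
    )
    print(has_option(logged, "--forward-prob"))
    print(has_option(logged, "--calc-ci"))
```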

If I manually add the option --forward-prob 0 to the rsem-calculate-expression command, it works and the values produced are slightly different (though the difference looks pretty small). I judged this by comparing the output file 0053_P2017BB13S10R_S14Lib.pe.genes.results produced with and without the option.

What would be the correct way to get rsem-calculate-expression to take --forward-prob 0 into account? If you tell me to wait for iRAP 1.0, no problem, I'll wait and re-run my data when that time comes... but perhaps this information also helps to improve 1.0.
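The with/without comparison described here can be scripted. The sketch below assumes the standard tab-separated RSEM `*.genes.results` layout with a header row containing `gene_id` and `expected_count`; the function names and the tolerance are hypothetical choices, not part of RSEM or iRAP.

```python
import csv


def read_counts(path, column="expected_count"):
    """Load one numeric column of an RSEM genes.results file, keyed by gene_id."""
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return {row["gene_id"]: float(row[column]) for row in reader}


def changed_genes(path_a, path_b, column="expected_count", tol=1e-6):
    """Return {gene_id: (value_a, value_b)} for genes whose values differ."""
    a = read_counts(path_a, column)
    b = read_counts(path_b, column)
    diffs = {}
    # Only genes present in both files are compared.
    for gene in sorted(a.keys() & b.keys()):
        if abs(a[gene] - b[gene]) > tol:
            diffs[gene] = (a[gene], b[gene])
    return diffs
```

Calling `changed_genes("run_default.genes.results", "run_forward_prob0.genes.results")` (placeholder file names) lists the genes affected by the option; passing `column="TPM"` compares normalized values instead.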

Highest regards,

Freek.

nunofonseca commented 6 years ago

Hi Freek, thank you for the feedback. I just checked and the rsem_params option is working as expected in 1.0.0a. Also, as I pointed out above, if you provide the strand information for a given library it will be passed automatically to RSEM. I should release 1.0.0b at the beginning of next week, after fixing some glitches (mostly related to the analysis of single-cell data and report generation). Cheers.

CGPIT commented 6 years ago

Hi Nuno

Can you offer a timeline for when the above 1.0.0b will be released, please? We are very keen to use it.

Many thanks

Adam

nunofonseca commented 5 years ago

Hi Adam,

The new release will happen before the end of next week.

Thanks,

Nuno


CGPIT commented 5 years ago

Hi Nuno

Great news, many thanks

Adam

—————————————————— Adam Butler Cancer, Ageing and Somatic Mutation Bioinformatics
