jtleek / svaseq

Analysis for svaseq paper

Apply surrogate variables learned from some tissues to another tissue #2

Closed qwangmsk closed 8 years ago

qwangmsk commented 8 years ago

Dear Jeff,

Sorry to bother you again. I have RNA-seq samples from two batches. Batch 1 contains tissues A, B, and C; batch 2 contains tissues A and B only. I want to examine the expression of some genes across all the samples. Is it possible to learn surrogate variables (or the batch effect) from tissues A and B (from both batches) and then apply them to tissue C to remove batch bias? If svaseq can do this, which function or procedure would you suggest? Thanks in advance.

Best, Qingguo

jtleek commented 8 years ago

Hi Qingguo

This is not exactly the case sva/svaseq is designed for. We have done some work on "predicting" the batch effects in new samples, but have shown that across known batches this can have limited effectiveness:

http://www.ncbi.nlm.nih.gov/pubmed/25332844

You would need to apply the fsva function to the log2(counts + 1) transformed data, then you could try to predict the batches for tissue C.
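
A minimal sketch of that fsva route, under stated assumptions: the counts are simulated, the sample sizes and object names (`train_counts`, `new_counts`, and so on) are hypothetical, and `n.sv = 1` is fixed only so the toy example is deterministic. It follows the train/predict pattern from the sva documentation and is not a tested recipe for this particular design.

```r
# Simulated stand-in for the design in this thread (all values hypothetical):
# training set = tissues A and B from both batches, new set = tissue C (batch 1).
set.seed(1)
n_genes <- 1000
train_pheno <- data.frame(
  tissue = factor(rep(c("A", "B"), times = 2, each = 4)),
  batch  = factor(rep(1:2, each = 8))
)
train_counts <- matrix(rpois(n_genes * 16, lambda = 50), nrow = n_genes)
# Inject an artificial batch effect into the first 200 genes of batch 2.
in_batch2 <- train_pheno$batch == 2
train_counts[1:200, in_batch2] <- train_counts[1:200, in_batch2] * 3L
new_counts <- matrix(rpois(n_genes * 4, lambda = 50), nrow = n_genes)

# fsva works on continuous data, so use log2(counts + 1) as suggested above.
train_dat <- log2(train_counts + 1)
new_dat   <- log2(new_counts + 1)

# Full model protects tissue; null model is intercept only.
train_mod  <- model.matrix(~ tissue, data = train_pheno)
train_mod0 <- model.matrix(~ 1, data = train_pheno)

if (requireNamespace("sva", quietly = TRUE)) {
  # Learn surrogate variables on the training samples (n.sv fixed for illustration).
  train_sv <- sva::sva(train_dat, train_mod, train_mod0, n.sv = 1)
  # Adjust the training data and predict the adjustment for the new samples.
  fsva_obj <- sva::fsva(train_dat, train_mod, train_sv, newdat = new_dat)
  # fsva_obj$db  : adjusted training data
  # fsva_obj$new : adjusted tissue C data
}
```

Whether the adjustment learned on A and B transfers to a very different tissue C is exactly the limitation noted above, so the output should be checked rather than trusted.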

A more direct analysis would just combine all of the batches/tissue types and apply svaseq to remove batches.

Best

Jeff

qwangmsk commented 8 years ago

Hi Jeff,

Thanks for your reply. Because tissues A, B, and C are very different, if I combine all of the batches/tissue types and run svaseq, is there a risk that tissue-specific signals will be cancelled out? If so, how can I avoid that?

Thanks again, Qingguo

jtleek commented 8 years ago

Hi Qingguo

Include the tissue variable as one of the "protected" variables in the mod/mod0 matrices when you run sva.
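
A minimal sketch of what protecting tissue could look like when all samples are combined, assuming tissue is the only variable to protect. The sample sheet, the simulated counts, and `n.sv = 1` are hypothetical and chosen only to make the example self-contained.

```r
# Hypothetical sample sheet matching the thread: batch 1 has tissues A, B, C;
# batch 2 has tissues A and B only (4 samples per group, made up).
pheno <- data.frame(
  tissue = factor(c(rep(c("A", "B", "C"), each = 4), rep(c("A", "B"), each = 4))),
  batch  = factor(c(rep(1, 12), rep(2, 8)))
)

# Simulated raw counts (genes x samples) with an artificial batch effect.
set.seed(1)
n_genes <- 1000
counts <- matrix(rpois(n_genes * nrow(pheno), lambda = 50), nrow = n_genes)
in_batch2 <- pheno$batch == 2
counts[1:200, in_batch2] <- counts[1:200, in_batch2] * 3L

# Full model: tissue is included, so its signal is protected from removal.
mod  <- model.matrix(~ tissue, data = pheno)
# Null model: intercept only, since no other adjustment variables are assumed.
mod0 <- model.matrix(~ 1, data = pheno)

if (requireNamespace("sva", quietly = TRUE)) {
  # svaseq takes the raw counts and applies log(counts + constant) internally.
  svobj <- sva::svaseq(counts, mod, mod0, n.sv = 1)
  # svobj$sv holds the estimated surrogate variables, one row per sample.
}
```

If there were a separate variable of interest (for example a treatment), tissue would instead act as an adjustment variable and appear in both `mod` and `mod0`.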

Best

Jeff

qwangmsk commented 8 years ago

OK. I will give it a try. Thanks so much!

Qingguo