gagneurlab / drop

Pipeline to find aberrant events in RNA-Seq data, useful for diagnosis of rare disorders
MIT License
128 stars, 43 forks

QUESTION: specifying samples for sampleExclusionMask in OUTRIDER #553

Closed: Nicholas-Kron closed this issue 3 weeks ago

Nicholas-Kron commented 1 month ago

Hi there!

Thank you for making this convenient pipeline. I am planning to run it soon and have a question about setting OUTRIDER parameters that I could not find answered in the manual or in any existing issue. If I have trios in my dataset, is there a way to specify these samples in the config or annotation table so that they are excluded from the autoencoder fit?

From the OUTRIDER vignette:

4.4.1 Excluding samples from the autoencoder fit

Since OUTRIDER expects that each sample within the population is independent of all others, replicates could mask effects specific to this sample. This is also true if trios are present in the data, where the parents can be seen as biological replicates. Here, we recommend to exclude the sample of interest or the replicates from the fitting. Later on, for all samples P-values are calculated. In this rare disease data set we know that two samples (MUC1344 and MUC1365) have the same defect. To exclude one or both of them, we can use the sampleExclusionMask function.

Thank you!
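The masking idea in the quoted vignette passage can be sketched roughly as follows. This is only an illustration in plain Python, not OUTRIDER or DROP code: the `Sample` record, the `family_id` and `role` fields, and the `exclusion_mask` helper are all hypothetical names made up for the example. It shows one way to derive a per-sample true/false vector from a trio annotation, where true means "leave this sample out of the fit".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    sample_id: str
    family_id: str  # hypothetical annotation column, not a DROP field
    role: str       # hypothetical: "proband", "mother", or "father"


def exclusion_mask(samples, excluded_roles=("mother", "father")):
    """Return one boolean per sample; True marks a sample to leave out of the fit."""
    return [s.role in excluded_roles for s in samples]


# Toy cohort: one complete trio plus one unrelated proband.
samples = [
    Sample("S1", "FAM1", "proband"),
    Sample("S2", "FAM1", "mother"),
    Sample("S3", "FAM1", "father"),
    Sample("S4", "FAM2", "proband"),
]

mask = exclusion_mask(samples)
excluded_ids = [s.sample_id for s, m in zip(samples, mask) if m]
print(excluded_ids)
```

In R, a logical vector of this shape is what would be assigned through the sampleExclusionMask function named in the vignette. Whether to mask the parents or the sample of interest is a judgment call that the vignette leaves open.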

vyepez88 commented 3 weeks ago

Hi, thanks for using DROP! Indeed, we made the sampleExclusionMask function for those kinds of situations. Recently, however, we also received trio data and saw that the fit for the masked samples is not optimal and leads to too many outliers. We are working on a dedicated solution for trios. In the meantime, if your cohort is large enough, you should be able to run it as a whole anyway.

Nicholas-Kron commented 3 weeks ago

Ok sounds good. Thank you!