nolanlab / citrus

Citrus Development Code
GNU General Public License v3.0

Running Citrus from script #117

Open cbligaard opened 6 years ago

cbligaard commented 6 years ago

Hi Citrus-team,

I would like to run Citrus from the command-line using an R script. I would like to cluster the data and then look for both differential abundance and expression in the same clusters. I have two conditions with 19 samples in each (basal vs. treatment).

I first set up a run using the GUI and started working from the generated runCitrus.R script. However, I am unsure about these lines:

# Make vector of conditions for analysis. If comparing two conditions, should be 
# two elements - first element is baseline condition and second is comparison condition.
conditions = colnames(fileList)[1]

Should it be like this when I have two conditions? Or should there be a difference between basal and treatment here? I have attached my current script. Any help is appreciated!

modified_runCitrus.txt

An additional question: My data is actually paired - is this something that can be included in the model? I know SAM has the 'Two class paired' response type built in already, so perhaps it would be a simple tweak for me to make?

Best, Christina

rbruggner commented 6 years ago

Hi Christina,

WRT Question 1 (Conditions): The configuration you have in your script sounds correct to me.

"Multiple conditions" (probably sloppy language on my part) refers in this context to the different measurement conditions you have for the same sample, not to the experimental endpoint in which you're looking for differences.

For example, say I have 30 patients enrolled in a study trying to identify cellular signatures associated with drug responsiveness. I collect samples from all 30 of those individuals pre-treatment, give them the drug, and then collect samples from them post-treatment. 10 of those 30 people respond to the drug and the other 20 do not. The experimental endpoint I'd define for Citrus is responder / non-responder status. The conditions I'd define for Citrus are "pre-treatment" and "post-treatment". Citrus would then look for associations between responder / non-responder status and the relative change in each sample's features between the conditions.
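The layout described in this example can be sketched in base R. This is only an illustration under assumptions: the file names, the column names `pre` and `post`, and the toy feature values are hypothetical, and the sketch does not call Citrus itself. Only the variable names `fileList`, `conditions`, and `labels` come from the script discussed in this thread, and a plain difference is used as a stand-in for the "relative change" between conditions.

```r
# Hypothetical layout for the 30-patient example: one row per patient,
# one column per measurement condition (file names are made up).
n_patients <- 30
fileList <- data.frame(
  pre  = sprintf("patient%02d_pre.fcs", seq_len(n_patients)),
  post = sprintf("patient%02d_post.fcs", seq_len(n_patients)),
  stringsAsFactors = FALSE
)

# Two conditions: baseline condition first, comparison condition second.
conditions <- colnames(fileList)[1:2]

# The experimental endpoint is a per-patient label, not a condition.
labels <- factor(c(rep("responder", 10), rep("non-responder", 20)))

# Toy stand-in for one cluster feature measured under each condition.
set.seed(1)
feature_pre  <- runif(n_patients)
feature_post <- runif(n_patients)

# The per-patient change between conditions is the quantity that would be
# tested for association with the endpoint (simple difference assumed here).
feature_change <- feature_post - feature_pre
```

With a single measurement condition, as in the original question, `conditions` would keep its one element and the basal vs. treatment grouping would be the endpoint.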

WRT Question 2 (Paired samples): Unfortunately, I did not build in support for paired samples, but hopefully it's an easy tweak, as you suggest. I'd recommend using the Citrus script to parse the data, do the clustering, and build the features, and then just plugging the feature matrix directly into SAM using its two-class paired method.
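A rough sketch of that hand-off, assuming the `samr` package is used for SAM. The feature matrix here is random toy data standing in for whatever Citrus produces, and the sample ordering (all basal rows first, then the treatment rows in the same subject order) and the helper name `run_paired_sam` are assumptions made for illustration. The paired coding of `y` (-k and +k for the two members of pair k) and the `"Two class paired"` response type are the ones `samr` documents.

```r
# Toy feature matrix standing in for Citrus output:
# rows = samples, columns = cluster features (names are made up).
n_pairs <- 19
n_features <- 500
set.seed(42)
features <- matrix(
  rnorm(2 * n_pairs * n_features),
  nrow = 2 * n_pairs,
  dimnames = list(NULL, paste0("feature", seq_len(n_features)))
)

# Assumed sample order: rows 1..19 are basal, rows 20..38 are treatment,
# and row i and row i + 19 come from the same subject.
subject <- rep(seq_len(n_pairs), times = 2)
is_treatment <- rep(c(FALSE, TRUE), each = n_pairs)

# samr's paired coding: -k for one member of pair k, +k for the other.
y <- ifelse(is_treatment, subject, -subject)

# samr expects features in rows and samples in columns.
sam_data <- list(
  x = t(features),
  y = y,
  geneid = colnames(features),
  genenames = colnames(features),
  logged2 = TRUE
)

# Hypothetical wrapper around the paired SAM call; requires the samr package.
run_paired_sam <- function(sam_data, nperms = 100) {
  if (!requireNamespace("samr", quietly = TRUE)) {
    stop("The 'samr' package is required for this step.")
  }
  samr::samr(sam_data, resp.type = "Two class paired", nperms = nperms)
}
```

Calling `run_paired_sam(sam_data)` would then return the fitted SAM object. Whether `logged2 = TRUE` is appropriate depends on how the features were scaled.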

LMK if that helps.

-R

cbligaard commented 6 years ago

Hi Robert,

Thank you for the prompt reply and the explanation of the conditions. I will keep my current configuration here since I don't have any responder status.

I have also run the paired-SAM like you suggested and it seems to have worked fine! I get a lot more differentially expressed/abundant clusters using the paired setup. Could one argue that the difference is larger, if you don't need the paired information to detect it? Or is significant just significant?

// Christina

SamGG commented 6 years ago

Hi, As you know, a paired design lets you capture more subtle differences because it removes inter-subject variation more efficiently. I think the word "difference" alone is not meaningful in statistical tests, because the difference is always judged relative to a dispersion in order to compute a p-value. This doesn't directly answer your questions, but maybe the following links from the Nature Methods "Points of Significance" column can. Best.
https://www.nature.com/articles/nmeth.2858/figures/3
https://www.nature.com/articles/nmeth.2858
https://www.nature.com/collections/qghhqm/pointsofsignificance
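The point about difference versus dispersion can be illustrated with a small base-R sketch. The data are made up and deterministic: 19 subjects with very different baselines and a small, consistent shift. A plain t-test is used here as a stand-in, not SAM itself.

```r
# Made-up data: 19 subjects whose baselines differ a lot,
# plus a small, consistent treatment shift of 0.5.
n <- 19
baseline <- seq(-3, 3, length.out = n)
wiggle <- rep(c(0.1, -0.1), length.out = n)  # small within-subject noise
basal <- baseline + wiggle
treatment <- baseline + 0.5 - wiggle

# Unpaired test: the shift is judged against the between-subject spread.
p_unpaired <- t.test(treatment, basal, paired = FALSE)$p.value

# Paired test: the shift is judged against the spread of per-subject differences.
p_paired <- t.test(treatment, basal, paired = TRUE)$p.value

# The mean difference is identical in both tests; only the dispersion changes.
mean_diff <- mean(treatment) - mean(basal)
sd_between <- sd(basal)
sd_within <- sd(treatment - basal)
```

Here `mean_diff` is the same number in both tests, yet `p_paired` is far smaller than `p_unpaired` because `sd_within` is much smaller than `sd_between`. So a cluster that only shows up in the paired analysis does not necessarily have a smaller difference; it may just sit on top of large subject-to-subject variation.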

cbligaard commented 5 years ago

Hi again,

I now have another study for which I am uncertain about how to configure my script. The design is rather complex. I have samples over time from ~20 patients treated with the same drug. For all patients I have samples before treatment and after a single dose, and for some I have samplings after two/three/four doses.

About half the patients respond to the treatment.

Now I wish to detect

Is it possible to account for this design in Citrus via a script? Right now my fileList contains the filenames, the labels are set to the time points and I suppose the conditions would be the clinical response - or should this be swapped?

Many thanks!

Best, Christina