AlexsLemonade / refinebio-examples

Example workflows for refine.bio data
https://www.refine.bio

Update analysis example: Switch WGCNA dataset to something that doesn't have technical replicates #373

Closed cansavvy closed 3 years ago

cansavvy commented 3 years ago

Background

https://github.com/AlexsLemonade/refinebio-examples/pull/364#issuecomment-732148897 There are some technical replicates in SRP133573.

Problem

We could deal with replicates by collapsing them, but I think this example is already pretty long and complicated as it is (even though it is an advanced topics example). I think we can switch this out for a dataset that is less complicated and then deal with the collapsing replicates issue separately.

What potential "gotchas" do we know of?

The dataset should be sufficiently large (more than 15 samples) but not so large that someone couldn't run it locally. For reference

What are the recommended next steps?

Step 0) After #363 and #364 are merged, this can be addressed. (Easier to take it one step at a time.)
Step 1) Find a suitable replacement dataset.
Step 2) Try running it in the notebook. If there's not an R^2 above 0.80, then that dataset is probably a no.
Step 3) Change the module explorations and see how the plots look.
Step 4) If that dataset otherwise looks good, update all the wording and dataset descriptions.
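
Steps 1 and 2 amount to a quick go/no-go screen on each candidate dataset. A minimal sketch of that screen in Python follows, assuming the R^2 in Step 2 is the scale-free topology fit that WGCNA reports per soft-threshold power (for example from `pickSoftThreshold`). The function names and the example numbers are hypothetical and are not taken from the notebook. The thread gives no upper bound for "too large to run locally", so that is not checked here.

```python
from typing import Dict, Optional, Tuple

MIN_SAMPLES = 15   # size floor from the "gotchas" section (dataset must exceed this)
R2_CUTOFF = 0.80   # cutoff named in Step 2


def pick_power(fit_by_power: Dict[int, float],
               cutoff: float = R2_CUTOFF) -> Optional[int]:
    """Return the lowest power whose R^2 is above the cutoff, or None if none is."""
    passing = [power for power, r2 in fit_by_power.items() if r2 > cutoff]
    return min(passing) if passing else None


def screen_dataset(n_samples: int,
                   fit_by_power: Dict[int, float]) -> Tuple[bool, str]:
    """Apply the size check first, then the R^2 check from Step 2."""
    if n_samples <= MIN_SAMPLES:
        return False, f"only {n_samples} samples; need more than {MIN_SAMPLES}"
    power = pick_power(fit_by_power)
    if power is None:
        return False, f"no power reaches R^2 above {R2_CUTOFF:.2f}"
    return True, f"candidate; lowest passing power is {power}"


# Made-up fit values, for illustration only (assumption, not real results).
example_fit = {2: 0.35, 6: 0.78, 8: 0.84, 10: 0.88}
print(screen_dataset(62, example_fit))
```

A dataset that passes this screen would still need the manual checks in Steps 3 and 4.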

cansavvy commented 3 years ago

This dataset has a two-time-point variable that seems reasonable to use for our differential expression step. It also has 62 samples, which should be plenty: https://www.refine.bio/experiments/SRP140558/acute-viral-bronchiolitis-pbmc

It's a bit metadata poor otherwise, but that's going to be the case for a lot of the RNA-seq datasets.

cansavvy commented 3 years ago

Another dataset with two time points: https://www.refine.bio/experiments/SRP132018/in-vitro-stimulation-of-healthy-donor-blood-with-il-3-cytokine

It has more metadata labels than that previous dataset but still has 56 samples.

I looked at some more datasets, but this one seems like it should be fine so I'm going to give it a whirl.

Edit: It looks like it's a 2x2 model, with two time points and treatment/control. So never mind. Will try https://www.refine.bio/experiments/SRP140558/acute-viral-bronchiolitis-pbmc now.

cansavvy commented 3 years ago

This has been wrapped up by #379