SchlossLab / Baxter_glne007Modeling_GenomeMed_2015

Microbiota-based model improves the sensitivity for detecting colonic lesions
MIT License

Zackular 2014 #4

Open naarkhoo opened 8 years ago

naarkhoo commented 8 years ago

I am not sure if this is the right platform to ask this question, but I was wondering how comparable the data from your lab (Zackular et al. 2014) are to this cohort. Thanks again for making your papers and code open.

pschloss commented 8 years ago

This is a great platform to ask this type of question! The Zackular data are a subset of the Baxter dataset.

naarkhoo commented 8 years ago

May I ask if you have tried to pool these two data sets, or to use one as an external validation set to test the performance of the predictive model?

pschloss commented 8 years ago

The Baxter dataset includes the Zackular dataset. Since the Zackular dataset was something of a random subset of the larger dataset and had about 20% of the samples, we opted for leave-one-out cross-validation. We felt that training on 80% of the data and testing on a single random subset containing the other 20% would have negatively biased the evaluation of the model.
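For readers unfamiliar with the approach, here is a minimal sketch of leave-one-out cross-validation: each sample is held out once, the model is fit on all remaining samples, and the held-out sample is predicted. Everything in it is an assumption for illustration only: the toy feature values, the `normal`/`lesion` labels, and the nearest-centroid classifier are stand-ins, not the data or the model used in the paper.

```python
def nearest_centroid_predict(train_X, train_y, x):
    """Predict the label whose class centroid is closest to x.

    Stand-in classifier for illustration; not the model from the paper.
    """
    centroids = {}
    for label in set(train_y):
        rows = [row for row, y in zip(train_X, train_y) if y == label]
        # Column-wise mean of all training rows with this label.
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def squared_distance(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    return min(centroids, key=lambda label: squared_distance(centroids[label], x))


def leave_one_out_accuracy(X, y):
    """Hold out each sample once, train on the rest, and score the prediction."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_centroid_predict(train_X, train_y, X[i]) == y[i]:
            correct += 1
    return correct / len(X)


# Hypothetical toy data: two features per sample, three samples per class.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3],
     [1.0, 0.9], [0.8, 1.1], [0.9, 1.0]]
y = ["normal", "normal", "normal", "lesion", "lesion", "lesion"]

loo_accuracy = leave_one_out_accuracy(X, y)
print(f"Leave-one-out accuracy: {loo_accuracy:.2f}")
```

Because every sample serves as a test case exactly once, the estimate does not depend on which single 20% subset happens to be held out.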

naarkhoo commented 8 years ago

I have made a PCoA of these two data sets to try to explain the source of variation between them. May I ask whether those samples were re-sequenced, or whether they came from two different sequencing runs?

Because I don't know which samples in the Baxter dataset are the Zackular samples, I was hoping the PCoA plot could help me see this overlap.
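One way to look for candidate overlaps without a plot is to compute a dissimilarity between every pair of samples across the two data sets and list the closest match for each. The sketch below uses Bray-Curtis dissimilarity, which is an assumption here (the thread does not say which distance the PCoA was built on), and the sample names and counts are hypothetical. A closest match is only a hint; the metadata is the reliable way to identify shared samples.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors of equal length."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(x + y for x, y in zip(a, b))
    # Two all-zero profiles are treated as identical.
    return numerator / denominator if denominator else 0.0


def closest_matches(set_a, set_b):
    """For each sample in set_a, return the name of the least dissimilar sample in set_b."""
    return {
        name_a: min(set_b, key=lambda name_b: bray_curtis(profile_a, set_b[name_b]))
        for name_a, profile_a in set_a.items()
    }


# Hypothetical OTU counts per sample (same OTU order in both data sets).
zackular = {"Z1": [10, 0, 5], "Z2": [0, 8, 2]}
baxter = {"B1": [9, 1, 5], "B2": [1, 7, 3], "B3": [4, 4, 4]}

matches = closest_matches(zackular, baxter)
print(matches)
```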

pschloss commented 8 years ago

Everything was resequenced.

nbaxter13 commented 8 years ago

The overlapping samples in the Baxter dataset are from the same patients' stool samples, but they were different aliquots of those stool samples. So they were re-extracted and re-sequenced.

naarkhoo commented 8 years ago

It is an interesting data set in its own right, and it shows which predictive taxa are potentially sensitive and subject to change.

pschloss commented 8 years ago

If you want the Zackular data on its own, you can download the data and metadata from here: http://mothur.org/MicrobiomeBiomarkerCRC/