Closed (TdzBAS closed this issue 9 months ago)
Hi @TdzBAS ,
I agree that reading all at once is the way to go, and you can do that in processNanostringData by including each folder name in a character vector:
dat <- processNanostringData(nsFiles = c("path/to/folder", "path/to/another/folder"), ...)
By default, it will just read in all of the RCC files without considering the batches, so if you want to handle batch effects in normalization/QC, you should include a batch column in your metafile (e.g. Batch = 1, 1, 1, 1, 2, 2, 2, ...).
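As a rough sketch of what such a metafile could look like, the base R snippet below builds a sample table with one Batch value per source folder. The file names, the column names (RCC, Batch), and the output path are placeholders of mine, not something NanoTube prescribes, so adapt them to whatever your processNanostringData call expects.

```r
# Hypothetical RCC file names, grouped by the folder (= batch) they came from.
# With real data you might build this with list.files(folder, pattern = "\\.RCC$").
rcc_by_folder <- list(
  "path/to/folder"         = c("sample01.RCC", "sample02.RCC", "sample03.RCC"),
  "path/to/another/folder" = c("sample04.RCC", "sample05.RCC")
)

# One row per RCC file; Batch is the index of the folder the file came from
meta <- data.frame(
  RCC   = unlist(rcc_by_folder, use.names = FALSE),
  Batch = rep(seq_along(rcc_by_folder), times = lengths(rcc_by_folder))
)

# Write the table out as a CSV metafile (temporary location for this sketch)
metafile <- file.path(tempdir(), "metafile.csv")
write.csv(meta, metafile, row.names = FALSE)
```

Here the first three samples get Batch 1 and the last two get Batch 2. You would add your sample group column and any other annotations to the same table.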
Good luck! Let me know if you have more questions. -Caleb
Hi @calebclass,
thanks! But I am still curious how (or whether) batch correction takes place if I only include a batch column in the metafile. I included it, but could still see immense batch effects in my PCA plot. Only after using ComBat was the batch correction quite successful. So how can NanoTube deal with it?
Best, T
Hi @TdzBAS ,
The batch column itself doesn't do anything automatically, but it gives you a few options for handling batch effects:
1. Include batch as a covariate in the design matrix for the limma analysis:
# Design matrix including sample group and batch
design <- model.matrix(~group + batch)
limmaResults2 <- runLimmaAnalysis(dat, design = design)
2. If your data includes technical replicates across batches, normalization = "RUVIII" might work the best. See the example in the Normalization section of the vignette.
3. normalization = "RUVg" might work best without explicitly using your Batch column: if your first principal component captures the batch effect, you can set n.unwanted = 1 and RUVg.drop = 0 to remove that PC from the data.
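To make the design-matrix approach and the "first principal component" check more concrete, here is a small base R sketch on simulated data. Everything in it (the toy sample table, the simulated expression matrix, the size of the batch shift) is made up for illustration and does not use NanoTube itself.

```r
# Toy sample annotation (hypothetical): two groups measured across two batches
meta <- data.frame(
  group = factor(c("ctrl", "ctrl", "trt", "trt", "ctrl", "ctrl", "trt", "trt")),
  batch = factor(c(1, 1, 1, 1, 2, 2, 2, 2))
)

# Design matrix with batch as a covariate next to the group effect
design <- model.matrix(~ group + batch, data = meta)
colnames(design)  # "(Intercept)" "grouptrt" "batch2"

# Simulated log-expression: 50 genes x 8 samples, with an artificial
# shift added to every gene in batch 2
set.seed(1)
expr <- matrix(rnorm(50 * 8), nrow = 50)
expr[, meta$batch == 2] <- expr[, meta$batch == 2] + 3

# PCA with samples as rows, then compare mean PC1 scores per batch
pca <- prcomp(t(expr))
pc1_by_batch <- tapply(pca$x[, 1], meta$batch, mean)
pc1_by_batch
```

If the two batch means of PC1 sit far apart with opposite signs, as they do in this simulation, the first component is mostly capturing batch. That is the situation in which removing one unwanted factor is a reasonable thing to try.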
Good luck! Let me know if you have more questions.
-Caleb
Hi @calebclass,
thanks for this nice response! This issue can be closed. Just one suggestion for enhancement: would it be possible to extend the input data format to include .xls files, in addition to the current support for .txt and .csv? This could potentially streamline certain processes. Thank you!
Best, T
Hi @TdzBAS ,
Pleasure! I agree with your suggestion.
Cheers, Caleb
Hi @calebclass,
I have three NanoString datasets, which were generated with the same protocol, so I have different batches. Should I read in each dataset separately, or can I read the RCC files all together into the function "processNanostringData"? What effect does this have on the subsequent QC? Reading them in separately would require merging them afterwards. IMHO reading everything at once with only one metafile seems to be the most convenient, but I don't know if this is the right way.
Best, T