The default cell type templates were trained on our own data. You can also train templates for the cell types you are interested in with one-class logistic regression (OCLR) models and pass them to the "ct.templates" argument.
Did you use purified cells for scRNA-seq to train the model? Did you use any public data? I could not find which data the cell type templates are based on, which is why I opened this issue.
The OCLR model is supervised, so if you want to train a new template, you need to prepare additional training data. The expression data we used to train the default cell type templates hasn't been published. Only the trained cell type templates are included in the package.
Thank you.
May I suggest that your team document how the model was trained, e.g., which sc/bulk RNA-seq data for T cells were used to train the OCLR model? scCancer's cell annotation depends on this 'prior knowledge'. The pipeline itself can be a 'black box', but the prior biological knowledge should not be.
You can use your own data to train the templates. For the training script, you can refer to this call:
gelnet(train.data, NULL, 0, 1)
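To make the one-line call above less of a black box, here is a minimal base R sketch of what training and applying a one-class template could look like. It is not the scCancer training script and not gelnet's solver. `train_oclr_template` and `score_cells` are hypothetical helpers, the data are random toy values, and the cells x genes orientation of `train.data` is assumed. The objective (mean log-sigmoid score minus a ridge penalty, no intercept) is an assumption about what an L2-only one-class logistic regression optimizes, and scoring cells by Spearman correlation with the weights is likewise an assumption, not something confirmed from the package.

```r
# Hypothetical helper: fit a one-class logistic regression template with a
# ridge (L2) penalty by plain gradient ascent. X is cells x genes (assumed).
# Objective (assumed): mean(log(sigmoid(X %*% w))) - (l2 / 2) * sum(w^2)
train_oclr_template <- function(X, l2 = 1, n_iter = 2000) {
  n <- nrow(X)
  w <- setNames(numeric(ncol(X)), colnames(X))
  # Safe step size from an upper bound on the gradient's Lipschitz constant.
  step <- 1 / (sum(X^2) / (4 * n) + l2)
  for (i in seq_len(n_iter)) {
    s <- as.vector(X %*% w)                      # per-cell linear score
    grad <- as.vector(crossprod(X, 1 - plogis(s))) / n - l2 * w
    w <- w + step * grad
  }
  w
}

# Hypothetical helper: score query cells (genes x cells) by the Spearman
# correlation between each cell's expression and the template weights.
score_cells <- function(expr, template) {
  common <- intersect(rownames(expr), names(template))
  apply(expr[common, , drop = FALSE], 2, function(x) {
    cor(x, template[common], method = "spearman")
  })
}

# Toy training set: 30 cells of one type, 40 genes, first 5 genes shifted up
# so that they behave like markers of that cell type.
set.seed(42)
genes <- paste0("gene", 1:40)
train.data <- matrix(rnorm(30 * 40), nrow = 30, dimnames = list(NULL, genes))
train.data[, 1:5] <- train.data[, 1:5] + 3

template <- train_oclr_template(train.data)

# Toy query set: one cell resembling the training cells, one background cell.
query <- matrix(rnorm(40 * 2), nrow = 40,
                dimnames = list(genes, c("like_training", "background")))
query[1:5, "like_training"] <- query[1:5, "like_training"] + 3

scores <- score_cells(query, template)
print(round(scores, 3))
```

In practice the `gelnet(train.data, NULL, 0, 1)` call would take the place of the hand-written loop. The sketch only illustrates that a template is a per-gene weight vector learned from cells of a single type, so its content depends entirely on which cells go into `train.data`.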
I was wondering how you decided on the gene signatures for the TME cell types. For example, which paper(s) and which dataset(s) were they based on?
I did not find a clear answer in the manuscript as quoted:
The cited paper is about the regression model, but it does not describe the multiple datasets used.
Related code: https://github.com/wguo-research/scCancer/blob/c6078eacc2bf1ad958886c39b15e50fee92df7e6/R/scAnnotation.R#L598