Teichlab / celltypist

A tool for semi-automatic cell type classification
https://www.celltypist.org/
MIT License
254 stars, 40 forks

multiple models #112

Open anke-king opened 3 months ago

anke-king commented 3 months ago

I would like to train CellTypist on different data sets. Should I merge the two data sets and train one model, or train two models and run the annotation twice?

ChuanXu1 commented 3 months ago

@anke-king, if you train on them separately, you will get two independent models. If you want to combine them for training, you have to unify their annotations so that cell type names are consistent. Both approaches are feasible (I personally prefer the former, as it's quicker and makes it easy to check the consistency of predictions from the two datasets).
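To illustrate what unifying annotations could look like before combining two datasets, here is a minimal sketch in plain Python. The label names, the `LABEL_MAP` dictionary, and the helper functions are all hypothetical and not part of CellTypist; in practice the mapping would be curated by hand for the datasets at hand.

```python
# Hypothetical mapping from dataset-specific names to one shared vocabulary.
# These names are invented for illustration only.
LABEL_MAP = {
    "CD4 T": "CD4+ T cell",
    "CD4+ T cells": "CD4+ T cell",
    "B": "B cell",
    "B cells": "B cell",
}


def harmonize(labels, mapping):
    """Rename labels via mapping; labels missing from mapping are kept as-is."""
    return [mapping.get(label, label) for label in labels]


def compare_label_sets(labels_a, labels_b):
    """Return (shared, only_in_a, only_in_b) as sorted lists of label names."""
    set_a, set_b = set(labels_a), set(labels_b)
    return sorted(set_a & set_b), sorted(set_a - set_b), sorted(set_b - set_a)


# Toy per-cell annotations from two hypothetical training datasets.
labels_a = ["CD4 T", "B", "NK cell"]
labels_b = ["CD4+ T cells", "B cells", "Monocyte"]

unified_a = harmonize(labels_a, LABEL_MAP)
unified_b = harmonize(labels_b, LABEL_MAP)

# After harmonization, the two label lists can be concatenated in the same
# order as the cells of the combined dataset.
merged_labels = unified_a + unified_b
shared, only_a, only_b = compare_label_sets(unified_a, unified_b)
print("shared:", shared)
print("only in A:", only_a)
print("only in B:", only_b)
```

Checking which names are shared and which are unique to one dataset after renaming is a quick way to spot labels that still differ only in spelling.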

anke-king commented 3 months ago

Thank you for your reply! Just for clarification: I have one data set with cell types for training and a second data set with cell types which are not in the first data set. In my target data set (which I want to annotate with my custom model) I expect to see cell types from both data sets. So if I do the former, should I do the annotation twice and select the cell type based on the confidence score, or how would I get the consensus annotation?

Thanks!!

ChuanXu1 commented 3 months ago

@anke-king, if the cell types in the first and second training datasets are totally different, you can combine them and train a single model. Confidence scores are not comparable across two different models, so if you use two models, you need to inspect each model's predictions separately (celltypist.dotplot is useful in most cases) and judge by your own knowledge.
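If two models are used, one way to look at their predictions side by side without relying on confidence scores is a simple cross-tabulation of the labels each model assigns to the same cells. The sketch below is plain Python, not CellTypist API; the function name and the toy predictions are hypothetical, and it assumes both prediction lists are in the same cell order.

```python
from collections import Counter


def cross_tabulate(pred_a, pred_b):
    """Count how often each (label from model A, label from model B) pair occurs.

    Both lists must hold one predicted label per cell, in the same cell order.
    """
    if len(pred_a) != len(pred_b):
        raise ValueError("prediction lists must have one entry per cell")
    return Counter(zip(pred_a, pred_b))


# Toy predictions for four cells from two hypothetical models.
pred_a = ["T cell", "T cell", "B cell", "T cell"]
pred_b = ["Neuron", "Astrocyte", "Astrocyte", "Neuron"]

table = cross_tabulate(pred_a, pred_b)
for (label_a, label_b), n_cells in sorted(table.items()):
    print(f"{label_a} / {label_b}: {n_cells}")
```

The table only shows which label pairs co-occur; deciding which of the two labels fits a given group of cells still depends on marker genes and biological knowledge.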