bowang-lab / scGPT

https://scgpt.readthedocs.io/en/latest/
MIT License

Clarifying few-shot and zero-shot vs from-scratch #3

Closed · Yanay1 closed this issue 1 year ago

Yanay1 commented 1 year ago

I have some questions about the preprint:

For the few-shot setting, e.g. for the PBMC 10k and Immune Human datasets, how was the model fine-tuned (or not), and with which objectives?

For the Immune Human dataset in Figure 2 and Supplementary Table 2, how was the model fine-tuned?

Thanks!

subercui commented 1 year ago

Fine-tuning is described in Methods sections 4.4 and 4.5. If your question is specifically about the wording "few-shot", those results refer exactly to the fine-tuned models. We originally used "few-shot" to describe the phenomenon we observed that fine-tuning pretrained scGPT consistently required far fewer iterations than training from scratch. We now think this wording could occasionally cause misinterpretation, so we have already changed it in the manuscript. We will release an updated version together with other changes soon; please stay tuned.
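To make the from-scratch vs fine-tuning distinction concrete, here is a minimal sketch with a toy model standing in for scGPT; the model definition, shapes, and names are illustrative assumptions, not the repo's actual code:

```python
import torch.nn as nn

def make_model() -> nn.Module:
    # Toy stand-in; the real scGPT is a transformer over gene tokens.
    return nn.Sequential(nn.Linear(2000, 128), nn.ReLU(), nn.Linear(128, 2000))

# Training from scratch: start from randomly initialized weights.
scratch_model = make_model()

# Fine-tuning: start from pretrained weights, then continue training on the
# target dataset. The observation above is that this run converges in far
# fewer iterations than the from-scratch run.
pretrained = make_model()  # stands in for a released scGPT checkpoint
finetuned_model = make_model()
finetuned_model.load_state_dict(pretrained.state_dict())
```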

subercui commented 1 year ago

Zero-shot simply means we directly apply the pretrained model to the datasets for a specific task, without any further training.
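As a rough illustration of what "no further training" means in code, here is a minimal sketch with a toy encoder standing in for the pretrained model; the class, shapes, and data are illustrative assumptions, not the scGPT API:

```python
import torch
import torch.nn as nn

# Toy stand-in for the pretrained model; the real scGPT is a transformer
# over gene tokens and binned expression values.
class ToyCellEncoder(nn.Module):
    def __init__(self, n_genes: int = 2000, d_model: int = 64):
        super().__init__()
        self.proj = nn.Linear(n_genes, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.proj(x)

model = ToyCellEncoder()
# In practice, pretrained weights would be loaded here via load_state_dict.
model.eval()                  # inference mode only

expr = torch.rand(8, 2000)    # 8 cells x 2000 genes of normalized expression
with torch.no_grad():         # "zero-shot": a forward pass, no gradient steps
    cell_emb = model(expr)

print(cell_emb.shape)         # torch.Size([8, 64]); embeddings go straight to evaluation
```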

Yanay1 commented 1 year ago

So to clarify: do the scores for Immune Human in Supplementary Table 2 come from a model that was fine-tuned with supervision (the cell type classification objective) on that same dataset? Do you have the scores for the zero-shot model?

subercui commented 1 year ago

The model reported in the table is fine-tuned for the batch-correction integration task. While it does not use the cell type classification objective, it is still fine-tuned mainly in a self-supervised manner using the GEP and GEPC objectives. Please see the first paragraph of Section 4.5. We didn't report scores for the zero-shot model on this dataset; qualitatively speaking, the fine-tuned model still clearly works better.
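For intuition about the self-supervised objectives, here is a minimal sketch of the masked-expression-prediction idea behind GEP; the toy MLP, masking ratio, and loss are illustrative assumptions, not the paper's exact implementation (GEPC analogously predicts expression from the learned cell representation):

```python
import torch
import torch.nn as nn

# Toy reconstruction model; the real scGPT predicts expression with a transformer.
model = nn.Sequential(nn.Linear(2000, 128), nn.ReLU(), nn.Linear(128, 2000))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

expr = torch.rand(8, 2000)                  # a batch of 8 cells
mask = torch.rand_like(expr) < 0.15         # hide ~15% of gene positions
masked_input = expr.masked_fill(mask, 0.0)  # zero out the masked values

pred = model(masked_input)
# Self-supervised objective: reconstruct the true expression only at the
# masked positions. No cell-type labels are involved anywhere.
loss = ((pred - expr)[mask] ** 2).mean()

optimizer.zero_grad()
loss.backward()
optimizer.step()
```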

Yanay1 commented 1 year ago

Thanks for the clarification and very fast responses! 😄