bowang-lab / scGPT

https://scgpt.readthedocs.io/en/latest/
MIT License

Embeddings for reference mapping without having any cell type labels #113

Closed docmab23 closed 11 months ago

docmab23 commented 11 months ago

I was trying to perform cell type annotation without any prior cell type labels (and without fine-tuning the model). Based on the idea in https://github.com/bowang-lab/scGPT/issues/55 , I started using reference mapping with the cellxgene atlas, but even creating the embedding requires the "cell_type" parameter. If that's the case, I don't understand how reference mapping can be used without any initial cell annotation labels.

Thanks for your help!

subercui commented 11 months ago

Hi, thank you for pointing it out! The cell type column was never actually used in the computation; it only stores meta info in the output. A quick workaround is to pass any existing column as that argument and ignore it in the output. As you pointed out, the argument should be optional, and I updated this in 9e165cc . I then removed the argument entirely: for cell type or any other meta info columns, you can now pass them to obs_to_save in a unified way, which I think makes more sense. Please pull the latest updates and see the new instructions in the tutorial notebook. Thanks for the suggestion, and let me know how it works.
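For readers following along, here is a minimal sketch of why the query data needs no cell type column for reference mapping: only the reference cells carry labels, and each query cell takes the majority label of its nearest reference cells in embedding space. This is an illustration under assumptions, not scGPT's implementation. The function names (cosine_similarity, transfer_labels), the toy embeddings, and the choice of cosine similarity with k = 3 are all hypothetical; in practice the embeddings would come from the model.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def transfer_labels(ref_embeddings, ref_labels, query_embeddings, k=3):
    """Give each query cell the majority label of its k nearest reference cells."""
    if len(ref_embeddings) != len(ref_labels):
        raise ValueError("ref_embeddings and ref_labels must have the same length")
    predictions = []
    for q in query_embeddings:
        # Rank reference cells by similarity to this query cell, most similar first.
        ranked = sorted(
            range(len(ref_embeddings)),
            key=lambda i: cosine_similarity(q, ref_embeddings[i]),
            reverse=True,
        )
        votes = Counter(ref_labels[i] for i in ranked[:k])
        predictions.append(votes.most_common(1)[0][0])
    return predictions


# Toy stand-ins for model embeddings (hypothetical values, not scGPT output).
ref_embeddings = [
    [1.0, 0.1], [0.9, 0.2], [0.95, 0.0],  # labeled reference cells
    [0.1, 1.0], [0.2, 0.9], [0.0, 0.95],
]
ref_labels = ["T cell", "T cell", "T cell", "B cell", "B cell", "B cell"]

# Query cells have embeddings only; no cell type labels are required.
query_embeddings = [[0.8, 0.1], [0.1, 0.7]]

predicted = transfer_labels(ref_embeddings, ref_labels, query_embeddings, k=3)
print(predicted)
```

The labels enter only through ref_labels, so the query side never needs an annotation column. That matches the point above that the column was only stored as meta info in the output.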

docmab23 commented 11 months ago

Thanks for the update, @subercui! Let me try it.