TencentAILabHealthcare / scTranslator


how to set the gene ID and protein ID #2

Open yinleHu opened 1 year ago

yinleHu commented 1 year ago

Thanks for developing this extremely interesting tool. I would like to inquire about how to set the gene ID and protein ID when applying scTranslator to my dataset. Additionally, I am eager to learn how to retrain scTranslator using the CITE-seq data that I have collected. If it is convenient for the author, I kindly request guidance on resolving these issues.

ElaineLIU-920 commented 6 months ago

Thank you for your interest and inquiry about scTranslator. Regarding the questions you mentioned, I am pleased to provide the following answers.

Setting gene ID and protein ID: You can set the gene and protein IDs using the mapping table we provide, which lists the HUGO symbol, HUGO gene ID, and scTranslator gene ID. Please visit the following link to access the table: https://docs.google.com/spreadsheets/d/1FCRTNTbvIF9li8-83jmodNPAB_L60boT/edit?usp=sharing&ouid=106294896402910499751&rtpof=true&sd=true
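As a rough illustration of applying such a mapping table, the sketch below attaches an ID to each gene symbol with pandas. The column names (`symbol`, `sct_id`), the output column `scTranslator_id`, and the ID values are all hypothetical placeholders, not the real headers or IDs from the spreadsheet; adjust them to match the table you download and the field scTranslator expects.

```python
import pandas as pd

def attach_ids(var: pd.DataFrame, mapping: pd.DataFrame,
               symbol_col: str = "symbol", id_col: str = "sct_id",
               out_col: str = "scTranslator_id") -> pd.DataFrame:
    """Add an ID column to a gene table indexed by symbol.

    All column names here are assumptions for illustration only.
    """
    # Build a symbol -> ID lookup, keeping the first row for duplicated symbols.
    lookup = mapping.drop_duplicates(symbol_col).set_index(symbol_col)[id_col]
    out = var.copy()
    # Symbols missing from the mapping table become NaN.
    out[out_col] = out.index.map(lookup)
    return out

# Toy mapping table with made-up IDs (not taken from the real spreadsheet).
mapping = pd.DataFrame({"symbol": ["CD3E", "CD4", "CD8A"],
                        "sct_id": [101, 102, 103]})
# Toy gene table, indexed by symbol in the way adata.var usually is.
var = pd.DataFrame(index=["CD4", "CD8A", "NOT_A_GENE"])

annotated = attach_ids(var, mapping)
# Symbols that could not be mapped and need manual attention.
unmapped = annotated.index[annotated["scTranslator_id"].isna()].tolist()
print(annotated)
print("unmapped:", unmapped)
```

Listing the unmapped symbols is a deliberate choice: aliases and outdated symbols are a common reason a gene fails to match, and it is easier to fix them before writing the h5ad file.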

Retraining scTranslator using your CITE-seq data: This can be achieved by inference with fine-tuning. Run the following command, replacing the --RNA_path and --Pro_path arguments with your own RNA and protein datasets in h5ad format.

$ python -m torch.distributed.launch --nnodes=$HOST_NUM --node_rank=$INDEX --nproc_per_node $HOST_GPU_NUM --master_addr $CHIEF_IP --master_port 23333 \
    code/stage3_fine-tune.py --epoch=100 --frac_finetune_test=0.1 --fix_set \
    --pretrain_checkpoint='checkpoint/stage2_single-cell_scTranslator.pt' \
    --RNA_path='dataset/test/dataset1/GSM5008737_RNA_finetune_withcelltype.h5ad' \
    --Pro_path='dataset/test/dataset1/GSM5008738_protein_finetune_withcelltype.h5ad'
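Before fine-tuning on your own CITE-seq data, it may help to confirm that the RNA and protein files describe the same cells. The sketch below assumes (this is not stated in the thread) that the two h5ad files are expected to be paired cell by cell; the helper `check_pairing` is hypothetical and works on any two pandas indexes of cell barcodes, such as the `obs_names` of each loaded file.

```python
import pandas as pd

def check_pairing(rna_cells: pd.Index, pro_cells: pd.Index) -> dict:
    """Summarize how well two sets of cell barcodes line up."""
    shared = rna_cells.intersection(pro_cells)
    return {
        "n_shared": len(shared),
        # Barcodes present in only one modality.
        "rna_only": rna_cells.difference(pro_cells).tolist(),
        "protein_only": pro_cells.difference(rna_cells).tolist(),
        # True only if both indexes hold the same barcodes in the same order.
        "same_order": rna_cells.equals(pro_cells),
    }

# Toy barcodes standing in for the obs_names of the RNA and protein files.
rna_cells = pd.Index(["cell_a", "cell_b", "cell_c"])
pro_cells = pd.Index(["cell_a", "cell_c", "cell_d"])

report = check_pairing(rna_cells, pro_cells)
print(report)
```

If the report shows cells that appear in only one file, subsetting both files to the shared barcodes in a common order before running the command is one way to resolve it.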

I hope the above information is helpful to you. If you have any other questions or concerns, please feel free to contact us. Thank you again for your interest in scTranslator.