bowang-lab / scGPT

https://scgpt.readthedocs.io/en/latest/
MIT License

Question about the value of cls token #186

Open SELECT-FROM opened 5 months ago

SELECT-FROM commented 5 months ago

Thanks for your amazing work. I have some questions about the value of the cls token. During pretraining, the value of cls is pad_value (default -2), while during finetuning for integration, the value of cls is 0. Is there a special purpose to this design, given that the value of the cls token differs between the pretraining stage and the finetuning stage?

https://github.com/bowang-lab/scGPT/blob/4068d67caaac1e28d56964da68e0214817e38428/examples/pretrain.py#L430-L441 https://github.com/bowang-lab/scGPT/blob/706526a76d547de4ed711fa028c99be5bdf6ad8a/scgpt/tokenizer/gene_tokenizer.py#L298-L300
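To make the discrepancy concrete, here is a minimal sketch (not scGPT's actual API; `prepend_cls` is a hypothetical helper) of how the tokenizer prepends a value for the `<cls>` token at position 0, and how the two stages reported above would differ:

```python
import numpy as np

PAD_VALUE = -2.0  # scGPT's default pad_value, per the issue


def prepend_cls(values: np.ndarray, cls_value: float) -> np.ndarray:
    """Prepend the expression value assigned to the <cls> token (position 0).

    Hypothetical helper for illustration: the issue reports that
    pretraining uses cls_value == pad_value (-2), while the
    integration finetuning code uses cls_value == 0.
    """
    return np.concatenate([[cls_value], values])


expr = np.array([1.5, 0.0, 3.2])
pretrain_values = prepend_cls(expr, cls_value=PAD_VALUE)
finetune_values = prepend_cls(expr, cls_value=0.0)
```

The gene expressions themselves are identical in both cases; only the sentinel value sitting in the `<cls>` slot changes between the two stages.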

During finetuning for batch integration, the model is trained in a self-supervised manner. When the gene expression values are masked, the value of the cls token may also be masked; this situation does not occur during pretraining. I want to know why the value of the cls token can also be masked during batch-integration finetuning. What is the reason for this design? https://github.com/bowang-lab/scGPT/blob/706526a76d547de4ed711fa028c99be5bdf6ad8a/scgpt/tokenizer/gene_tokenizer.py#L467-L472 https://github.com/bowang-lab/scGPT/blob/4068d67caaac1e28d56964da68e0214817e38428/scgpt/data_collator.py#L417-L422
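A small sketch of the behavior being asked about (again hypothetical names, not the scGPT masking code itself): if the random value-masking samples positions over the whole sequence, index 0 (the `<cls>` slot) is a valid candidate unless it is explicitly excluded.

```python
import numpy as np

MASK_VALUE = -1.0  # hypothetical mask sentinel for illustration


def random_mask_values(values, mask_ratio, rng, protect_cls=False):
    """Randomly replace a fraction of values with MASK_VALUE.

    With protect_cls=False, position 0 (the <cls> slot) can be
    masked just like any gene position; with protect_cls=True it
    is excluded from the candidate indices.
    """
    values = values.copy()
    start = 1 if protect_cls else 0
    candidates = np.arange(start, len(values))
    n_mask = max(1, int(len(candidates) * mask_ratio))
    picked = rng.choice(candidates, size=n_mask, replace=False)
    values[picked] = MASK_VALUE
    return values


vals = np.array([0.0, 1.5, 0.0, 3.2, 2.1])  # index 0 is the <cls> value

# Without protection, some random draws will mask the <cls> slot.
cls_ever_masked = any(
    random_mask_values(vals, 0.4, np.random.default_rng(s))[0] == MASK_VALUE
    for s in range(100)
)

# With protection, the <cls> value is never touched.
protected = random_mask_values(vals, 0.4, np.random.default_rng(0), protect_cls=True)
```

This is only meant to illustrate the question: whether masking the `<cls>` value during batch-integration finetuning is intentional (e.g. harmless because the cls output is not read from the value head) or an oversight.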

noob000007 commented 2 months ago

Wow, I think they made a mistake? Maybe?