JackieHanLab / TOSICA

Transformer for One-Stop Interpretable Cell-type Annotation
MIT License

Mask matrix issues #22

Open kk5kk opened 3 months ago

kk5kk commented 3 months ago

May I ask what the specific differences are between the several GMT files provided in the code? Which data is each one applicable to? If I want to use a new dataset with this model, how should I create a GMT file?

JiaweiChenGo commented 2 weeks ago

Thank you for your interest in TOSICA. GMT is a tab-delimited file format that describes gene sets. The GMT files provided in the code were downloaded from here and are applicable to human and mouse. They are not dataset specific; they depend only on the species and the gene set annotations. So you don't need to create a new file: you can choose a GMT file (gene set collection) depending on your biological context or research interests.
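To illustrate why a GMT file is independent of the dataset, here is a minimal sketch of how a GMT file could be turned into a gene-by-gene-set mask for any dataset. It assumes the standard GMT layout (set name, description, then gene symbols, all tab-separated on one line). The function names `read_gmt` and `build_mask`, the gene symbols, and the set names are hypothetical and chosen for illustration; this is not TOSICA's own implementation.

```python
import io


def read_gmt(handle):
    """Parse GMT lines into {set_name: [gene, ...]}.

    Assumes the standard layout: name <TAB> description <TAB> gene1 <TAB> gene2 ...
    """
    gene_sets = {}
    for line in handle:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, _description, *genes = fields
        gene_sets[name] = [g for g in genes if g]
    return gene_sets


def build_mask(gene_sets, dataset_genes):
    """Return (mask, set_names); mask[i][j] is 1 if dataset gene i is in set j."""
    index = {gene: i for i, gene in enumerate(dataset_genes)}
    set_names = list(gene_sets)
    mask = [[0] * len(set_names) for _ in dataset_genes]
    for j, name in enumerate(set_names):
        for gene in gene_sets[name]:
            i = index.get(gene)
            if i is not None:  # genes missing from the dataset are ignored
                mask[i][j] = 1
    return mask, set_names


# Hypothetical two-set GMT content, used here in place of a real file.
example = io.StringIO(
    "SET_A\tna\tGENE1\tGENE2\n"
    "SET_B\tna\tGENE2\tGENE3\tGENE4\n"
)
gene_sets = read_gmt(example)
dataset_genes = ["GENE1", "GENE2", "GENE3"]  # genes measured in your dataset
mask, set_names = build_mask(gene_sets, dataset_genes)
```

The same GMT content can be reused with a different `dataset_genes` list; only the rows of the mask change, which is why the choice of file depends on species and gene set collection rather than on the dataset.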