Closed by pikapika505 9 months ago
Hello @YuliaInn
Very glad to hear that our package was of help!
For your questions:
`major` column) by adjusting the labels with scores from cancer hallmark gene sets.

The `major` column contains Tumor/Normal class labels from the input dataset. `tier_0`, `tier_1`, and so on are the naming convention for hierarchical data annotation utilised by Ikarus. Namely, the high-order classification (Tumor/Normal) is stored in `tier_0`; lower-level classifications, such as tissue or cell type, are stored in columns `tier_1` and `tier_2`, respectively. Essentially, this is implemented to ease use and avoid chaotic column names. In most cases, one would only utilise `tier_0` and `tier_1`, hence the other columns can safely be filled with `NaN`s.

Elaborating on your first question: to create your own model with Ikarus, you will need to prepare an `anndata` object for your dataset; hallmark correction is advised but not necessary. Then, generate new Tumor/Normal gene sets and train the classifier. You can get hints on the workflow in the tutorial's sections create gene lists, signatures, and train model.
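
As a rough illustration of the annotation layout described above, here is a minimal sketch in plain pandas. It is not Ikarus code: the cell names and labels are made up, and it assumes that the tier columns live in the per-cell annotation table (the `obs` data frame) of the `anndata` object.

```python
import numpy as np
import pandas as pd

# Made-up per-cell annotations for a small HCC-like dataset (illustrative only).
obs = pd.DataFrame(
    {
        # High-order classification: Tumor/Normal.
        "tier_0": ["Tumor", "Normal", "Normal", "Tumor"],
        # Lower-level classification, e.g. cell type.
        "tier_1": ["Hepatocyte", "T cell", "Endothelial", "Hepatocyte"],
    },
    index=["cell_1", "cell_2", "cell_3", "cell_4"],
)

# Tiers that are not used can simply be filled with NaN.
obs["tier_2"] = np.nan

# Sanity check: tier_0 should only contain the two class labels.
assert set(obs["tier_0"]) <= {"Tumor", "Normal"}

# Assumption: with anndata installed, this table would be attached as the
# per-cell annotation of the expression matrix, e.g. AnnData(X, obs=obs).
print(obs)
```

Keeping the class labels in `tier_0` and leaving unused tiers as `NaN` means the same column names work for every dataset, whatever its level of annotation detail.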
I am closing the issue. Feel free to re-open it if anything else pops up.
I was very pleased that Ikarus predicted malignant cells with 0.93 accuracy in hepatocellular carcinoma (HCC) scRNA-seq data. I am planning to use it to find malignant cells in other unannotated HCC scRNA-seq datasets. So, I thought I could simply add an annotated HCC dataset to find a new tumor gene set to make it more specific to HCC datasets.
thank you, Yulia