AlexsLemonade / sc-data-integration


Permute celldex ref labels and compare true cell assignment to distribution of assignments #208

Open allyhawkins opened 1 year ago

allyhawkins commented 1 year ago

We will need some way to measure confidence in the label assignments obtained from running SingleR with a given reference. One way of doing this would be to shuffle the sample labels of the reference dataset before training (identifying marker genes) and classifying cell types in the test dataset. This should be repeated over a set number of permutations to obtain a distribution of cell type assignments for each cell in the test dataset. We can then compare the true label (the assignment obtained with the unshuffled reference) to that distribution to obtain a p-value for each cell.
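To make the idea concrete, here is a rough sketch in plain Python rather than R. It does not call SingleR: `nearest_centroid_labels` is a hypothetical toy classifier standing in for it, and `permutation_p_values` is an illustrative name. The p-value definition used here (the share of permutations that reproduce the unshuffled assignment, with a +1 correction) is one possible reading of the comparison described above, not a settled choice.

```python
import random
from collections import Counter


def nearest_centroid_labels(ref_profiles, ref_labels, test_profiles):
    """Toy stand-in for SingleR (assumption): label each test cell with the
    label of the closest reference centroid."""
    counts = Counter(ref_labels)
    sums = {}
    for profile, label in zip(ref_profiles, ref_labels):
        acc = sums.setdefault(label, [0.0] * len(profile))
        for i, value in enumerate(profile):
            acc[i] += value
    centroids = {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # sorted() makes tie-breaking between equally close centroids deterministic
    return [
        min(sorted(centroids), key=lambda lab: sqdist(cell, centroids[lab]))
        for cell in test_profiles
    ]


def permutation_p_values(ref_profiles, ref_labels, test_profiles, n_perm=200, seed=0):
    """Return (observed_labels, p_values), one entry per test cell."""
    rng = random.Random(seed)
    observed = nearest_centroid_labels(ref_profiles, ref_labels, test_profiles)
    hits = [0] * len(test_profiles)
    for _ in range(n_perm):
        shuffled = list(ref_labels)
        rng.shuffle(shuffled)  # break the link between profiles and labels
        permuted = nearest_centroid_labels(ref_profiles, shuffled, test_profiles)
        for i, (obs, perm) in enumerate(zip(observed, permuted)):
            hits[i] += obs == perm
    # +1 in numerator and denominator keeps the p-value away from exactly 0
    p_values = [(h + 1) / (n_perm + 1) for h in hits]
    return observed, p_values
```

A small p-value under this definition means the shuffled references rarely reproduce the assignment made with the real reference, which is what we would hope to see for a confident call.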

Before we can do this we will need to figure out the following:

We will probably want a function that takes as input the sce object of interest and the reference data to be used. Within that function, the reference labels will be permuted before each SingleR run. We should also use parallelization wherever possible.
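The shape of such a function could look something like the sketch below, again in Python and not tied to SingleR. All names here are hypothetical: `permuted_assignments` takes the classifier as an argument (in practice this would wrap the SingleR call), and `nearest_neighbor_labels` is only a toy stand-in so the example runs. Each permutation is independent, so they are handed to any `concurrent.futures` executor.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def nearest_neighbor_labels(ref_profiles, ref_labels, test_profiles):
    """Toy stand-in classifier (assumption): copy the label of the closest
    reference sample."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    labels = []
    for cell in test_profiles:
        best = min(range(len(ref_profiles)), key=lambda j: sqdist(cell, ref_profiles[j]))
        labels.append(ref_labels[best])
    return labels


def _classify_with_shuffled_labels(task):
    """Run one permutation; kept at module level so it can be pickled."""
    classify, ref_profiles, ref_labels, test_profiles, seed = task
    shuffled = list(ref_labels)
    random.Random(seed).shuffle(shuffled)  # per-permutation seed for reproducibility
    return classify(ref_profiles, shuffled, test_profiles)


def permuted_assignments(classify, ref_profiles, ref_labels, test_profiles,
                         n_perm, executor, seed=0):
    """Return, for each test cell, the list of labels assigned across permutations."""
    tasks = [
        (classify, ref_profiles, ref_labels, test_profiles, seed + i)
        for i in range(n_perm)
    ]
    # executor.map preserves task order, so results are reproducible
    per_permutation = list(executor.map(_classify_with_shuffled_labels, tasks))
    # transpose from permutation-major to cell-major
    return [list(cell_labels) for cell_labels in zip(*per_permutation)]


if __name__ == "__main__":
    ref = [[0, 0], [0, 1], [10, 10], [10, 11]]
    labels = ["B", "B", "T", "T"]
    cells = [[0.2, 0.4], [9.8, 10.5]]
    with ThreadPoolExecutor(max_workers=2) as pool:
        dist = permuted_assignments(nearest_neighbor_labels, ref, labels, cells,
                                    n_perm=10, executor=pool)
    print(dist)
```

Passing the executor in keeps the choice of backend outside the function. Threads are used in the example only because they work everywhere; for CPU-bound classification a `ProcessPoolExecutor` would likely be the better fit, provided the classifier is an importable top-level function.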

allyhawkins commented 1 year ago

After evaluating some of the results from SingleR, we are not immediately planning on using this for the QC report, but we may return to it at a later point.