This benchmark does not measure a method's performance on labelling a totally new dataset given an existing reference dataset. Instead, it takes a large dataset and uses one batch as a test holdout.
We should create a new task, task_label_transfer, which maps cell type labels from a reference dataset to a query dataset. That benchmark will then also have to handle reference and query matrices of different shapes.
To avoid confusion, we should update the description of this task to reflect the distinction between label projection and label transfer.
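On the point about matching different matrix shapes: one common way to handle it is to restrict both datasets to the genes they share. The snippet below is only a sketch of that idea, not a proposal for the actual task API. The function name `align_to_shared_genes` and its arguments are hypothetical, and it assumes dense cells-by-genes NumPy arrays with unique gene names.

```python
import numpy as np


def align_to_shared_genes(ref_X, ref_genes, query_X, query_genes):
    """Subset two cells-by-genes matrices to their shared genes.

    Hypothetical helper for illustration. Assumes gene names are unique
    within each dataset. Returns both matrices with identical column
    order, plus the sorted array of shared gene names.
    """
    ref_genes = np.asarray(ref_genes)
    query_genes = np.asarray(query_genes)
    # intersect1d returns the sorted shared names and, for each one,
    # its column index in the reference and in the query.
    shared, ref_idx, query_idx = np.intersect1d(
        ref_genes, query_genes, return_indices=True
    )
    return ref_X[:, ref_idx], query_X[:, query_idx], shared
```

Genes found in only one of the two datasets are dropped here. Whether that is acceptable, or whether methods should receive the unaligned matrices and deal with the mismatch themselves, is a design decision for the new task.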
Hi Robrecht! Your comment brings up two questions:
This task was set up like this because the correct labels on the test part were easily available. How would we get ground truth (or just correct labels) for a label_transfer task between two datasets?
When transferring labels from one dataset to another, there can be a situation where some (or all) query cell types are not represented in the reference. I was thinking of emulating this by withholding some cell types in the label_projection task. How do you think this should be handled in label_transfer?
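To make the withholding idea concrete, here is a rough sketch of how it could look on the existing train/test split. Everything in it is an assumption for illustration: the function name `withhold_cell_types`, the boolean `is_train` mask, and the choice to relabel withheld test cells as "unknown". How methods should be scored on such cells is still the open question.

```python
import numpy as np


def withhold_cell_types(labels, is_train, withheld, unknown_label="unknown"):
    """Emulate query cell types that are missing from the reference.

    Hypothetical helper for illustration. Cells of the withheld types
    are removed from the training set. Test cells of those types get
    `unknown_label` as their expected label, since no method could have
    seen the true label during training.
    """
    labels = np.asarray(labels, dtype=object)
    is_train = np.asarray(is_train, dtype=bool)
    is_withheld = np.isin(labels, list(withheld))

    # Training cells of withheld types are dropped from training.
    new_is_train = is_train & ~is_withheld

    # Test cells of withheld types are expected to be called "unknown".
    expected = labels.copy()
    expected[~is_train & is_withheld] = unknown_label
    return new_is_train, expected
```

The test set itself is unchanged; only the training mask and the expected labels differ from the original split.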