icbi-lab / immune_deconvolution_benchmark

Reproducible pipeline for "Comprehensive evaluation of cell-type quantification methods for immuno-oncology", Sturm et al. 2019, https://doi.org/10.1093/bioinformatics/btz363
https://icbi-lab.github.io/immune_deconvolution_benchmark
BSD 3-Clause "New" or "Revised" License

Should we represent the results from the three validation datasets (Hoek, Racle, Schelker) independently or combine them? #2

Closed grst closed 6 years ago

grst commented 6 years ago

Should we represent the results from the three validation datasets (Hoek, Racle, Schelker) independently or combine them?

grst commented 6 years ago

@FFinotello suggested showing the performance for each validation dataset independently, as they might have different characteristics. So far, I agree.

I am just not sure what is a sensible way to do so.

In principle, there are three ways to look at the datasets:

In the supplementary information, all three types of comparisons are shown: https://grst.github.io/immune_deconvolution_benchmark/validation-with-real-data.html

For the figure in the paper, I am currently using the between-sample comparison of all three datasets merged (different symbols represent different datasets): [figure]

By combining the datasets I hoped to

and thereby obtain more meaningful correlation values.

When calculating the correlations on each dataset/cell type individually, the sample size is as small as 3. IMO, it does not make sense to compute correlations from only 3 values.
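To illustrate why n = 3 is too small, here is a minimal sketch (plain Python, synthetic data, not part of the pipeline): for two completely independent variables, a Pearson correlation computed from only 3 samples very often looks "strong" purely by chance.

```python
import random
import statistics

def pearson(x, y):
    """Plain Pearson correlation coefficient (no external libraries)."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sdx = sum((a - mx) ** 2 for a in x) ** 0.5
    sdy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sdx * sdy)

random.seed(0)
trials = 10_000
# Draw 3 (x, y) pairs of *independent* standard normals and count how
# often the sample correlation nonetheless exceeds 0.9 in magnitude.
high = sum(
    1
    for _ in range(trials)
    if abs(pearson([random.gauss(0, 1) for _ in range(3)],
                   [random.gauss(0, 1) for _ in range(3)])) > 0.9
)
print(high / trials)  # roughly 0.29 — almost a third of random 3-point samples look "strongly correlated"
```

The theoretical value agrees: for independent normals with n = 3, the sample correlation has density proportional to 1/sqrt(1 - r²), which piles mass at ±1 — so pooling the datasets to raise n before correlating is well motivated.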

@FFinotello , what do you suggest?

grst commented 6 years ago

Do both, where enough values are available: [figure]

grst commented 6 years ago

Latest version: [figure]

(I agree the final layout can be improved!)