Closed alex-d13 closed 2 years ago
I have to think about it...
But I have a few quick comments: we could
I did that now with T cells CD8 as the reference, showing only cell-types with 10 occurrences. I am not sure about using the same y-axis (I see the point for better visual comparison): for example, I had to remove the plasma cells from this plot, because Monaco scales them so high that they reach a value of 30. With a y-axis going up to 30, we would not see anything for the other cell-types.
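To make the "relative to a reference" step concrete, here is a minimal sketch. It assumes the scaling values are held as a nested dict `{method: {cell_type: value}}`; that layout and the function name `relative_to_reference` are hypothetical, not taken from the actual analysis code.

```python
def relative_to_reference(values, reference="T cells CD8"):
    """Divide each method's scaling values by that method's value for the
    reference cell type, so the reference itself becomes 1.0.

    `values` is assumed to look like {method: {cell_type: scaling_value}}.
    Methods without a usable (non-zero) reference value are skipped.
    """
    relative = {}
    for method, per_type in values.items():
        ref = per_type.get(reference)
        if ref is None or ref == 0:
            continue  # cannot express this method relative to the reference
        relative[method] = {ct: v / ref for ct, v in per_type.items()}
    return relative


# Toy numbers for illustration only, not real scaling factors.
toy = {
    "method_a": {"T cells CD8": 2.0, "B cells": 4.0},
    "method_b": {"T cells CD8": 0.5, "B cells": 0.25},
    "method_c": {"B cells": 1.0},  # no reference value, so it is dropped
}
rel = relative_to_reference(toy)
```

Values above 1 then mean "more than the reference" and values below 1 mean "less", which is the direction the plot is meant to show.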
I also tried out z-scores: I used the 0-1 normalized scaling values to calculate them (is that OK?).
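On the "is that OK?" question, a small check may help. It assumes the 0-1 normalization is a plain min-max scaling and that the z-scores are computed over the same set of values that was normalized. Under that assumption, min-max scaling is a linear rescaling with a positive slope, so the z-scores come out the same as from the raw values. The numbers and function names below are made up for illustration.

```python
import statistics


def min_max(xs):
    """Scale values linearly to the range 0-1 (assumes max > min)."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]


def z_scores(xs):
    """Standard z-scores using the sample standard deviation."""
    mu = statistics.mean(xs)
    sd = statistics.stdev(xs)
    return [(x - mu) / sd for x in xs]


# Toy values for one cell type across several methods, including an outlier.
raw = [0.5, 1.0, 2.0, 30.0]
z_from_raw = z_scores(raw)
z_from_normalized = z_scores(min_max(raw))

# The two lists agree up to floating-point rounding.
max_difference = max(abs(a - b) for a, b in zip(z_from_raw, z_from_normalized))
```

If the normalization was instead done over a different grouping than the z-scores (for example per method versus per cell type), this equivalence no longer holds and the results would differ.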
I feel like the plot with the reference works quite well to see how many methods agree on the same "direction" (more/less than the reference), while the z-scores maybe show a little better those methods that really behave differently from the rest (even though for most cell-types it's all over the place...)
Overall strategy: sample all reads of cells from a list of cell-ids with known cell-type
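A rough sketch of this strategy, assuming an annotation dict from cell-id to cell-type and a dict from cell-id to its reads. Both structures and both function names are hypothetical placeholders for whatever storage is chosen.

```python
import random


def sample_cell_ids(cell_types, wanted_type, n, seed=0):
    """Pick up to `n` cell-ids annotated with `wanted_type`.

    `cell_types` is assumed to map cell-id -> cell-type.
    """
    candidates = sorted(cid for cid, ct in cell_types.items() if ct == wanted_type)
    rng = random.Random(seed)  # fixed seed keeps the sampling reproducible
    return rng.sample(candidates, min(n, len(candidates)))


def collect_reads(reads_by_cell, cell_ids):
    """Gather all reads belonging to the sampled cells, in cell order.

    `reads_by_cell` is assumed to map cell-id -> list of reads.
    """
    return [read for cid in cell_ids for read in reads_by_cell.get(cid, [])]


# Toy annotation and reads for illustration only.
annotation = {"c1": "T cells CD8", "c2": "B cells", "c3": "T cells CD8"}
reads = {"c1": ["ACGT", "TTGA"], "c2": ["GGCC"], "c3": ["CATG"]}

picked = sample_cell_ids(annotation, "T cells CD8", n=2)
pooled_reads = collect_reads(reads, picked)
```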
How to store reads? Suggestion by Markus: use a k-mer approach (like the kallisto index, https://pachterlab.github.io/kallisto/manual), then use the k-mers to calculate TPM values. This could increase runtime and decrease memory usage.
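For orientation, here is a minimal sketch of the two ingredients mentioned here: splitting a read into k-mers, and turning per-transcript counts into TPM with the standard formula (count divided by length, rescaled so the values sum to one million). This is not how kallisto builds or queries its index. The function names are hypothetical, and the step that assigns k-mers to transcripts is deliberately left out.

```python
def kmers(seq, k):
    """Return all overlapping substrings of length k from a read."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def tpm(counts, lengths):
    """Transcripts per million from per-transcript counts and lengths.

    Each count is divided by the transcript length, then the resulting
    rates are rescaled so that they sum to 1e6.
    """
    rates = {t: counts[t] / lengths[t] for t in counts}
    total = sum(rates.values())
    if total == 0:
        return {t: 0.0 for t in counts}
    return {t: rate / total * 1e6 for t, rate in rates.items()}


# Toy input for illustration only.
read_kmers = kmers("ACGTA", 3)
toy_tpm = tpm({"tx1": 10, "tx2": 30}, {"tx1": 1000, "tx2": 1000})
```

With equal lengths, `tx2` has three times the count of `tx1` and therefore receives three quarters of the million.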
Questions on this: