Interpreting raw and scaled ligand activity values

noranekonobokkusu commented 11 months ago

Hi!

Thanks for a great tool! Having all these functions to generate nice plots is very helpful!

I am not sure I understand ligand activity values correctly, so I would appreciate your feedback on it.

1. Am I correct that the raw values (orange on your plots) correspond to enrichment of targets of a particular ligand among all genes differentially expressed in a particular receiver cell type in both directions (up and down), and these values are mirrored for the two conditions I am contrasting?
2. Am I correct that these raw values (for all identified ligands in a particular receiver cell type) are then Z-scored and min-max scaled to produce pink values? If they are min-max scaled (becoming strictly non-negative), I don't understand why there are negative values in multinichenet_output$prioritization_tables$ligand_activities_target_de_tbl$activity_scaled.
3. I am trying to get intuition for the cases when after the scaling, the "direction" of the effect visually changes (in the attached example, pairs involving the ADAM17 ligand for instance, or the last two pairs): Are the up- and down-regulated values treated separately during normalization and scaling? I cannot understand how a ligand which had a very small, if any (because it looks white), enrichment of its targets among downregulated genes can have a high scaled activity in downregulated genes. Is it because other ligands had even smaller raw enrichment values among downregulated genes?

Thank you!

browaeysrobin commented 11 months ago

Hi @noranekonobokkusu

> Am I correct that the raw values (orange on your plots) correspond to enrichment of targets of a particular ligand among all genes differentially expressed in a particular receiver cell type in both directions (up and down), and these values are mirrored for the two conditions I am contrasting?

Yes. To illustrate with your example:
"highAF up" is the enrichment of target genes among the genes upregulated in highAF vs WT, compared to the background of all expressed genes.
"highAF down" is the enrichment of target genes among the genes downregulated in highAF vs WT, compared to the background of all expressed genes.

Because you have only two conditions, the genes upregulated in condA vs condB are the same as the genes downregulated in condB vs condA. The activities are therefore identical, which gives the mirroring effect. With more than two conditions or more complex contrasts, you don't get this mirroring effect.
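To make the mirroring concrete, here is a minimal sketch in Python with made-up log fold changes. The gene names, the values, and the `THRESHOLD` cutoff are all hypothetical and are not taken from MultiNicheNet; the point is only that reversing a two-condition contrast flips the sign of every logFC, so the "up" and "down" gene sets swap.

```python
# Toy log fold changes for the contrast condA vs condB (made-up values).
logfc_a_vs_b = {"GENE1": 2.1, "GENE2": -1.7, "GENE3": 0.2, "GENE4": 1.4, "GENE5": -2.3}

THRESHOLD = 1.0  # hypothetical cutoff for illustration, not a package default


def split_de(logfc, threshold=THRESHOLD):
    """Return (upregulated, downregulated) gene sets for one contrast."""
    up = {gene for gene, value in logfc.items() if value >= threshold}
    down = {gene for gene, value in logfc.items() if value <= -threshold}
    return up, down


# Reversing a two-condition contrast only flips the sign of every logFC.
logfc_b_vs_a = {gene: -value for gene, value in logfc_a_vs_b.items()}

up_ab, down_ab = split_de(logfc_a_vs_b)
up_ba, down_ba = split_de(logfc_b_vs_a)

# The gene sets swap roles, so any enrichment computed on them is mirrored.
print(up_ab == down_ba)
print(down_ab == up_ba)
```

Since the ligand activity is computed from these gene sets, identical sets give identical activities in the two mirrored columns.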

> Am I correct that these raw values (for all identified ligands in a particular receiver cell type) are then Z-scored and min-max scaled to produce pink values? If they are min-max scaled (becoming strictly non-negative), I don't understand why there are negative values in multinichenet_output$prioritization_tables$ligand_activities_target_de_tbl$activity_scaled.

For visualization, and for the values stored in multinichenet_output$prioritization_tables$ligand_activities_target_de_tbl$activity_scaled, the raw values are only z-scored, not min-max scaled. That is why you see negative values there. Min-max scaling is applied only during the prioritization process.
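The difference between the two transformations can be illustrated with a small Python sketch on made-up raw activities. The ligand names and values are hypothetical, and the use of the sample standard deviation is an assumption for illustration; this is not the package's own code.

```python
from statistics import mean, stdev

# Made-up raw ligand activities for one receiver cell type (hypothetical values).
raw = {"LIG_A": 0.02, "LIG_B": 0.05, "LIG_C": 0.11, "LIG_D": 0.06}


def z_score(values):
    """Center on the mean and divide by the sample standard deviation."""
    mu = mean(values.values())
    sd = stdev(values.values())
    return {name: (v - mu) / sd for name, v in values.items()}


def min_max(values):
    """Rescale linearly so the smallest value is 0 and the largest is 1."""
    lo, hi = min(values.values()), max(values.values())
    return {name: (v - lo) / (hi - lo) for name, v in values.items()}


z = z_score(raw)
mm = min_max(raw)

# Z-scores are centered on zero, so below-average ligands come out negative.
print({name: round(v, 2) for name, v in z.items()})
# Min-max scaled values always fall in [0, 1].
print({name: round(v, 2) for name, v in mm.items()})
```

Any ligand whose raw activity is below the mean gets a negative z-score, which is consistent with the negative values seen in the activity_scaled column.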

> Are the up- and down-regulated values treated separately during normalization and scaling? I cannot understand how the ligand which had a very small, if any (because it looks white), enrichment of its targets among downregulated genes can have a high scaled activity in downregulated genes. Is it because other ligands had even smaller raw enrichment values among downregulated genes?

Up- and downregulated values are treated the same way. What you see here is indeed because other ligands had even smaller raw enrichment values among downregulated genes.

Some background: from our experience with NicheNet, we know that the ranking of ligands per condition and cell type (and thus the scaled activity score) is often more informative than the raw absolute value, because the raw value can be influenced by the number of genes in the background, the presence of hub genes among the DE genes, etc. However, even in cases with really poor absolute enrichment, a few ligands will still get high scaled activity values. I would trust those less, because they are typically driven by only a few target genes predicted downstream of those ligands.
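A toy Python sketch of this situation follows. It assumes, for illustration only, that z-scoring is applied across ligands within each direction; the ligand names, raw values, and the `RAW_FLOOR` cutoff are all hypothetical and not part of the package.

```python
from statistics import mean, stdev

# Made-up raw activities for the same four ligands in each direction.
raw_up = {"LIG_A": 0.30, "LIG_B": 0.22, "LIG_C": 0.18, "LIG_X": 0.10}
raw_down = {"LIG_A": 0.001, "LIG_B": 0.002, "LIG_C": 0.003, "LIG_X": 0.020}


def z_score(values):
    """Z-score across ligands (assumed here to be done within one direction)."""
    mu = mean(values.values())
    sd = stdev(values.values())
    return {name: (v - mu) / sd for name, v in values.items()}


scaled_up = z_score(raw_up)
scaled_down = z_score(raw_down)

# LIG_X has a tiny raw value among downregulated genes, yet it ranks first
# there because every other ligand is even smaller.
top_down = max(scaled_down, key=scaled_down.get)

RAW_FLOOR = 0.05  # hypothetical sanity threshold, not a package parameter
weak_support = raw_down[top_down] < RAW_FLOOR

print(top_down, round(scaled_down[top_down], 2), "weak raw support:", weak_support)
```

In this toy case LIG_X looks strongest in the "down" column after scaling even though its raw value there is lower than in the "up" column, so checking the raw value alongside the scaled one helps flag hits that rest on poor absolute enrichment.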

noranekonobokkusu commented 10 months ago

Thank you @browaeysrobin for a detailed answer!

saeyslab / multinichenetr, issue #33