Parameter calculation for ligand activity

Shin8a commented 10 months ago

Hi!

Thank you for the great tool! I understand the basic ideas behind the ligand activity parameters such as PCC and AUPR, but would you mind sharing the code that calculates each of them, so that I can understand more precisely what each one indicates?

More concretely, how is settings$response calculated for these parameters, and what does it mean? Here, it is explained that settings$response indicates for each gene whether it was a target or not in the setting of interest, but I am still confused about why that is the case.

Thank you in advance for your time.

csangara commented 10 months ago

Hi, the code you're probably looking for is in the function classification_evaluation_continuous_pred, but I will also try to explain it a bit more.

setting$response is a logical vector. It is TRUE if the gene is in the gene set of interest, and FALSE otherwise. Therefore, this vector has the length of length(background_expressed_genes)+length(geneset_oi). This is then compared to a column in the ligand-target matrix (target genes in rows, ligands in columns), which contains the regulatory potential of a certain ligand for all target genes. We subset this column to the same genes as the setting$response vector, so that both vectors have the same length. Then the PCC and AUPR are computed between these two vectors, and those values are the ligand activity for that specific ligand. We then do that for all ~1200 ligands in the ligand-target matrix.
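To make the steps above concrete, here is a minimal sketch in plain Python. It is not the nichenetr R code: the gene names, ligand names, and scores are made up, the helper functions `pearson` and `average_precision` are hypothetical, and average precision is used as a simple step-wise stand-in for AUPR, so the exact values may differ from what classification_evaluation_continuous_pred returns. It also assumes the background and the gene set of interest do not overlap.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def average_precision(scores, labels):
    """Average precision: a step-wise approximation of AUPR (assumption, see lead-in)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i]:
            hits += 1
            total += hits / rank  # precision at this true positive
    return total / hits

# Hypothetical inputs (not real data); the two gene lists are disjoint here.
background_expressed_genes = ["A", "B", "D", "F"]
geneset_oi = ["C", "E"]

# The response vector: False for background genes, True for genes of interest.
genes = background_expressed_genes + geneset_oi
response = [False] * len(background_expressed_genes) + [True] * len(geneset_oi)

# Toy ligand-target scores: one regulatory potential per target gene, per ligand.
ligand_target = {
    "ligand_1": {"A": 0.1, "B": 0.2, "C": 0.9, "D": 0.05, "E": 0.7, "F": 0.3},
    "ligand_2": {"A": 0.8, "B": 0.1, "C": 0.2, "D": 0.6, "E": 0.3, "F": 0.5},
}

activity = {}
for ligand, potentials in ligand_target.items():
    # Subset the ligand's scores to the genes in the response vector, same order.
    prediction = [potentials[g] for g in genes]
    activity[ligand] = {
        "pearson": pearson(prediction, [float(r) for r in response]),
        "aupr": average_precision(prediction, response),
    }

for ligand, metrics in activity.items():
    print(ligand, metrics)
```

In this toy example, ligand_1 gives its highest scores to the two genes of interest, so it ranks above ligand_2 on both metrics.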

Hope that helps, but feel free to ask more questions if it's still not clear :)

Shin8a commented 10 months ago

Thank you for your reply.

That clears up my confusion! I will ask again if I have more questions.

saeyslab / nichenetr

Parameter calculation for ligand activity #221