MaayanLab / sigcom-lincs

Signature Commons LINCS Repo

About the L1000 Similarity Matrices data #60

Closed dahw0706 closed 2 years ago

dahw0706 commented 2 years ago

Hi,

When we examined the L1000 Gene-Gene Similarity Matrix downloaded from the L1000 Similarity Matrices section, we noticed that the value for a gene paired with itself is "0" in the matrix. For example, AKT1 to AKT1 = 0.

If the values represent signature similarity, they should be cosine similarities, and the value for a gene paired with itself should be "1" instead of "0", unless the values here represent the reverse, i.e. 1 - cosine.

We didn't find a corresponding description in the data or in the paper. It would be much appreciated if anyone could help confirm the meaning of the values in the gene-gene similarity matrix. Thanks!

Best, TJ

sxie04 commented 2 years ago

Hi @dahw0706, thanks for the question. The gene-gene similarity matrix is indeed based on cosine similarity, and as such the diagonal would normally consist of 1s. We manually set the diagonal to 0 here because a different tool of ours uses these matrices, and we did not want self-interactions to be included there. Apologies for any confusion, and please let me know if there are any more questions.
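
For anyone who needs the conventional diagonal back, here is a minimal sketch of what the explanation above implies. It assumes the matrix has been loaded as a square NumPy array. The toy signatures, the gene labels other than AKT1, and the variable names are all made up for illustration and are not taken from the published files or the maintainers' code.

```python
import numpy as np

# Toy signatures (rows = genes); values are illustrative, not L1000 data.
genes = ["AKT1", "GENE_B", "GENE_C"]  # GENE_B and GENE_C are hypothetical labels
signatures = np.array([
    [1.0, 2.0, 0.5],
    [0.5, -1.0, 2.0],
    [2.0, 0.0, -1.0],
])

# Cosine similarity: scale each row to unit length, then take dot products.
unit = signatures / np.linalg.norm(signatures, axis=1, keepdims=True)
similarity = unit @ unit.T  # diagonal is 1 up to floating-point error

# Mimic the published matrices: self-similarity entries are set to 0.
published_like = similarity.copy()
np.fill_diagonal(published_like, 0.0)

# Restore the conventional cosine matrix by setting the diagonal back to 1.
restored = published_like.copy()
np.fill_diagonal(restored, 1.0)
```

Off-diagonal entries are untouched by either step, so only the self-similarity values differ between the two forms.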

dahw0706 commented 2 years ago

Hi @sxie04, thanks for your prompt explanation. The diagonal value of 0 makes sense now. We had assumed the values in the matrix were cosine similarities but wanted to confirm the meaning of the diagonal. Again, thanks for the clarification!