MaayanLab / sigcom-lincs

Signature Commons LINCS Repo

Problems of downloaded data "LINCS Small Molecules Metadata" #75

Closed liuxiawei closed 11 months ago

liuxiawei commented 11 months ago

Hello, I downloaded the LINCS Small Molecules Metadata file from SigCom LINCS (Download page, LINCS Small Molecules Metadata, https://s3.amazonaws.com/lincs-dcic/sigcom-lincs-metadata/LINCS_small_molecules.tsv). However, I'm having trouble understanding some of the fields, such as "sig_count". I couldn't find an explanation for it on Google, SigCom LINCS, or CLUE. Could you please help me understand what it means, or point me to a resource that explains it? Thank you for your assistance!

AviMaayan commented 11 months ago

pert_name: BRD ID is the unique Broad Institute ID followed by a common name
target: Entrez Gene Symbols of the known targets for the drug
moa: Known Mechanisms of Action for the drug
canonical_smiles: Chemical representation of the compound using SMILES annotations
inchi_key: Chemical representation of the compound using InChI keys
compound_aliases: Other names for the compound
sig_count: Number of signatures for the compound in SigCom LINCS. Signatures are defined as drug treatments of a specific cell line, at a specific concentration, and the time point when gene expression was measured after drug treatment.
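
For readers who want to check these fields locally, here is a minimal sketch of looking up a compound's `sig_count` in the TSV. It assumes the file is tab-separated with a header row that uses the column names listed above. The sample rows and the helper name `find_sig_counts` are made up for illustration and are not real LINCS records.

```python
import csv
import io

# Hypothetical sample in the assumed layout of LINCS_small_molecules.tsv.
# The BRD IDs and values below are placeholders, not real LINCS records.
SAMPLE_TSV = (
    "pert_name\ttarget\tmoa\tsig_count\n"
    "BRD-X0000001 compound-a\tGENE1\texample inhibitor\t12\n"
    "BRD-X0000002 compound-b\tGENE2\texample antagonist\t7\n"
    "BRD-X0000003 compound-a\tGENE1\texample inhibitor\t3\n"
)


def find_sig_counts(tsv_file, name):
    """Return {pert_name: sig_count} for rows whose pert_name contains `name`."""
    reader = csv.DictReader(tsv_file, delimiter="\t")
    needle = name.lower()
    return {
        row["pert_name"]: int(row["sig_count"])
        for row in reader
        if needle in row["pert_name"].lower()
    }


# A common name may appear under more than one BRD ID, so sum across rows.
matches = find_sig_counts(io.StringIO(SAMPLE_TSV), "compound-a")
total = sum(matches.values())
print(matches, total)
```

To run this against the downloaded file instead of the sample, pass `open("LINCS_small_molecules.tsv", newline="", encoding="utf-8")` in place of the `io.StringIO` object.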

liuxiawei commented 11 months ago

Thanks, AviMaayan! That helps a lot! But I found that sig_count is not consistent with the website. For example, for the compound losartan, it is 395 in the downloaded file (Download page, LINCS Small Molecules Metadata, https://s3.amazonaws.com/lincs-dcic/sigcom-lincs-metadata/LINCS_small_molecules.tsv), but the web metadata search finds only 328 signatures when searching for losartan. Could it be that my understanding is incorrect? Or is the inconsistency due to data versions or other issues? [Image] [Image]

jeevangelista commented 11 months ago

Hi @liuxiawei, sorry for the confusion. I've updated the small molecules files to reflect the proper counts.

liuxiawei commented 11 months ago

Thanks @jeevangelista! The counts have changed and look more appropriate. However, the count for the compound losartan still does not match: there is a difference of 1 between the two results (328 on the web, and 327 in total in the new TSV file). I suspect there may still be some issues, but it does not have a significant impact for me. I am glad that I was able to confirm that my understanding is correct. Thank you for all your work.

jeevangelista commented 11 months ago

Hi @liuxiawei, the extra signature comes from the consensus signatures that we built, which form a completely different dataset of consensus chemical perturbation signatures.
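
The reconciliation described above (328 web hits versus 327 in the TSV, with one hit coming from the consensus dataset) can be sketched as a simple filter. This is only an illustration: the `dataset` key, its label strings, and the helper `count_excluding` are hypothetical and do not reflect the actual SigCom LINCS schema or API.

```python
# Hypothetical search hits: each dict stands for one signature returned by a
# metadata search. The "dataset" key and its labels are assumptions made for
# this sketch, not the actual SigCom LINCS schema.
hits = (
    [{"dataset": "chemical_perturbation"}] * 327
    + [{"dataset": "consensus_chemical_perturbation"}]
)


def count_excluding(hits, excluded_dataset):
    """Count hits that do not belong to `excluded_dataset`."""
    return sum(1 for hit in hits if hit["dataset"] != excluded_dataset)


web_total = len(hits)  # everything the web search returns
file_comparable = count_excluding(hits, "consensus_chemical_perturbation")
print(web_total, file_comparable)
```

Excluding the consensus dataset before counting is what makes the web total comparable with the summed `sig_count` values from the TSV.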

liuxiawei commented 11 months ago

Thank you for your response. I now understand, and I believe that having a wiki for this database would definitely attract more users. Once again, thank you!