MaayanLab / sigcom-lincs

Signature Commons LINCS Repo

Request Metadata for 1113059 Signatures #83

Closed Huhaoran0214 closed 2 months ago

Huhaoran0214 commented 3 months ago

Hi, I am trying to interpret the column barcodes of the L1000 Characteristic Direction Coefficient Tables (Level 5) available here: https://maayanlab.cloud/sigcom-lincs/#/Download

I think the column barcodes do not exactly match the siginfo_beta.txt provided on CLUE. Could you release a TSV file that links each barcode (e.g. PAL002_PC3_XH_O01_LINC-ZNF681-4) to its sample information?

Many Thanks

jeevangelista commented 3 months ago

Hi @Huhaoran0214, the metadata for the signatures you need includes a cmap_id that can be matched against the sig_id column of siginfo_beta.txt on CLUE. The metadata is accessible via the metadata API.

Huhaoran0214 commented 3 months ago

Thanks for replying to my message! I noticed that the sig_id can be used to annotate the "Characteristic Direction Coefficient Tables" in the Compound subset.

However, in "shRNA Perturbations", for instance, the sig_id does not fully match the column name. For example, the sig_id "DER001_HA1E_96H:N19" only partially matches the matrix column name "DER001_HA1E_96H_N19_PGK1" (local id).

Huhaoran0214 commented 3 months ago

Furthermore, when I tried to download and analyze the Level 5 data myself, I found that the same plate contains different control wells, such as "control vehicle H2O", "control vehicle DMSO", and "control untreated", for each perturbagen.

So, which control MODZ score should I use to do comparisons between level 5 CP and level 5 control?

Many Thanks

jeevangelista commented 3 months ago

The mapping is from cmap_id to sig_id. To get the cmap_id values, you can do:

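import requests

# Signature search endpoint of the metadata API
url = 'https://maayanlab.cloud/sigcom-lincs/metadata-api/signatures/find'
# Select shRNA signatures and return only the two ID fields
payload = {
    "filter": {
        "where": {
            "meta.pert_type": "shRNA"
        },
        "fields": ["meta.cmap_id", "meta.local_id"]
    }
}
res = requests.post(url, json=payload)
res.raise_for_status()  # stop early on an HTTP error
# Collect the cmap_id of every returned signature
cmap_ids = [i["meta"]["cmap_id"] for i in res.json()]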

This should give you a one-to-one mapping to siginfo_beta. The match may still fall short of 100%, because we worked from a snapshot of the data taken a few years ago, and the CMap team could have made changes since then.
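
As a rough sketch of how the response could then be joined to siginfo_beta.txt, the snippet below maps each matrix column name to its cmap_id and looks that value up in the sig_id column. It assumes that meta.local_id equals the matrix column name (as in the DER001 example in this thread) and that siginfo_beta.txt is tab-delimited. The helper names local_to_cmap and annotate_columns are hypothetical, and the in-memory siginfo table is a made-up stand-in for the real file.

```python
import csv
import io


def local_to_cmap(records):
    """Map each matrix column name (meta.local_id) to its meta.cmap_id."""
    mapping = {}
    for rec in records:
        meta = rec.get("meta", {})
        if "local_id" in meta and "cmap_id" in meta:
            mapping[meta["local_id"]] = meta["cmap_id"]
    return mapping


def annotate_columns(columns, mapping, siginfo_file):
    """Return {column: siginfo row or None}, joining cmap_id to sig_id."""
    reader = csv.DictReader(siginfo_file, delimiter="\t")
    by_sig_id = {row["sig_id"]: row for row in reader}
    return {col: by_sig_id.get(mapping.get(col)) for col in columns}


# One record shaped like the API response; the IDs are the example from this thread.
records = [{"meta": {"cmap_id": "DER001_HA1E_96H:N19",
                     "local_id": "DER001_HA1E_96H_N19_PGK1"}}]

# Made-up stand-in for siginfo_beta.txt (illustrative columns and values only).
siginfo = io.StringIO("sig_id\tcell_iname\nDER001_HA1E_96H:N19\tHA1E\n")

annotated = annotate_columns(
    ["DER001_HA1E_96H_N19_PGK1", "UNKNOWN_COLUMN"],
    local_to_cmap(records),
    siginfo,
)
```

Columns that have no cmap_id, or whose cmap_id is missing from the siginfo table, come back as None, which makes it easy to count how many signatures fall outside the snapshot.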

jeevangelista commented 3 months ago

Are you trying to do differential gene expression analysis? If so, you may want to use Level 3 data instead. Level 5 data already contains differential expression vectors, computed either via (1) characteristic direction (sigcom-lincs) or (2) MODZ (the clue.io version). For our signatures, we took the perturbation replicates within a batch and compared them with the rest of the instances in that batch.
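
To illustrate the "replicates versus the rest of the batch" grouping on Level 3 style data, here is a minimal sketch. It is not the characteristic direction method used for the sigcom-lincs tables: a plain per-gene mean difference is swapped in as a simple stand-in for the comparison step. The record fields pert and expr, the function names, and the toy values are all hypothetical.

```python
from statistics import mean


def split_batch(profiles, pert_id):
    """Split one batch into replicates of `pert_id` and all other instances."""
    treated = [p["expr"] for p in profiles if p["pert"] == pert_id]
    rest = [p["expr"] for p in profiles if p["pert"] != pert_id]
    return treated, rest


def mean_difference(treated, rest):
    """Per-gene mean(treated) - mean(rest); a stand-in, not characteristic direction."""
    n_genes = len(treated[0])
    return [
        mean(t[g] for t in treated) - mean(r[g] for r in rest)
        for g in range(n_genes)
    ]


# Toy batch: two replicates of one perturbation plus two other wells (made-up values).
batch = [
    {"pert": "drugA", "expr": [5.0, 2.0]},
    {"pert": "drugA", "expr": [7.0, 2.0]},
    {"pert": "DMSO", "expr": [4.0, 2.0]},
    {"pert": "drugB", "expr": [4.0, 4.0]},
]

treated, rest = split_batch(batch, "drugA")
signature = mean_difference(treated, rest)
```

With this grouping, the background is every other instance in the batch rather than one chosen control well type.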