opentargets / issues

Issue tracker for Open Targets Platform and Open Targets Genetics Portal
https://platform.opentargets.org https://genetics.opentargets.org
Apache License 2.0

Experiments data missing in API response for STRING #1330

Closed: d0choa closed this issue 3 years ago

d0choa commented 3 years ago

After reviewing the data with @andrewhercules, we concluded that the "Experiments" data source is missing from the STRING API response.

DSuveges commented 3 years ago

The STRING input file does indeed have an experimental column:

curl -s https://stringdb-static.org/download/protein.links.detailed.v11.0/9606.protein.links.detailed.v11.0.txt.gz | gzcat | head 
protein1 protein2 neighborhood fusion cooccurence coexpression experimental database textmining combined_score
9606.ENSP00000000233 9606.ENSP00000272298 0 0 332 62 181 0 125 490
9606.ENSP00000000233 9606.ENSP00000253401 0 0 0 0 186 0 56 198
9606.ENSP00000000233 9606.ENSP00000401445 0 0 0 0 159 0 0 159
9606.ENSP00000000233 9606.ENSP00000418915 0 0 0 61 158 0 542 606
9606.ENSP00000000233 9606.ENSP00000327801 0 0 0 88 78 0 89 167
9606.ENSP00000000233 9606.ENSP00000466298 0 0 0 141 131 0 98 267
9606.ENSP00000000233 9606.ENSP00000232564 0 0 0 62 171 0 56 201
9606.ENSP00000000233 9606.ENSP00000393379 0 0 0 61 131 0 43 150
9606.ENSP00000000233 9606.ENSP00000371253 0 0 0 61 0 0 224 240

However, the script in platform input support expects a column called experiments:

    # The following STRING channels can be mapped to detection methods in the MI ontology:
    detection_method_mapping = {
        'coexpression': {'name': 'coexpression', 'mi_id': 'MI:2231'},
        'coexpression_transferred': {'name': 'coexpression_transferred', 'mi_id': ''},
        'neighborhood': {'name': 'neighborhood', 'mi_id': 'MI:0057'},
        'neighborhood_transferred': {'name': 'neighborhood_transferred', 'mi_id': ''},
        'fusion': {'name': 'fusion', 'mi_id': 'MI:0036'},
        'homology': {'name': 'homology', 'mi_id': 'MI:2163'},
        'experiments': {'name': 'experiments', 'mi_id': 'MI:0591'},
        'experiments_transferred': {'name': 'experiments_transferred', 'mi_id': ''},
        'cooccurence': {'name': 'cooccurence', 'mi_id': 'MI:2231'},
        'database': {'name': 'database', 'mi_id': ''},
        'database_transferred': {'name': 'database_transferred', 'mi_id': ''},
        'textmining': {'name': 'textmining', 'mi_id': 'MI:0110'},
        'textmining_transferred': {'name': 'textmining_transferred', 'mi_id': ''},
    }
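The mismatch can be checked mechanically. The following is a minimal sketch, not code from the script itself: it copies the header of the detailed file and the keys of detection_method_mapping as quoted in this comment, and lists the channel columns that have no entry in the mapping. The helper name unmapped_channels and the NON_CHANNEL_COLUMNS set are made up for illustration.

```python
# Header of the "detailed" STRING file, copied from the curl output above.
detailed_header = (
    "protein1 protein2 neighborhood fusion cooccurence coexpression "
    "experimental database textmining combined_score"
).split()

# Keys of detection_method_mapping, copied from the script excerpt above.
mapping_keys = {
    "coexpression", "coexpression_transferred",
    "neighborhood", "neighborhood_transferred",
    "fusion", "homology",
    "experiments", "experiments_transferred",
    "cooccurence",
    "database", "database_transferred",
    "textmining", "textmining_transferred",
}

# Assumption: these columns are identifiers or the overall score, not channels.
NON_CHANNEL_COLUMNS = {"protein1", "protein2", "combined_score"}


def unmapped_channels(header, known_keys):
    """Return channel columns in the header that the mapping does not know."""
    return [
        column
        for column in header
        if column not in NON_CHANNEL_COLUMNS and column not in known_keys
    ]


print(unmapped_channels(detailed_header, mapping_keys))
# → ['experimental']
```

Only the experimental column falls through, which is consistent with the "Experiments" data source being the one missing from the API response.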

It comes from the time when we used the homology-expanded dataset:

curl -s https://stringdb-static.org/download/protein.links.full.v11.0/9606.protein.links.full.v11.0.txt.gz | gunzip | head
protein1 protein2 neighborhood neighborhood_transferred fusion cooccurence homology coexpression coexpression_transferred experiments experiments_transferred database database_transferred textmining textmining_transferred combined_score
9606.ENSP00000000233 9606.ENSP00000272298 0 0 0 332 0 0 62 0 181 0 0 0 125 490
9606.ENSP00000000233 9606.ENSP00000253401 0 0 0 0 0 0 0 0 186 0 0 0 56 198
9606.ENSP00000000233 9606.ENSP00000401445 0 0 0 0 0 0 0 0 160 0 0 0 0 159
9606.ENSP00000000233 9606.ENSP00000418915 0 0 0 0 0 0 61 0 158 0 0 542 0 606
9606.ENSP00000000233 9606.ENSP00000327801 0 0 0 0 0 69 61 0 78 0 0 0 89 167
9606.ENSP00000000233 9606.ENSP00000466298 0 0 0 0 0 141 0 0 131 0 0 0 98 267
9606.ENSP00000000233 9606.ENSP00000232564 0 0 0 0 0 0 62 0 171 0 0 0 56 201
9606.ENSP00000000233 9606.ENSP00000393379 0 0 0 0 0 0 61 0 131 0 0 0 43 150
9606.ENSP00000000233 9606.ENSP00000371253 0 0 0 0 0 0 61 0 0 0 0 0 224 240

I'll open a PR for this.
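One way such a fix could look is sketched below. This is an assumption about a possible approach, not the change that was actually merged: the column name from the detailed file is renamed to the key the mapping expects before any lookup happens. CHANNEL_ALIASES and normalise_header are hypothetical names.

```python
# Hypothetical alias table: column name in the detailed file -> mapping key.
CHANNEL_ALIASES = {"experimental": "experiments"}


def normalise_header(columns):
    """Rename aliased columns so they match the detection-method mapping keys."""
    return [CHANNEL_ALIASES.get(column, column) for column in columns]


detailed_header = (
    "protein1 protein2 neighborhood fusion cooccurence coexpression "
    "experimental database textmining combined_score"
).split()

normalised = normalise_header(detailed_header)
print(normalised[6])
# → experiments
```

Keeping the alias in one table leaves the mapping itself untouched, so the same script would still accept the full dataset's column names.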

DSuveges commented 3 years ago

The fix has been merged into platform input support.