d0choa closed 3 years ago
So, looking at the STRING input file, it does indeed have an experimental
column:
curl -s https://stringdb-static.org/download/protein.links.detailed.v11.0/9606.protein.links.detailed.v11.0.txt.gz | gzcat | head
protein1 protein2 neighborhood fusion cooccurence coexpression experimental database textmining combined_score
9606.ENSP00000000233 9606.ENSP00000272298 0 0 332 62 181 0 125 490
9606.ENSP00000000233 9606.ENSP00000253401 0 0 0 0 186 0 56 198
9606.ENSP00000000233 9606.ENSP00000401445 0 0 0 0 159 0 0 159
9606.ENSP00000000233 9606.ENSP00000418915 0 0 0 61 158 0 542 606
9606.ENSP00000000233 9606.ENSP00000327801 0 0 0 88 78 0 89 167
9606.ENSP00000000233 9606.ENSP00000466298 0 0 0 141 131 0 98 267
9606.ENSP00000000233 9606.ENSP00000232564 0 0 0 62 171 0 56 201
9606.ENSP00000000233 9606.ENSP00000393379 0 0 0 61 131 0 43 150
9606.ENSP00000000233 9606.ENSP00000371253 0 0 0 61 0 0 224 240
However, the script in platform input support expects a column called experiments:
# The following STRING channels can be mapped to detection methods in the MI ontology:
detection_method_mapping = {
'coexpression': {'name': 'coexpression', 'mi_id': 'MI:2231'},
'coexpression_transferred': {'name': 'coexpression_transferred', 'mi_id': ''},
'neighborhood': {'name': 'neighborhood', 'mi_id': 'MI:0057'},
'neighborhood_transferred': {'name': 'neighborhood_transferred', 'mi_id': ''},
'fusion': {'name': 'fusion', 'mi_id': 'MI:0036'},
'homology': {'name': 'homology', 'mi_id': 'MI:2163'},
'experiments': {'name': 'experiments', 'mi_id': 'MI:0591'},
'experiments_transferred': {'name': 'experiments_transferred', 'mi_id': ''},
'cooccurence': {'name': 'cooccurence', 'mi_id': 'MI:2231'},
'database': {'name': 'database', 'mi_id': ''},
'database_transferred': {'name': 'database_transferred', 'mi_id': ''},
'textmining': {'name': 'textmining', 'mi_id': 'MI:0110'},
'textmining_transferred': {'name': 'textmining_transferred', 'mi_id': ''},
}
It's coming from the time when we used the homology-expanded dataset:
curl -s https://stringdb-static.org/download/protein.links.full.v11.0/9606.protein.links.full.v11.0.txt.gz | gunzip | head
protein1 protein2 neighborhood neighborhood_transferred fusion cooccurence homology coexpression coexpression_transferred experiments experiments_transferred database database_transferred textmining textmining_transferred combined_score
9606.ENSP00000000233 9606.ENSP00000272298 0 0 0 332 0 0 62 0 181 0 0 0 125 490
9606.ENSP00000000233 9606.ENSP00000253401 0 0 0 0 0 0 0 0 186 0 0 0 56 198
9606.ENSP00000000233 9606.ENSP00000401445 0 0 0 0 0 0 0 0 160 0 0 0 0 159
9606.ENSP00000000233 9606.ENSP00000418915 0 0 0 0 0 0 61 0 158 0 0 542 0 606
9606.ENSP00000000233 9606.ENSP00000327801 0 0 0 0 0 69 61 0 78 0 0 0 89 167
9606.ENSP00000000233 9606.ENSP00000466298 0 0 0 0 0 141 0 0 131 0 0 0 98 267
9606.ENSP00000000233 9606.ENSP00000232564 0 0 0 0 0 0 62 0 171 0 0 0 56 201
9606.ENSP00000000233 9606.ENSP00000393379 0 0 0 0 0 0 61 0 131 0 0 0 43 150
9606.ENSP00000000233 9606.ENSP00000371253 0 0 0 0 0 0 61 0 0 0 0 0 224 240
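To make the mismatch between the two headers concrete, here is a minimal sketch of how the column names could be checked against the mapping keys and aliased before lookup. The names COLUMN_ALIASES, NON_CHANNEL_COLUMNS, normalize_header and unmapped_channels are hypothetical and introduced only for illustration; this is not necessarily how the actual fix was implemented.

```python
# Hypothetical alias table: header name used by the "detailed" file mapped to
# the name used by the "full" file (and by detection_method_mapping).
COLUMN_ALIASES = {"experimental": "experiments"}

# Columns that identify the interaction or hold the total score,
# i.e. columns that are not evidence channels.
NON_CHANNEL_COLUMNS = {"protein1", "protein2", "combined_score"}

# Keys copied from detection_method_mapping above (values omitted for brevity).
MAPPING_KEYS = {
    "coexpression", "coexpression_transferred",
    "neighborhood", "neighborhood_transferred",
    "fusion", "homology",
    "experiments", "experiments_transferred",
    "cooccurence",
    "database", "database_transferred",
    "textmining", "textmining_transferred",
}


def normalize_header(columns):
    """Return the column names with known aliases replaced."""
    return [COLUMN_ALIASES.get(name, name) for name in columns]


def unmapped_channels(columns, mapping_keys):
    """List evidence-channel columns that have no entry in the mapping."""
    return [
        name for name in columns
        if name not in NON_CHANNEL_COLUMNS and name not in mapping_keys
    ]


# Header line copied from the detailed file shown earlier in this thread.
detailed_header = (
    "protein1 protein2 neighborhood fusion cooccurence coexpression "
    "experimental database textmining combined_score"
)
columns = detailed_header.split()

# Before aliasing, the experimental column has no mapping entry.
missing_before = unmapped_channels(columns, MAPPING_KEYS)

# After aliasing, every channel in the detailed file is covered.
missing_after = unmapped_channels(normalize_header(columns), MAPPING_KEYS)

print(missing_before, missing_after)
```

A check like unmapped_channels could also run at load time so that a renamed column fails loudly instead of silently dropping a data source.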
I'll open a PR for this.
Fix merged to platform input support
After reviewing the data with @andrewhercules we concluded the "Experiments" datasource is missing in the STRING API response.