geneontology / signor2gocam

Converting Signor pathways to GO-CAM

Break out mechanism->GO mappings to YAML file #18

Open dustine32 opened 4 years ago

dustine32 commented 4 years ago

Make a parsable YAML file for maintaining the mappings between SIGNOR mechanisms and GO terms:

# -
#   - MECHANISM
#   - MI ID
#   - GO ID
#   - RELATION
-
  - acetylation
  - MI:0192
  - GO:0016407  # acetyltransferase activity
  - RO:0002304  # causally upstream of, positive effect
-
  - binding
  - MI:0915
  - GO:0005515  # protein binding
  - RO:0002629  # directly positively regulates

This will move the mappings out of pathway_connections.py, where they are currently hard-coded:

MECHANISM_GO_MAPPING = {
    "acetylation" : "GO:0016407",
    "binding" : "GO:0005515", # protein binding
    "catalytic activity" :  "GO:0003824",
    ...
}

The code should simplify to something like:

with open("metadata/signor_mechanism_go_mapping.yaml") as f:
    MECHANISM_GO_MAPPING = yaml.safe_load(f)  # yaml.load() expects a stream or YAML text, not a path

I'm also adding a "relation" field for each mapping. Note that the example relation assignments above are made up by me. I believe @thomaspd will be adding this field to the same table in the paper, so I can just populate the file from that.
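
One thing the load step glosses over: parsing the positional layout above gives a list of four-item lists, not a dict keyed by mechanism like the hard-coded `MECHANISM_GO_MAPPING`. A minimal sketch of the extra indexing step follows. The names `MechanismMapping`, `rows_to_mapping`, and `load_mapping` are hypothetical and not part of signor2gocam, and PyYAML is assumed for the file-reading helper.

```python
from typing import Dict, List, NamedTuple


class MechanismMapping(NamedTuple):
    """One row of the mapping file (hypothetical helper type)."""
    mechanism: str
    mi_id: str
    go_id: str
    relation: str


def rows_to_mapping(rows: List[List[str]]) -> Dict[str, MechanismMapping]:
    """Index positional [MECHANISM, MI ID, GO ID, RELATION] rows by mechanism."""
    mapping: Dict[str, MechanismMapping] = {}
    for row in rows:
        if len(row) != 4:
            raise ValueError(f"expected 4 fields, got {len(row)}: {row!r}")
        entry = MechanismMapping(*row)
        if entry.mechanism in mapping:
            raise ValueError(f"duplicate mechanism: {entry.mechanism!r}")
        mapping[entry.mechanism] = entry
    return mapping


def load_mapping(path: str) -> Dict[str, MechanismMapping]:
    """Parse the YAML file at `path` and index its rows (assumes PyYAML)."""
    import yaml  # imported lazily so rows_to_mapping works without PyYAML

    with open(path) as handle:
        return rows_to_mapping(yaml.safe_load(handle))


# The two example rows from this ticket, in the shape a YAML parser returns.
example_rows = [
    ["acetylation", "MI:0192", "GO:0016407", "RO:0002304"],
    ["binding", "MI:0915", "GO:0005515", "RO:0002629"],
]
MECHANISM_GO_MAPPING = rows_to_mapping(example_rows)
print(MECHANISM_GO_MAPPING["binding"].go_id)
```

With this shape, existing lookups would change from `MECHANISM_GO_MAPPING[mechanism]` to `MECHANISM_GO_MAPPING[mechanism].go_id`; whether that is acceptable depends on how pathway_connections.py uses the dict.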

dustine32 commented 4 years ago

Actually, this format's way more readable:

-
  MECHANISM: acetylation
  MI_ID: MI:0192
  GO_ID: GO:0016407
  RELATION: RO:0002304
-
  MECHANISM: binding
  MI_ID: MI:0915
  GO_ID: GO:0005515
  RELATION: RO:0002629
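
With the keyed layout, each entry parses to a dict, so rebuilding the old mechanism-to-GO-ID dict is a short loop. This is only a sketch: `go_mapping_from_records` and `REQUIRED_KEYS` are hypothetical names, and treating RELATION as optional is an assumption rather than a decision made in this ticket. The `records` literal is what a YAML parser should return for the two entries above.

```python
from typing import Dict, List

# Keys every record must carry; RELATION is treated as optional here (assumption).
REQUIRED_KEYS = {"MECHANISM", "MI_ID", "GO_ID"}


def go_mapping_from_records(records: List[Dict[str, str]]) -> Dict[str, str]:
    """Build the mechanism -> GO ID dict that the hard-coded mapping provided."""
    go_by_mechanism: Dict[str, str] = {}
    for record in records:
        missing = REQUIRED_KEYS - record.keys()
        if missing:
            raise ValueError(f"record {record!r} is missing {sorted(missing)}")
        go_by_mechanism[record["MECHANISM"]] = record["GO_ID"]
    return go_by_mechanism


# The two example entries from this comment, as parsed records.
records = [
    {"MECHANISM": "acetylation", "MI_ID": "MI:0192",
     "GO_ID": "GO:0016407", "RELATION": "RO:0002304"},
    {"MECHANISM": "binding", "MI_ID": "MI:0915",
     "GO_ID": "GO:0005515", "RELATION": "RO:0002629"},
]
MECHANISM_GO_MAPPING = go_mapping_from_records(records)
print(MECHANISM_GO_MAPPING)
```

Keeping the result as a plain mechanism-to-GO-ID dict means existing lookups would not need to change while the RELATION field is still unused.
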

vtoure commented 4 years ago

@dustine32 Why is there a 'relation' mapped for each mechanism? It is not a one-to-one mapping, right? (i.e., one mechanism can have different types of causal relation.) Or does 'relation' refer to something else? Thanks

dustine32 commented 4 years ago

@vtoure Right, the relation column shouldn't be 1:1 for mechanisms. I just quickly added those examples for the ticket. @thomaspd was going to add a relation column to the paper table, but I'm actually not sure what the values will look like. I could see the cardinality being zero-to-many, with zero just meaning "use the existing default causal relation logic we've already coded."
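
To make the zero-to-many idea concrete, here is one way the lookup could behave, purely as a sketch. It assumes RELATION may be absent, a single ID, or a list of IDs; `normalize_relations`, `candidate_relations`, and the `default_relation` callable are hypothetical stand-ins, the last one for the existing default causal relation logic.

```python
from typing import Callable, Dict, List, Union

RelationField = Union[None, str, List[str]]


def normalize_relations(value: RelationField) -> List[str]:
    """Turn a missing, scalar, or list-valued RELATION field into a list."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


def candidate_relations(
    record: Dict[str, RelationField],
    default_relation: Callable[[str], str],
) -> List[str]:
    """Return the relations listed for a record, or fall back to the default.

    `default_relation` is a placeholder for the existing default logic; it is
    only called when the record lists zero relations.
    """
    relations = normalize_relations(record.get("RELATION"))
    if relations:
        return relations
    return [default_relation(str(record["MECHANISM"]))]


def stub_default(mechanism: str) -> str:
    """Placeholder only; the real default logic lives in the converter."""
    return f"default-relation-for:{mechanism}"


print(candidate_relations({"MECHANISM": "binding"}, stub_default))
print(candidate_relations(
    {"MECHANISM": "acetylation", "RELATION": ["RO:0002304", "RO:0002629"]},
    stub_default,
))
```

When a record lists several relations, the converter would still need some rule for choosing among them, which this sketch deliberately leaves open.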

This probably relates to some of the paper doc comments:

Currently, this RELATION field is just sitting in the new YAML file, being ignored by the conversion code. We can wait to see what @thomaspd has in mind for this field and then design accordingly?