TranslatorSRI / NodeNormalization

Service that produces Translator compliant nodes given a curie
MIT License

Remove information_content values from node_bindings in NodeNorm output #244

Open gaurav opened 6 months ago

gaurav commented 6 months ago

While PR https://github.com/TranslatorSRI/NodeNormalization/pull/231 fixed https://github.com/TranslatorSRI/NodeNormalization/issues/229, it also introduced an odd change in NodeNorm's behavior: previously we did not include any attributes in the resulting node_bindings, but the fixed code now includes information content, and potentially other attributes, in each result's node_bindings. Here is some example output from babel-validation:

"node_bindings": {
  "n0": [
    {
      "id": "PUBCHEM.COMPOUND:130881",
      "attributes": [
        {
          "attribute_type_id": "biolink:has_numeric_value",
          "value": 100.0,
          "value_type_id": "EDAM:data_0006",
          "original_attribute_name": "information_content"
        }
      ]
    }
  ],
  "n1": [
    {
      "id": "MONDO:0001134",
      "attributes": [
        {
          "attribute_type_id": "biolink:has_numeric_value",
          "value": 83.0,
          "value_type_id": "EDAM:data_0006",
          "original_attribute_name": "information_content"
        }
      ]
    }
  ]
}

@uhbrar has requested that attributes passed in be kept in results.[].node_bindings, but that no new attributes be added. Note that IC values are also included in the knowledge_graph; those should continue to be included. The relevant code is linked below:

https://github.com/TranslatorSRI/NodeNormalization/blob/029f3b909bc2051cb8f091985af88bc0e61975d7/node_normalizer/normalizer.py#L96-L111
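
To make the requested behavior concrete, here is a minimal sketch of one way it could look. It is not the code at the link above: `rebuild_node_binding`, `normalize_node_bindings`, and the `id_map` argument are hypothetical names introduced only for illustration, and the sketch assumes that each incoming binding is a plain dict and that the mapping from original to normalized identifiers has already been computed. It does not merge bindings that normalize to the same identifier.

```python
import copy


def rebuild_node_binding(original_binding: dict, normalized_id: str) -> dict:
    """Hypothetical helper: return a binding for normalized_id that keeps
    whatever the caller passed in and adds nothing new."""
    # Deep-copy so the incoming message is never mutated.
    new_binding = copy.deepcopy(original_binding)
    # Only the identifier changes. No information_content (or any other
    # normalizer-derived attribute) is attached to the binding here.
    new_binding["id"] = normalized_id
    return new_binding


def normalize_node_bindings(node_bindings: dict, id_map: dict) -> dict:
    """Hypothetical helper: apply rebuild_node_binding to every binding.

    id_map maps original identifiers to normalized ones (an assumption of
    this sketch); identifiers missing from id_map are left unchanged.
    """
    return {
        qnode_id: [
            rebuild_node_binding(binding, id_map.get(binding["id"], binding["id"]))
            for binding in bindings
        ]
        for qnode_id, bindings in node_bindings.items()
    }


if __name__ == "__main__":
    # "EXAMPLE:1" is a made-up input identifier, not a real equivalence.
    incoming = {
        "n0": [{"id": "EXAMPLE:1"}],
        "n1": [
            {
                "id": "MONDO:0001134",
                "attributes": [
                    {"attribute_type_id": "biolink:description", "value": "supplied by caller"}
                ],
            }
        ],
    }
    print(normalize_node_bindings(incoming, {"EXAMPLE:1": "PUBCHEM.COMPOUND:130881"}))
```

With this shape, a binding that arrives without attributes leaves without them, and a binding that arrives with attributes keeps exactly those. Information content would still be attached to the nodes in the knowledge_graph, which this sketch does not touch.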