Create a preliminary TRAPI attribute structure for returning concept cooccurrence results. This structure can be modeled after the COHD attribute structure proposed by Matt Brush.
COHD example provided by Matt Brush
Proposed Cooccurrence Attribute Structure
Proposed Node TSV

| id | name | category |
| --- | --- | --- |
| CHEBI:3215 | bupivacaine | biolink:ChemicalEntity |
| PR:000031567 | leucine-rich repeat-containing protein 3B | biolink:Protein |
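As a rough illustration of the node file, the sketch below writes the two example rows as tab-separated text with Python's standard `csv` module. The function name `nodes_to_tsv` and the choice to include a header row are assumptions for illustration, not part of the proposal.

```python
import csv
import io

# Rows copied from the proposed node table above.
NODES = [
    ("CHEBI:3215", "bupivacaine", "biolink:ChemicalEntity"),
    ("PR:000031567", "leucine-rich repeat-containing protein 3B", "biolink:Protein"),
]


def nodes_to_tsv(nodes):
    """Serialize (id, name, category) tuples as tab-separated text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["id", "name", "category"])  # header row (assumed)
    writer.writerows(nodes)
    return buffer.getvalue()


node_tsv = nodes_to_tsv(NODES)
```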
Proposed Edge TSV (Note: scroll table to see all columns)
where the ATTRIBUTE_JSON_BLOB would be JSON represented by the following YAML:
- attribute_type_id: biolink:original_knowledge_source
  value: infores:text-mining-provider-cooccurrence
  value_type_id: biolink:InformationResource
  description: The Text Mining Provider Concept Cooccurrence KP from NCATS Translator provides cooccurrence metrics for text-mined concepts that cooccur at various levels, e.g. document, sentence, etc. in the biomedical literature.
  attribute_source: infores:text-mining-provider-cooccurrence
- attribute_type_id: biolink:supporting_data_source
  value: infores:pubmed
  value_type_id: biolink:InformationResource
  attribute_source: infores:text-mining-provider-cooccurrence
- attribute_type_id: biolink:supporting_study_result
  value: tmkp:a1a1a1a1a1a1
  value_type_id: biolink:DocumentLevelConceptCooccurrenceAnalysisResult
  description: A single result from computing cooccurrence metrics between two concepts that cooccur at the document level
  attribute_source: infores:text-mining-provider-cooccurrence
  attributes:
    - attribute_type_id: biolink:supporting_document ## NOT CURRENTLY IN BIOLINK
      value: PMID:29085514|PMID:1236578
      value_type_id: biolink:Publication
      description: The documents where the concepts of this assertion were observed to cooccur at the document level.
      attribute_source: infores:pubmed
    - attribute_type_id: biolink:tmkp_concept1_count
      value: 123
      value_type_id: SIO:000794 # SIO:count
      description: "The number of times concept #1 was observed to occur at the document level in the documents that were processed"
      attribute_source: infores:text-mining-provider-cooccurrence
    - attribute_type_id: biolink:tmkp_concept2_count
      value: 321
      value_type_id: SIO:000794 # SIO:count
      description: "The number of times concept #2 was observed to occur at the document level in the documents that were processed"
      attribute_source: infores:text-mining-provider-cooccurrence
    - attribute_type_id: biolink:tmkp_concept_pair_count
      value: 2
      value_type_id: SIO:000794 # SIO:count
      description: The number of times the concepts of this assertion were observed to cooccur at the document level in the documents that were processed
      attribute_source: infores:text-mining-provider-cooccurrence
    - attribute_type_id: biolink:tmkp_normalized_google_distance
      value: 0.876
      value_type_id: EDAM:data_1772 # EDAM:score
      description: The normalized Google distance score for the concepts in this assertion based on their cooccurrence in the documents that were processed
      attribute_source: infores:text-mining-provider-cooccurrence
    - attribute_type_id: biolink:tmkp_pointwise_mutual_information
      value: 0.876
      value_type_id: EDAM:data_1772 # EDAM:score
      description: The pointwise mutual information score for the concepts in this assertion based on their cooccurrence in the documents that were processed
      attribute_source: infores:text-mining-provider-cooccurrence
    - attribute_type_id: biolink:tmkp_normalized_pointwise_mutual_information
      value: 0.876
      value_type_id: EDAM:data_1772 # EDAM:score
      description: The normalized pointwise mutual information score for the concepts in this assertion based on their cooccurrence in the documents that were processed
      attribute_source: infores:text-mining-provider-cooccurrence
    - attribute_type_id: biolink:tmkp_mutual_dependence
      value: 0.876
      value_type_id: EDAM:data_1772 # EDAM:score
      description: The mutual dependence (PMI^2) score for the concepts in this assertion based on their cooccurrence in the documents that were processed
      attribute_source: infores:text-mining-provider-cooccurrence
    - attribute_type_id: biolink:tmkp_normalized_pointwise_mutual_information_max
      value: 0.876
      value_type_id: EDAM:data_1772 # EDAM:score
      description: A variant of the normalized pointwise mutual information score for the concepts in this assertion based on their cooccurrence in the documents that were processed
      attribute_source: infores:text-mining-provider-cooccurrence
    - attribute_type_id: biolink:tmkp_log_frequency_biased_mutual_dependence
      value: 0.876
      value_type_id: EDAM:data_1772 # EDAM:score
      description: The log frequency biased mutual dependence score for the concepts in this assertion based on their cooccurrence in the documents that were processed
      attribute_source: infores:text-mining-provider-cooccurrence
- attribute_type_id: biolink:supporting_study_result
  value: tmkp:b2b2b2b2b2b2
  value_type_id: biolink:SentenceLevelConceptCooccurrenceAnalysisResult
  description: A single result from computing cooccurrence metrics between two concepts that cooccur at the sentence level
  attribute_source: infores:text-mining-provider-cooccurrence
  attributes:
    [SAME ATTRIBUTES AS ABOVE]
- attribute_type_id: biolink:supporting_study_result
  value: tmkp:c3c3c3c3c3c3
  value_type_id: biolink:TitleLevelConceptCooccurrenceAnalysisResult
  description: A single result from computing cooccurrence metrics between two concepts that cooccur in the document title
  attribute_source: infores:text-mining-provider-cooccurrence
  attributes:
    [SAME ATTRIBUTES AS ABOVE]
- attribute_type_id: biolink:supporting_study_result
  value: tmkp:d4d4d4d4d4d4
  value_type_id: biolink:AbstractLevelConceptCooccurrenceAnalysisResult
  description: A single result from computing cooccurrence metrics between two concepts that cooccur in the abstract
  attribute_source: infores:text-mining-provider-cooccurrence
  attributes:
    [SAME ATTRIBUTES AS ABOVE]
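To make the nesting concrete, here is a minimal Python sketch that derives a few of the scores from the raw counts and serializes one `supporting_study_result` as the JSON blob. It rests on several assumptions: the formulas are the textbook definitions of PMI, normalized PMI, mutual dependence, and normalized Google distance, and whether the KP computes them exactly this way is not stated here; `total` (the number of documents processed) is a hypothetical input that is not part of the proposed structure; and the helper names `cooccurrence_scores`, `count_attribute`, and `study_result` are invented for illustration. Descriptions, the supporting documents, and the remaining score variants are omitted for brevity.

```python
import json
import math

KP = "infores:text-mining-provider-cooccurrence"


def cooccurrence_scores(count1, count2, pair_count, total):
    """Textbook cooccurrence scores; requires 0 < pair_count < total.

    `total` is an assumed input (documents processed), not a proposed attribute.
    """
    p1, p2, p12 = count1 / total, count2 / total, pair_count / total
    pmi = math.log(p12 / (p1 * p2))
    log1, log2 = math.log(count1), math.log(count2)
    ngd = (max(log1, log2) - math.log(pair_count)) / (math.log(total) - min(log1, log2))
    return {
        "biolink:tmkp_normalized_google_distance": ngd,
        "biolink:tmkp_pointwise_mutual_information": pmi,
        "biolink:tmkp_normalized_pointwise_mutual_information": pmi / -math.log(p12),
        "biolink:tmkp_mutual_dependence": math.log(p12 ** 2 / (p1 * p2)),
    }


def count_attribute(type_id, value):
    """Build one count attribute typed as SIO:000794 (count)."""
    return {"attribute_type_id": type_id, "value": value,
            "value_type_id": "SIO:000794", "attribute_source": KP}


def study_result(result_id, level_class, count1, count2, pair_count, total):
    """Build one supporting_study_result attribute with nested counts and scores."""
    nested = [
        count_attribute("biolink:tmkp_concept1_count", count1),
        count_attribute("biolink:tmkp_concept2_count", count2),
        count_attribute("biolink:tmkp_concept_pair_count", pair_count),
    ]
    scores = cooccurrence_scores(count1, count2, pair_count, total)
    for type_id, score in scores.items():
        nested.append({"attribute_type_id": type_id, "value": round(score, 3),
                       "value_type_id": "EDAM:data_1772", "attribute_source": KP})
    return {
        "attribute_type_id": "biolink:supporting_study_result",
        "value": result_id,
        "value_type_id": level_class,
        "attribute_source": KP,
        "attributes": nested,
    }


# Counts taken from the example above; total=10_000 is a made-up corpus size.
result = study_result(
    "tmkp:a1a1a1a1a1a1",
    "biolink:DocumentLevelConceptCooccurrenceAnalysisResult",
    count1=123, count2=321, pair_count=2, total=10_000,
)
# Compact separators keep the blob on a single line for the edge TSV column.
attribute_json_blob = json.dumps([result], separators=(",", ":"))
```

A full blob would also carry the `original_knowledge_source` and `supporting_data_source` attributes, plus one study result per cooccurrence level.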