Ran into a discrepancy between the way the integration tests work and the way the back end works.
The test data pipeline outputs nodes and edges separately for Cypress, and in KGTK edge format for the back end. The separately output nodes and edges have the pre-KGTK semantics, in which nodes were considered separate (and separately sourced) from the edges they appeared in.
Given this scenario:
edge1 connects node1 and has sources P0, P1
edge2 connects node1 and has sources P0, P2
The test data pipeline outputs a single node (the first occurrence), with sources P0, P1, whereas KGTK treats the union of the sources of all edges involving a node as the node's sources (P0, P1, P2 in this scenario). The latter is the correct interpretation.
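To make the difference concrete, here is a minimal Python sketch of the two interpretations. It is not the mowgli-etl or KGTK code: the tuple layout, the function names, and the extra endpoints node2 and node3 are assumptions made up for illustration, since the scenario only names node1.

```python
from collections import defaultdict

# Hypothetical edge records: (edge id, subject node, object node, sources).
# node2 and node3 are invented endpoints; the scenario only names node1.
EDGES = [
    ("edge1", "node1", "node2", {"P0", "P1"}),
    ("edge2", "node1", "node3", {"P0", "P2"}),
]


def first_occurrence_node_sources(edges):
    """Mimic the described test-pipeline outcome: a node keeps the sources
    of the first edge it appears in; later edges are ignored."""
    node_sources = {}
    for _edge_id, subject, obj, sources in edges:
        for node in (subject, obj):
            # setdefault only stores a value the first time a node is seen.
            node_sources.setdefault(node, set(sources))
    return node_sources


def union_node_sources(edges):
    """KGTK-style interpretation: a node's sources are the union of the
    sources of every edge that involves the node."""
    node_sources = defaultdict(set)
    for _edge_id, subject, obj, sources in edges:
        for node in (subject, obj):
            node_sources[node] |= sources
    return dict(node_sources)


if __name__ == "__main__":
    print(sorted(first_occurrence_node_sources(EDGES)["node1"]))  # ['P0', 'P1']
    print(sorted(union_node_sources(EDGES)["node1"]))  # ['P0', 'P1', 'P2']
```

An integration test that asserts on node1's sources would pass against one interpretation and fail against the other, which is the discrepancy described here.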
Depends on the work to switch mowgli-etl to output only KGTK. We may also want to switch mowgli-etl to output only edges and not nodes.