fabric-testbed / InformationModel

FABRIC Information Model library
MIT License

Make FIM work with latest Neo4j #125

Closed ibaldin closed 1 year ago

ibaldin commented 2 years ago

Currently FIM works with Neo4j 4.1.6 and APOC 4.1.0.0. However, starting somewhere around APOC 4.1.0.11, Neo4j no longer looks at the Class property on edges and has a bug importing edge properties: unless the labels attribute is specified on the edge, the first key/value property is always ignored. See https://community.neo4j.com/t5/neo4j-graph-platform/apoc-import-graphml-in-4-3-and-above-ignores-relationship-edge/td-p/58577

Need to figure out a solution that is backward compatible (i.e. works with models created by prior versions (?)). Perhaps add labels property to NetworkX exports.
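For backward compatibility it may help to first detect which existing GraphML files are affected. The following is only a rough sketch: the function name `edges_missing_label` and the sample GraphML are made up for illustration and are not part of FIM, and it assumes the relevant marker is a `label` XML attribute on each `<edge>` element.

```python
import xml.etree.ElementTree as ET

GRAPHML_NS = {"g": "http://graphml.graphdrawing.org/xmlns"}


def edges_missing_label(graphml_text):
    """Return (source, target) pairs for edges with no 'label' XML attribute.

    Hypothetical helper, not part of FIM.
    """
    root = ET.fromstring(graphml_text)
    missing = []
    for edge in root.iterfind(".//g:edge", GRAPHML_NS):
        if not edge.get("label"):
            missing.append((edge.get("source"), edge.get("target")))
    return missing


# Invented sample: one edge without a label attribute, one with it.
SAMPLE_GRAPHML = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <key id="d0" for="edge" attr.name="Class" attr.type="string"/>
  <graph edgedefault="undirected">
    <node id="1"/>
    <node id="2"/>
    <node id="3"/>
    <edge source="1" target="2"><data key="d0">connects</data></edge>
    <edge source="2" target="3" label="has"><data key="d0">has</data></edge>
  </graph>
</graphml>
"""

print(edges_missing_label(SAMPLE_GRAPHML))
```

A non-empty result would suggest the file was produced by an older export and needs its edge labels filled in before import.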

ibaldin commented 1 year ago

Tried the latest neo4j:5.3.0 image available from Docker Hub. The Dockerfile and entrypoint structure have changed. At least in principle there is no need to build an alternative Docker image with apoc and gds, because the Docker entrypoint code can take environment variables to load the necessary plugins.

It can be done like so (note that there isn't a need to specify password in a file):

docker run -d \
  --user=$(id -u):$(id -g) \
  --name=neo4j-5 \
  --publish=7473:7473 \
  --publish=7474:7474 \
  --publish=7687:7687 \
  --volume=$(pwd)/neo4j/data:/data \
  --volume=$(pwd)/neo4j/imports:/import \
  -e NEO4J_AUTH=neo4j/password \
  -e NEO4J_PLUGINS='["apoc", "gds"]' \
  neo4j:5.3.0-community

In practice:

The bug with loading GraphML files described here is still present.

ibaldin commented 1 year ago

One possible approach is to modify XML coming out of NetworkX to copy 'label' attribute from 'Class' property. It can be done something like this:

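from lxml import etree

# GraphML namespace used in the exported files
ns = {'g': 'http://graphml.graphdrawing.org/xmlns'}

tree = etree.parse('simple-topo.graphml')

for e in tree.findall('/g:graph/g:edge', ns):
    print(e.tag, e.attrib)
    # assumes the first <data> child of the edge holds the Class property
    data = e.find('g:data', ns)
    # only set 'label' when it is missing and there is a value to copy
    if data is not None and data.text and not e.attrib.get('label'):
        e.set('label', data.text)

tree.write('simple-mod-topo.graphml')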

Notice that on import the FIM code copies Class properties to label attributes for nodes, and labels onto Class properties for relationships. But this isn't a problem here.

Code like the above should be included in both the NetworkX export and the Neo4j import workflows (so as to be able to deal with older graphs).
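To share the same step between the export and import paths, the transformation could be wrapped in a function. This is a minimal sketch under several assumptions: the name `copy_class_to_label` is hypothetical, it uses the standard library `xml.etree.ElementTree` instead of lxml to stay self-contained, and it locates the Class value through the GraphML `<key>` declarations rather than relying on the position of the `<data>` child.

```python
import xml.etree.ElementTree as ET

NS_URI = "http://graphml.graphdrawing.org/xmlns"
NS = {"g": NS_URI}


def copy_class_to_label(graphml_text):
    """Return GraphML text where each edge lacking 'label' gets its Class value.

    Hypothetical helper, not part of FIM.
    """
    # Serialize GraphML as the default namespace instead of an ns0: prefix.
    ET.register_namespace("", NS_URI)
    root = ET.fromstring(graphml_text)

    # Ids of <key> declarations named 'Class' that apply to edges.
    # Per the GraphML schema, a missing 'for' attribute means 'all'.
    class_keys = {
        key.get("id")
        for key in root.iterfind("g:key", NS)
        if key.get("attr.name") == "Class"
        and key.get("for") in ("edge", "all", None)
    }

    for edge in root.iterfind(".//g:edge", NS):
        if edge.get("label"):
            continue  # leave labels that are already present untouched
        for data in edge.iterfind("g:data", NS):
            if data.get("key") in class_keys and data.text:
                edge.set("label", data.text)
                break

    return ET.tostring(root, encoding="unicode")


# Invented sample: Class is deliberately not the first <data> child.
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <key id="d0" for="edge" attr.name="Class" attr.type="string"/>
  <key id="d1" for="edge" attr.name="Name" attr.type="string"/>
  <graph edgedefault="undirected">
    <node id="1"/>
    <node id="2"/>
    <node id="3"/>
    <edge source="1" target="2">
      <data key="d1">link1</data>
      <data key="d0">connects</data>
    </edge>
    <edge source="2" target="3" label="has"><data key="d0">has</data></edge>
  </graph>
</graphml>
"""

converted = copy_class_to_label(SAMPLE)
labels = [e.get("label") for e in ET.fromstring(converted).iterfind(".//g:edge", NS)]
print(labels)
```

Because edges that already carry a label are skipped, running the function on a graph that has been converted once should leave it unchanged, which is what lets the same step sit in both workflows.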

ibaldin commented 1 year ago

Known things to deal with:

ibaldin commented 1 year ago

Code changes largely addressed in 32588fda099c38769c258107fd46f30ca9155af1

These include:

  1. Changes to validation queries and other query types (replacing pattern matching with pattern generators)
  2. Streamlining/updating the code that exports GraphML from, and imports it into, NetworkX and Neo4j so that it is as uniform as possible.
  3. Updates to test cases.