dhimmel / learn

Machine learning and feature extraction for the Rephetio project
https://doi.org/10.15363/thinklab.d210

neo4j import failed #11

Open maggielee1111 opened 1 year ago

maggielee1111 commented 1 year ago

Hi Daniel, I am trying to export hetnets to Neo4j by running neo4j-import.ipynb with neo4j-community-3.5.1 on Windows 10. The notebook creates a new Neo4j instance, modifies the configuration files, starts the Neo4j server, reads a graph from a file, exports it to Neo4j, and then stops the server. I have followed all the required steps, but I am running into an issue that I cannot resolve on my own. Here is the export code (I am exporting a single hetnet.json.bz2 instead of using 'ProcessPool'):

import os
import subprocess

neo4j_bin = os.path.join('D:/integrate1111/integrate/neo4j/neo4j-community-3.5.1_rephetio-v2.0.1/', 'bin', 'neo4j.bat')
neo4j_version = 'neo4j-community-3.5.1'
db_name = 'rephetio-v2.0.1'
listen_address_0 = 7474
connector_0 = 7687
# Create a Neo4j instance and export the hetnet to it
neo4j_dir = create_instance(neo4j_version, db_name, listen_address_0, connector_0, overwrite=True)
hetnet_to_neo4j(path='D:/integrate1111/integrate/data/hetnet.json.bz2', neo4j_dir=neo4j_dir, listen_address=listen_address_0)
# Check that the Neo4j start command succeeds
result = subprocess.run([neo4j_bin, 'start'])
if result.returncode != 0:
    print(f'Error starting Neo4j: {result.returncode}')
else:
    print('Neo4j started successfully')

It turns out that the 47031 nodes in hetnet.json.bz2 were imported into Neo4j successfully, but the relationships were not. The output is as follows:

neo4j\neo4j-community-3.5.1_rephetio-v2.0.1
Starting neo4j server with neo4j\neo4j-community-3.5.1_rephetio-v2.0.1\bin\neo4j.bat
Reading graph from D:/integrate1111/integrate/data/hetnet.json.bz2
Exporting graph to neo4j at http://localhost:7474/db/data/
neo4j\neo4j-community-3.5.1_rephetio-v2.0.1 'Graph' object has no attribute 'find_one'
Stopping neo4j server
Traceback (most recent call last):
  File "C:\Users\我的电脑\AppData\Local\Temp\ipykernel_105868\3048707244.py", line 16, in hetnet_to_neo4j
    hetnetpy.neo4j.export_neo4j(graph, uri, 1000, 250)
  File "c:\anaconda\envs\myenv\Lib\site-packages\hetnetpy\neo4j.py", line 79, in export_neo4j
    source = db_graph.find_one(source_label, "identifier", edge.source.identifier)
             ^^^^^^^^^^^^^^^^^
AttributeError: 'Graph' object has no attribute 'find_one'
Neo4j started successfully

I am hoping you might be able to provide some guidance on how to resolve this issue. Any hints, tips or advice would be greatly appreciated. Thank you very much for your time and consideration!

dhimmel commented 1 year ago

Looks like the error is occurring on the db_graph.find_one line of hetnetpy.neo4j (line 79 in your traceback). db_graph is a py2neo.Graph object. I'd look into py2neo: perhaps their API changed such that find_one is no longer a method.
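One quick way to test that hypothesis is to ask the installed class whether it still exposes the method. This is only a minimal sketch: the helper name `lookup_api` is hypothetical (it is not part of hetnetpy or py2neo), and the `nodes` check assumes that newer py2neo releases expose a `nodes` attribute on `Graph`, as the replacement code later in this thread suggests.

```python
def lookup_api(graph_cls):
    """Report which node-lookup style a Graph-like class exposes."""
    if hasattr(graph_cls, "find_one"):
        return "find_one"      # legacy call used by hetnetpy.neo4j
    if hasattr(graph_cls, "nodes"):
        return "nodes.match"   # matcher-style lookup (assumed for newer py2neo)
    return "unknown"


try:
    from py2neo import Graph  # only available if py2neo is installed
except ImportError:
    Graph = None

if Graph is not None:
    print("py2neo.Graph lookup API:", lookup_api(Graph))
else:
    print("py2neo is not installed in this environment")
```

If this reports anything other than `find_one`, the AttributeError in the traceback is expected with that py2neo installation.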

maggielee1111 commented 1 year ago

Thank you so much for your prompt and insightful response. Your suggestion does appear to be the root cause of the issue: according to a related discussion, the find_one method seems to have been deprecated in py2neo v4 and onwards. I will try either downgrading py2neo or modifying hetnetpy.neo4j to adapt to the changes in py2neo.

maggielee1111 commented 1 year ago

In case it is useful, here is some feedback. I changed the two find_one lines to

source = db_graph.nodes.match(source_label, identifier=edge.source.identifier).first()
target = db_graph.nodes.match(target_label, identifier=edge.target.identifier).first()

and the hetnet.json.bz2 was imported successfully with nodes and relationships.
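For anyone who needs the same code to run against both older and newer py2neo installations, the lookup could be wrapped in a small helper. This is a sketch under assumptions rather than the fix that was actually submitted: `find_node` is a hypothetical name, and it assumes the older `find_one(label, property_key, property_value)` call shape shown in the traceback and the `nodes.match(...).first()` form shown above.

```python
def find_node(db_graph, label, identifier):
    """Return the node with the given label and identifier, or None."""
    if hasattr(db_graph, "find_one"):
        # Older py2neo: the call hetnetpy.neo4j made in the traceback.
        return db_graph.find_one(label, "identifier", identifier)
    # Newer py2neo: the replacement lookup from this thread.
    return db_graph.nodes.match(label, identifier=identifier).first()
```

Inside export_neo4j, the two lookups would then read `source = find_node(db_graph, source_label, edge.source.identifier)` and `target = find_node(db_graph, target_label, edge.target.identifier)`.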

dhimmel commented 1 year ago

@maggielee1111 awesome! Just so we know what version this works with, can you see which py2neo version you're using?

Would you be interested in submitting your fix as a PR in hetnetpy?

maggielee1111 commented 1 year ago

Yes, sure! The py2neo version is 2021.2.3 (Name: py2neo, Version: 2021.2.3), although it does not intuitively look like a post-v4 release because of the calendar versioning scheme. Thank you for offering! I would be more than happy to submit my fix. Please let me know if there is any further information I need to provide.
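Because the calendar-style number is easy to misread, a version check can compare the leading numeric component instead of eyeballing it. The sketch below uses only the standard library; `release_number` and `is_v4_or_later` are hypothetical helper names, and the threshold of 4 simply follows the observation earlier in this thread that find_one seems to be unavailable from v4 onwards.

```python
from importlib import metadata
from itertools import takewhile


def release_number(version):
    """Leading numeric component of a version string, or 0 if there is none."""
    head = version.split(".")[0]
    digits = "".join(takewhile(str.isdigit, head))
    return int(digits) if digits else 0


def is_v4_or_later(version):
    # Calendar-style versions such as 2021.2.3 compare as later than 4.x.
    return release_number(version) >= 4


try:
    installed = metadata.version("py2neo")
except metadata.PackageNotFoundError:
    installed = None  # py2neo is not installed in this environment

if installed is not None:
    print(installed, "is v4 or later:", is_v4_or_later(installed))
```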

dhimmel commented 1 year ago

> I would be more than happy to submit my fix

Awesome, start by opening a PR that changes those two lines and then I can review to see if we should make any additional changes!

maggielee1111 commented 1 year ago

Okay! I have just opened a PR that changes these lines. Please check it out and let me know if there is anything further to be done.