amundsen-io / amundsen

Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting with data.
https://www.amundsen.io/amundsen/
Apache License 2.0

Error running Neo4jCsvPublisher #2094

Closed femi-anthony closed 1 year ago

femi-anthony commented 1 year ago

I'm getting an error when trying to publish CSV data to an empty Neo4j database:

    job = DefaultJob(
        conf=job_config, task=DummyTask(), publisher=Neo4jCsvPublisher()
    )
    job.launch()
    ...
    neo4j.exceptions.CypherSyntaxError: {code: Neo.ClientError.Statement.SyntaxError}
    {message: Invalid constraint syntax, ON and ASSERT should not be used.
    Replace ON with FOR and ASSERT with REQUIRE. (line 2, column 1 (offset: 13))
    "            CREATE CONSTRAINT ON (node:Table) ASSERT node.key IS UNIQUE"
                 ^}

Version of libraries

amundsen-common==0.26.2
amundsen-databuilder==7.4.3
amundsen-rds==0.0.7
neo4j==5.5.0
neo4j-driver==4.4.10

Expected Behavior

No errors

Current Behavior

Exception occurs

Possible Solution

Change

CREATE CONSTRAINT ON (node:Table) ASSERT node.key IS UNIQUE

to

CREATE CONSTRAINT FOR (node:Table) REQUIRE node.key IS UNIQUE
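
The substitution above can be sketched as a small helper that picks the statement from the server's major version. This is only an illustration, not the publisher's actual code: the function name `unique_key_constraint` and its `neo4j_major_version` parameter are hypothetical, and the caller is assumed to already know which Neo4j major version it is talking to.

```python
import re

# Labels are interpolated into the statement text, so only accept
# plain identifiers (hypothetical safeguard for this sketch).
_LABEL_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def unique_key_constraint(label: str, neo4j_major_version: int) -> str:
    """Return a uniqueness-constraint statement for node.key.

    Uses the FOR ... REQUIRE form for Neo4j 5 and later, and the
    ON ... ASSERT form (the one rejected in the error above) otherwise.
    """
    if not _LABEL_RE.match(label):
        raise ValueError(f"unexpected label: {label!r}")
    if neo4j_major_version >= 5:
        return f"CREATE CONSTRAINT FOR (node:{label}) REQUIRE node.key IS UNIQUE"
    return f"CREATE CONSTRAINT ON (node:{label}) ASSERT node.key IS UNIQUE"
```

For example, `unique_key_constraint("Table", 5)` yields the second statement shown above, while passing `4` yields the original one.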

kristenarmes commented 1 year ago

Hi @femi-anthony, I just took a look, and unfortunately this syntax is no longer valid in Neo4j v5. Right now the publisher has no specific support for v5. We at Lyft are planning to upgrade in the near future, so we will either address this then, or feel free to take a look at it yourself. The solution should either be backwards compatible or come as a new publisher, since others are using older versions of Neo4j.

If you are interested in using a faster publisher (which will have the same issue for now), check out neo4j_csv_unwind_publisher.py. This is the one we will make compatible with v5 (or create a new version of) when we work on it.
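
One way the backwards-compatible option might look is to try the newer statement first and fall back to the older one on a syntax error. The sketch below is an assumption-laden illustration, not code from either publisher: `create_unique_key_constraint`, `run`, and `syntax_error` are hypothetical names, and `run` is assumed to raise the given exception type when the server rejects a statement. In real use, `run` could be a driver session's `run` method and `syntax_error` could be `neo4j.exceptions.CypherSyntaxError`.

```python
from typing import Callable, Type

NEW_SYNTAX = "CREATE CONSTRAINT FOR (node:{label}) REQUIRE node.key IS UNIQUE"
OLD_SYNTAX = "CREATE CONSTRAINT ON (node:{label}) ASSERT node.key IS UNIQUE"


def create_unique_key_constraint(
    run: Callable[[str], object],
    label: str,
    syntax_error: Type[BaseException] = Exception,
) -> str:
    """Create the constraint, returning the statement that succeeded.

    `run` executes one statement and is assumed to raise `syntax_error`
    if the server does not accept the syntax.
    """
    new_stmt = NEW_SYNTAX.format(label=label)
    try:
        run(new_stmt)
        return new_stmt
    except syntax_error:
        # Older servers may not understand FOR ... REQUIRE; retry with
        # the ON ... ASSERT form. A second failure propagates to the caller.
        old_stmt = OLD_SYNTAX.format(label=label)
        run(old_stmt)
        return old_stmt
```

Trying the statement rather than checking a version number avoids an extra round trip to discover the server version, at the cost of one failed statement on older servers.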

femi-anthony commented 1 year ago

@kristenarmes - thanks for your response. Is there now a GA version of Amundsen that is compatible with Neo4j v5? I just overrode the class locally in our implementation to resolve the issue.

kristenarmes commented 1 year ago

As far as I'm aware, there hasn't been open source development for compatibility with Neo4j v5 yet. We would like to work on this, but it is not in our immediate plans. We also welcome contributors toward this goal.