neo4j-contrib / neo4j-apoc-procedures

Awesome Procedures On Cypher for Neo4j - codenamed "apoc"
https://neo4j.com/labs/apoc
Apache License 2.0

Export with apoc.export.json.all with stream=true returns no data #3018

Open HEnquist opened 2 years ago

HEnquist commented 2 years ago

Expected Behavior (Mandatory)

Running CALL apoc.export.json.all(null, {stream: true}) streams useful data.

Actual Behavior (Mandatory)

Running CALL apoc.export.json.all(null, {stream: true}) completes without error but returns no data.

How to Reproduce the Problem

Python script:

from neo4j import GraphDatabase

# Modify!
DATABASE = "neo4j"
HOST = "localhost"
PORT = 7687
USER = "abc123"
PASS =  "abc123"

uri = "neo4j://{}:{}".format(HOST, PORT)
driver = GraphDatabase.driver(uri, auth=(USER, PASS))

def get_data(tx):
    # This works
    #result = tx.run("CALL apoc.export.csv.all(null, {stream: true, useTypes: true})")

    # This also works
    #result = tx.run('CALL apoc.export.json.query("MATCH (n) RETURN n", null, {stream: true})')

    # This gives just empty data
    result = tx.run("CALL apoc.export.json.all(null, {stream: true})")

    return result.data()

with driver.session(database=DATABASE) as session:
    data = session.read_transaction(get_data)
    with open("dump.json", "w") as f:
        for chunk in data:
            f.write(chunk["data"])

driver.close()

Simple Dataset (where it's possible)

Any dataset.

Steps (Mandatory)

  1. Run the python script above against any database containing some data
  2. Check the new file "dump.json"
  3. The file is empty
  4. Change to one of the commented-out queries and run again.
  5. dump.json contains data.
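
Steps 2 and 3 can also be checked in code rather than by opening the file. The following is a minimal sketch, not part of the original report: `write_chunks` is a hypothetical helper, and it assumes each record is a dict with a `data` key, as returned by `result.data()` in the script above. It skips records whose `data` is missing or empty and reports how much was actually written.

```python
import io


def write_chunks(records, out):
    """Write each record's non-empty 'data' value to the file-like object `out`.

    Returns a (chunks_written, characters_written) tuple so the caller can
    tell an empty export apart from a successful one.
    """
    chunks_written = 0
    chars_written = 0
    for record in records:
        chunk = record.get("data")
        if not chunk:
            # Skip records where 'data' is None, missing, or an empty string.
            continue
        out.write(chunk)
        chunks_written += 1
        chars_written += len(chunk)
    return chunks_written, chars_written


# Usage with an in-memory buffer; the sample records are made up.
buf = io.StringIO()
count, size = write_chunks([{"data": '{"type":"node"}\n'}, {"data": None}], buf)
print(count, size)
```

In the reproduction script, the same function could be called with the open `dump.json` file handle in place of the buffer; a result of `(0, 0)` would correspond to the empty file described in step 3.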

Screenshots (where it's possible)

Specifications (Mandatory)

Currently used versions

Versions

jexp commented 2 years ago

I tried it myself: if you run apoc.export.json.all(), e.g. on a sandbox, it hangs forever, but if you use apoc.export.json.query("MATCH (n) RETURN n", ...) it works. Looks like a bug.

vga91 commented 2 years ago

@HEnquist This is weird, I just tried with the same apoc and neo4j versions and it seems to work correctly. I tried using Neo4j Desktop and the movie graph.

Can you check if the problem is present with the latest apoc version 4.4.0.6, and (if you haven't already) with neo4j browser or something else other than python driver?

HEnquist commented 2 years ago

Thanks for looking! I tried again with neo4j 4.4.8, and apoc 4.4.0.6, and yes now it does work! Both for the movie dataset and a much larger one. It works both in Neo4j desktop and using the python script above.

HEnquist commented 2 years ago

We just updated the version in our cloud environment, but unfortunately I still get the same results. Dumping as csv works, while json doesn't. When I do the same on a local database (on my laptop) it's fine.

I'm not sure where I should be looking for clues. Do you have any suggestions for things to try, or logfiles to dig through?

vga91 commented 2 years ago

@HEnquist So most likely the error depends on the cloud environment.

Can you share some more information on the environment, that is, whether it's on Docker / Aura / other? And, depending on the environment type, can you provide the neo4j.conf / docker-compose.yml file / docker run command / etc.?

Also, when you run apoc.export.json.all, is anything written to the neo4j.log and debug.log files?
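
One way to look for relevant entries is to filter the log files for warning and error lines. This is only a rough sketch under stated assumptions: `find_problem_lines` is a hypothetical helper, and it assumes the log level appears as a separate whitespace-delimited token (`WARN`, `ERROR`) on each line, which may not hold for every logging configuration.

```python
def find_problem_lines(lines, levels=("WARN", "ERROR")):
    """Return the lines that contain one of `levels` as a standalone token."""
    hits = []
    for line in lines:
        tokens = line.split()
        if any(level in tokens for level in levels):
            hits.append(line.rstrip("\n"))
    return hits


# Usage: the sample lines are made up for illustration.
sample = [
    "INFO [example.Component] Started.\n",
    "WARN [example.Component] Something looked off.\n",
]
for hit in find_problem_lines(sample):
    print(hit)
```

Against a real file, the function could be given an open file handle, for example `open("logs/debug.log")`, with the path adjusted to the installation.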

jexp commented 2 years ago

Just tried it again and it works for me from cypher-shell in plain mode. In the browser it would return too much data to display.

This is on the recommendations sandbox: apoc version 4.4.0.8

echo 'CALL apoc.export.json.all(null, {stream: true});' | bin/cypher-shell -a bolt://3.218.208.192:7687 -u neo4j -p <password> --format plain > test.txt
wc -l test.txt 
  195125 test.txt
ls -lh test.txt 
-rw-r--r--  1 mh  staff    49M 25 Aug 13:45 test.txt

nacnudus commented 1 year ago

I get the same fault in Docker, using neo4j:4.4.8-community.

cypher-shell "CALL apoc.export.json.all(null,{useTypes:true, stream: true}) YIELD file, nodes, relationships, properties, data RETURN file, nodes, relationships, properties, data;"

In logs/debug.log is WARN [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=288, gcTime=0, gcCount=0}

vga91 commented 9 months ago

@nacnudus @HEnquist Is the problem still present?

nacnudus commented 9 months ago

@vga91 Sorry, I no longer use Neo4j.