Closed robobenklein closed 3 years ago
File "/home/robo/code/WorldSyntaxTree/wsyntree_collector/neo4j_collector_worker.py", line 67, in batch_insert_WSTNode
nresults = tx.run(qi, {"entries": entries})
File "/home/robo/code/WorldSyntaxTree/venv/lib/python3.8/site-packages/neo4j_driver-4.1.1-py3.8.egg/neo4j/work/transaction.py", line 118, in run
result._tx_ready_run(query, parameters, **kwparameters)
File "/home/robo/code/WorldSyntaxTree/venv/lib/python3.8/site-packages/neo4j_driver-4.1.1-py3.8.egg/neo4j/work/result.py", line 57, in _tx_ready_run
self._run(query, parameters, None, None, None, **kwparameters)
File "/home/robo/code/WorldSyntaxTree/venv/lib/python3.8/site-packages/neo4j_driver-4.1.1-py3.8.egg/neo4j/work/result.py", line 101, in _run
self._attach()
File "/home/robo/code/WorldSyntaxTree/venv/lib/python3.8/site-packages/neo4j_driver-4.1.1-py3.8.egg/neo4j/work/result.py", line 202, in _attach
self._connection.fetch_message()
File "/home/robo/code/WorldSyntaxTree/venv/lib/python3.8/site-packages/neo4j_driver-4.1.1-py3.8.egg/neo4j/io/_bolt4x1.py", line 353, in fetch_message
response.on_failure(summary_metadata or {})
File "/home/robo/code/WorldSyntaxTree/venv/lib/python3.8/site-packages/neo4j_driver-4.1.1-py3.8.egg/neo4j/io/_bolt4x1.py", line 552, in on_failure
raise Neo4jError.hydrate(**metadata)
neo4j.exceptions.TransientError: {code: Neo.TransientError.Transaction.BookmarkTimeout} {message: Database 'top1k' not up to the requested version: 113071. Latest database version is 113054}
Not sure whether this is related to this PR, or whether it's just an artifact of the db not being able to keep up yet, caused by running analyze over linux with 128 workers.
not to be merged until https://github.com/neo4j/neo4j/issues/12686 is solved
Figured out how to do a constant-time batch write with only two queries! (The Python side is still O(N), but N is capped at the number of nodes in a single file, which are cheap to iterate.)
Since inserts are now (theoretically) constant-time, this should be ready for analyzing all of GitHub. Please compare results, both in performance and in tree correctness.
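For anyone reviewing, here is a rough sketch of what a two-query batch write can look like. It is not the actual code from this PR. The Cypher text, the property names (`file_id`, `type`, `parent`, `child`) and the `PARENT` relationship are all assumptions for illustration; only the `tx.run(query, {"entries": entries})` call shape and the `WSTNode` name come from the traceback above. The point is that the Python loop is O(N) over one file's nodes, while the number of round trips to the database stays at two regardless of N.

```python
from typing import Any, Dict, Iterable, List, Optional, Tuple

# Hypothetical Cypher; the real queries in the PR may differ.
CREATE_NODES = """
UNWIND $entries AS e
CREATE (n:WSTNode {file_id: e.file_id, preorder: e.preorder, type: e.type})
"""

LINK_PARENTS = """
UNWIND $entries AS e
MATCH (c:WSTNode {file_id: e.file_id, preorder: e.child})
MATCH (p:WSTNode {file_id: e.file_id, preorder: e.parent})
CREATE (c)-[:PARENT]->(p)
"""

# (preorder index, parent's preorder index or None for the root, node type)
NodeRow = Tuple[int, Optional[int], str]


def build_batch(
    file_id: str, nodes: Iterable[NodeRow]
) -> Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]:
    """Turn one file's nodes into two parameter lists: nodes and edges."""
    entries: List[Dict[str, Any]] = []
    edges: List[Dict[str, Any]] = []
    for preorder, parent, node_type in nodes:  # O(N) in Python, N = nodes in one file
        entries.append({"file_id": file_id, "preorder": preorder, "type": node_type})
        if parent is not None:  # the root has no parent edge
            edges.append({"file_id": file_id, "child": preorder, "parent": parent})
    return entries, edges


def write_file(tx: Any, file_id: str, nodes: Iterable[NodeRow]) -> None:
    """Issue exactly two queries per file, however many nodes it has."""
    entries, edges = build_batch(file_id, nodes)
    tx.run(CREATE_NODES, {"entries": entries})
    tx.run(LINK_PARENTS, {"entries": edges})
```

`tx` here is anything with a `run(query, parameters)` method, so the same function can be exercised against a stub in tests before pointing it at a real transaction.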
I also added the `preorder` property, which is the pre-order series number in the depth-first tree traversal. This should allow us to develop a test suite that compares Tree Sitter trees from TreeSitterCursorIterator directly against the data in neo4j (by item-by-item sequence comparison).
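To make the comparison idea concrete, here is a minimal sketch of pre-order numbering and an item-by-item sequence check. The `Node` class, `preorder_walk` and `first_mismatch` are hypothetical stand-ins, not the project's API. In a real test one sequence would come from TreeSitterCursorIterator and the other from a neo4j query ordered by `preorder`. Numbering from 0 is also an assumption.

```python
from dataclasses import dataclass, field
from itertools import zip_longest
from typing import Iterable, Iterator, List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical stand-in for a syntax tree node."""
    type: str
    children: List["Node"] = field(default_factory=list)


def preorder_walk(root: Node) -> Iterator[Tuple[int, str]]:
    """Yield (preorder index, node type) in depth-first pre-order."""
    stack = [root]
    index = 0  # assumption: numbering starts at 0
    while stack:
        node = stack.pop()
        yield index, node.type
        index += 1
        # Push children reversed so the leftmost child is visited first.
        stack.extend(reversed(node.children))


def first_mismatch(expected: Iterable, actual: Iterable) -> Optional[int]:
    """Return the position of the first differing item, or None if equal."""
    for i, (a, b) in enumerate(zip_longest(expected, actual)):
        if a != b:
            return i
    return None


tree = Node("module", [
    Node("function_definition", [Node("identifier"), Node("block")]),
    Node("comment"),
])
from_parser = list(preorder_walk(tree))
from_db = list(from_parser)  # placeholder for rows fetched from neo4j
print(first_mismatch(from_parser, from_db))
```

Because `zip_longest` pads the shorter sequence, a missing or extra node in the database shows up as a mismatch at the position where the sequences diverge, rather than being silently ignored.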