nodestream-proj / nodestream-plugin-neo4j

Database Connector Implementation for Neo4j

APOC Procedures Warning Not Clear [BUG] #6

Open acapria opened 6 months ago

acapria commented 6 months ago

Describe the bug

I was trying to download nodestream for the first time, and I got to the point of uploading my file to my graph. However, my graph was not equipped to handle nodestream because the APOC procedures were not installed on it. The message, pasted below under Additional context, was not clear.

To Reproduce

Try to upload a file to a graph that does not have the APOC procedures installed.

Expected behavior

Update the warning message to be clearer.

Additional context

Exceptions in Pipeline:
    Exceptions in StepExecutor 0 (FileExtractor):
        Exception in Work Body:
              File "/Users/acapria/.pyenv/versions/3.11.8/lib/python3.11/site-packages/nodestream/pipeline/pipeline.py", line 199, in try_work_body
                await self.work_body()
              File "/Users/acapria/.pyenv/versions/3.11.8/lib/python3.11/site-packages/nodestream/pipeline/pipeline.py", line 176, in work_body
                await self.submit_object_or_die_trying(record)
              File "/Users/acapria/.pyenv/versions/3.11.8/lib/python3.11/site-packages/nodestream/pipeline/pipeline.py", line 151, in submit_object_or_die_trying
                raise ForwardProgressHalted(PRECHECK_MESSAGE)

    Exceptions in StepExecutor 1 (Interpreter):
        Exception in Work Body:
              File "/Users/acapria/.pyenv/versions/3.11.8/lib/python3.11/site-packages/nodestream/pipeline/pipeline.py", line 199, in try_work_body
                await self.work_body()
              File "/Users/acapria/.pyenv/versions/3.11.8/lib/python3.11/site-packages/nodestream/pipeline/pipeline.py", line 176, in work_body
                await self.submit_object_or_die_trying(record)
              File "/Users/acapria/.pyenv/versions/3.11.8/lib/python3.11/site-packages/nodestream/pipeline/pipeline.py", line 151, in submit_object_or_die_trying
                raise ForwardProgressHalted(PRECHECK_MESSAGE)

    Exceptions in StepExecutor 2 (GraphDatabaseWriter):
        Exception in Work Body:
              File "/Users/acapria/.pyenv/versions/3.11.8/lib/python3.11/site-packages/nodestream/pipeline/pipeline.py", line 199, in try_work_body
                await self.work_body()
              File "/Users/acapria/.pyenv/versions/3.11.8/lib/python3.11/site-packages/nodestream/pipeline/pipeline.py", line 174, in work_body
                async for index, record in enumerate_async(results):
              File "/Users/acapria/.pyenv/versions/3.11.8/lib/python3.11/site-packages/nodestream/pipeline/pipeline.py", line 23, in enumerate_async
                async for item in iterable:
              File "/Users/acapria/.pyenv/versions/3.11.8/lib/python3.11/site-packages/nodestream/pipeline/writers.py", line 24, in handle_async_record_stream
                await self.write_record(record)
              File "/Users/acapria/.pyenv/versions/3.11.8/lib/python3.11/site-packages/nodestream/databases/writer.py", line 44, in write_record
                await self.flush()
              File "/Users/acapria/.pyenv/versions/3.11.8/lib/python3.11/site-packages/nodestream/databases/writer.py", line 37, in flush
                await self.ingest_strategy.flush()
              File "/Users/acapria/.pyenv/versions/3.11.8/lib/python3.11/site-packages/nodestream/databases/debounced_ingest_strategy.py", line 79, in flush
                await self.flush_nodes_updates()
              File "/Users/acapria/.pyenv/versions/3.11.8/lib/python3.11/site-packages/nodestream/databases/query_executor_with_statistics.py", line 29, in upsert_nodes_in_bulk_with_same_operation
                await self.inner.upsert_nodes_in_bulk_with_same_operation(operation, nodes)
              File "/Users/acapria/.pyenv/versions/3.11.8/lib/python3.11/site-packages/nodestream/databases/neo4j/query_executor.py", line 57, in upsert_nodes_in_bulk_with_same_operation
                await self.execute_query_batch(batched_query)
              File "/Users/acapria/.pyenv/versions/3.11.8/lib/python3.11/site-packages/nodestream/databases/neo4j/query_executor.py", line 39, in execute_query_batch
                await self.execute(
              File "/Users/acapria/.pyenv/versions/3.11.8/lib/python3.11/site-packages/nodestream/databases/neo4j/query_executor.py", line 96, in execute
                result = await self.driver.execute_query(
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
              File "/Users/acapria/.pyenv/versions/3.11.8/lib/python3.11/site-packages/neo4j/_async/driver.py", line 903, in execute_query
                return await session._run_transaction(
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
              File "/Users/acapria/.pyenv/versions/3.11.8/lib/python3.11/site-packages/neo4j/_async/work/session.py", line 552, in _run_transaction
                result = await transaction_function(tx, *args, **kwargs)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
              File "/Users/acapria/.pyenv/versions/3.11.8/lib/python3.11/site-packages/neo4j/_async/driver.py", line 1228, in _work
                res = await tx.run(query, parameters)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
              File "/Users/acapria/.pyenv/versions/3.11.8/lib/python3.11/site-packages/neo4j/_async/work/transaction.py", line 168, in run
                await result._tx_ready_run(query, parameters)
              File "/Users/acapria/.pyenv/versions/3.11.8/lib/python3.11/site-packages/neo4j/_async/work/result.py", line 131, in _tx_ready_run
                await self._run(query, parameters, None, None, None, None, None, None)
              File "/Users/acapria/.pyenv/versions/3.11.8/lib/python3.11/site-packages/neo4j/_async/work/result.py", line 181, in _run
                await self._attach()
              File "/Users/acapria/.pyenv/versions/3.11.8/lib/python3.11/site-packages/neo4j/_async/work/result.py", line 301, in _attach
                await self._connection.fetch_message()
              File "/Users/acapria/.pyenv/versions/3.11.8/lib/python3.11/site-packages/neo4j/_async/io/_common.py", line 188, in inner
                await coroutine_func(*args, **kwargs)
              File "/Users/acapria/.pyenv/versions/3.11.8/lib/python3.11/site-packages/neo4j/_async/io/_bolt.py", line 849, in fetch_message
                res = await self._process_message(tag, fields)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
              File "/Users/acapria/.pyenv/versions/3.11.8/lib/python3.11/site-packages/neo4j/_async/io/_bolt5.py", line 369, in _process_message
                await response.on_failure(summary_metadata or {})
              File "/Users/acapria/.pyenv/versions/3.11.8/lib/python3.11/site-packages/neo4j/_async/io/_common.py", line 245, in on_failure

zprobst commented 5 months ago

Thanks for the issue!

I moved this over to nodestream-plugin-neo4j, which is where the Neo4j-specific handling for nodestream lives.

It seems to me that we can probably do a "preflight" check before starting the ingest. One option is to try calling apoc.version() and, if that query fails, output a better error message before we start ingesting.
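
A rough sketch of what such a preflight check could look like, for discussion only. It assumes the driver object exposes an async `execute_query` method, as the `self.driver.execute_query(` call in the traceback suggests. `ApocNotInstalledError`, `APOC_HELP`, `preflight_check_apoc`, and the two fake drivers are hypothetical names made up for this example; none of them exist in nodestream or the neo4j driver. The fake drivers only stand in for a real connection so the sketch runs without a database.

```python
import asyncio


class ApocNotInstalledError(RuntimeError):
    """Hypothetical error type; not part of nodestream or the neo4j driver."""


# Hypothetical wording for the clearer message requested in this issue.
APOC_HELP = (
    "The Neo4j database does not appear to have the APOC procedures installed "
    "(the query `RETURN apoc.version()` failed). Install APOC on the target "
    "database and run the pipeline again."
)


async def preflight_check_apoc(driver):
    """Run `RETURN apoc.version()` once, before any records are ingested.

    `driver` is any object with an async `execute_query(query)` method.
    """
    try:
        await driver.execute_query("RETURN apoc.version() AS version")
    except Exception as error:
        # Assumption: any failure of this query is reported as "APOC missing".
        # A real check would likely inspect the error before deciding, since
        # a connection or auth failure would also end up here.
        raise ApocNotInstalledError(APOC_HELP) from error


class FakeDriverWithoutApoc:
    """Stand-in driver that always fails, like a server lacking APOC."""

    async def execute_query(self, query):
        raise RuntimeError("Unknown function 'apoc.version'")


class FakeDriverWithApoc:
    """Stand-in driver whose query succeeds."""

    async def execute_query(self, query):
        return [{"version": "example"}]


def run_demo():
    """Return the message a user would see when APOC is missing."""
    try:
        asyncio.run(preflight_check_apoc(FakeDriverWithoutApoc()))
    except ApocNotInstalledError as error:
        return str(error)
    return "ok"
```

Because the check runs before the first record is submitted, the user would see one short message instead of the per-step traceback pasted in the issue.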

angelosantos4 commented 5 months ago

Also, it is a good point that, when debugging, that error message may be a bit too verbose. We can talk about moving it out of -v and into -vv.
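
To make the -v versus -vv idea concrete, here is a small sketch of gating the full traceback on a verbosity count. It assumes the CLI turns repeated `-v` flags into an integer (1 for -v, 2 for -vv); `format_pipeline_error` and `demo` are hypothetical helpers for illustration and are not nodestream functions.

```python
import traceback


def format_pipeline_error(error, verbosity):
    """Return the text shown for `error` at a given verbosity count.

    verbosity 0 or 1 (no flag or -v): one-line summary only.
    verbosity 2 or more (-vv): summary followed by the full traceback.
    """
    summary = f"{type(error).__name__}: {error}"
    if verbosity < 2:
        return summary
    details = "".join(
        traceback.format_exception(type(error), error, error.__traceback__)
    )
    return f"{summary}\n{details}"


def demo(verbosity):
    """Raise and format a sample error at the given verbosity."""
    try:
        raise RuntimeError("APOC procedures are not installed")
    except RuntimeError as error:
        return format_pipeline_error(error, verbosity)
```

With this split, `demo(1)` yields only the one-line summary, while `demo(2)` appends the stack trace for anyone who needs to dig further.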