COPY REL fails on "Unable to find primary key value" under the case of large num of long string primary keys (#2916)

Summary: "COPY REL" fails with "Unable to find primary key value" error when loading a large number of long string primary keys.

Possible Solution

Based on the provided information, the issue seems to be related to a race condition or improper synchronization when importing relationships (COPY E) in a multi-threaded environment. The primary keys are long strings, and the problem does not occur when using a single thread.

To address the issue, consider the following steps:

Ensure that the primary key values are fully loaded and indexed before starting the multi-threaded COPY E operation. This may involve a synchronization barrier or a pre-import validation step.
Review the CopyRel::executeInternal function in src/processor/operator/persistent/copy_rel.cpp to ensure that the node group offsets are correctly managed across threads. This function is responsible for writing the relationship data to the table and may need additional synchronization or consistency checks when running with multiple threads.
If the COPY E operation involves looking up primary key values in the N table, ensure that the lookup mechanism is thread-safe and that any shared data structures are properly synchronized.
Consider implementing a locking mechanism or transaction isolation level that prevents concurrent threads from interfering with each other when accessing or modifying shared data structures.
Run the COPY E operation on progressively smaller subsets of the data to find out how the failure depends on the number of rows and the length of the primary keys.
If the issue persists, limit the number of threads used for the COPY as a workaround; this reduces contention, and the problem is not observed with a single thread.
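
The pre-import validation step mentioned above could look roughly like the following. This is a minimal sketch and not part of Kuzu: the function name `find_missing_endpoints` is hypothetical, and it assumes header-less CSV input where the node file's first column is the primary key and the relationship file's first two columns are the FROM and TO keys.

```python
import csv
import io


def find_missing_endpoints(node_csv, rel_csv):
    """Return (line_number, key) pairs for endpoints with no matching node.

    Assumed layout (hypothetical): no header rows; node primary key in the
    first column; relationship FROM and TO keys in the first two columns.
    """
    # Collect every primary key from the node file into a set.
    node_keys = {row[0] for row in csv.reader(node_csv) if row}
    missing = []
    for line_no, row in enumerate(csv.reader(rel_csv), start=1):
        if not row:
            continue
        # Check both endpoints of the relationship against the node keys.
        for key in row[:2]:
            if key not in node_keys:
                missing.append((line_no, key))
    return missing


# Tiny in-memory example using long string keys.
long_a = "a" * 200
long_b = "b" * 200
long_c = "c" * 200  # deliberately absent from the node file
nodes = io.StringIO(f"{long_a}\n{long_b}\n")
rels = io.StringIO(f"{long_a},{long_b}\n{long_a},{long_c}\n")
missing = find_missing_endpoints(nodes, rels)
print(len(missing), "missing endpoint(s)")
```

If a check like this reports missing keys, the error reflects a genuine data problem. If it reports none and the multi-threaded COPY still fails, that points toward the synchronization issue rather than the input files.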

Remember to thoroughly test the changes in a controlled environment before deploying them to production to ensure that the issue is resolved without introducing new problems.
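
One way to exercise the "index fully built before lookups start" idea in a controlled setting is a small two-phase harness that compares single-threaded and multi-threaded results. The sketch below is illustrative only: a Python dict stands in for the primary-key index, and `build_index` and `lookup_all` are hypothetical helpers, not Kuzu functions.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def build_index(keys, num_threads):
    """Phase 1: insert keys into a shared dict from several threads."""
    index = {}
    lock = threading.Lock()

    def insert_chunk(chunk_start):
        # Each worker handles every num_threads-th key, starting at chunk_start.
        for offset in range(chunk_start, len(keys), num_threads):
            with lock:  # guard the shared structure during writes
                index[keys[offset]] = offset

    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # Consuming the results and leaving the with-block waits for all
        # workers, which acts as the barrier before any lookup starts.
        list(pool.map(insert_chunk, range(num_threads)))
    return index


def lookup_all(index, wanted, num_threads):
    """Phase 2: read-only lookups; a missing key raises KeyError."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return list(pool.map(index.__getitem__, wanted))


# Long string keys, similar in spirit to the failing workload.
keys = [f"{i:08d}" + "x" * 192 for i in range(2000)]
single = lookup_all(build_index(keys, 1), keys, 1)
multi = lookup_all(build_index(keys, 4), keys, 4)
print("results match:", single == multi)
```

The design choice here is that writes happen only in phase 1 and reads only in phase 2, so the lookup phase needs no locking. Whether the actual CopyRel code path can be structured this way is an open question for whoever reviews it.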

Code snippets to check

src/processor/operator/persistent/copy_rel.cpp

This snippet contains the implementation of the CopyRel operator, which is directly involved in the COPY E operation that is failing. It is likely that the issue is related to how the COPY E operation is being executed, especially in a multi-threaded context.

test/copy/e2e_copy_transaction_test.cpp

This file includes tests for the COPY operation on relationship tables. It is relevant because it may help reproduce the failing COPY E operation under test conditions similar to those described in the issue.
