neo4j / apoc

Inconsistent treatment of escape character "\" when performing apoc.import.csv on nodes vs relationships #633

Closed: quan-xue closed this issue 2 months ago

quan-xue commented 2 months ago

Expected Behavior (Mandatory)

apoc.import.csv should escape characters consistently. For example, \ should escape special characters in both node files and relationship files.

Actual Behavior (Mandatory)

For node files, a \ placed in front of a character is not treated as an escape and is kept literally. For relationship files, the \ is treated as an escape character when resolving the IDs of the nodes the relationship attaches to.

How to Reproduce the Problem

Simple Dataset (where it's possible)

Node file

blah_code:ID(blah_code),:LABEL
806^04^150\\^123456,blah
2,address

Relationship file (note the same blah_code as above)

:START_ID(blah_code),:END_ID(blah_code),:TYPE
806^04^150\\^123456,2,friends_with

Steps (Mandatory)

  1. Create the node and relationship files from the sample datasets above.
  2. Run CALL apoc.import.csv([{fileName: 'file:test.csv', labels:['blah']}],[],{ignoreDuplicateNodes: true}); The node is created with id 806^04^150\\^123456, i.e. both backslashes are kept.
  3. Run CALL apoc.import.csv([{fileName: 'file:test.csv', labels:['blah']}],[{fileName: 'file:test_relationships.csv', type: 'friends_with'}],{ignoreDuplicateNodes: true}); This returns the error: Failed to invoke procedure apoc.import.csv: Caused by: java.lang.IllegalStateException: Node for id space blah_code and id 806^04^150\^123456 not found
  4. Change the relationship file to the following, doubling each of the two backslashes. The error goes away, which suggests that escaping is applied when relationships are created but not when nodes are created.
    :START_ID(blah_code),:END_ID(blah_code),:TYPE
    806^04^150\\\\^123456,2,friends_with
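
The mismatch in the steps above can be reproduced outside Neo4j. This is a minimal sketch that uses Python's csv module as a stand-in for the two code paths. It is not APOC's actual parser, and the helper name parse_first_field is hypothetical. It only assumes that one path reads \ literally and the other treats it as an escape character.

```python
import csv
import io

# The ID exactly as it appears in both CSV files: two literal backslashes.
RAW_ID = r"806^04^150\\^123456"


def parse_first_field(line, escapechar):
    """Parse one CSV line and return its first field.

    escapechar=None keeps backslashes literally (assumed node-file behaviour);
    escapechar="\\" treats a backslash as an escape (assumed relationship-file
    behaviour).
    """
    reader = csv.reader(io.StringIO(line), escapechar=escapechar)
    return next(reader)[0]


# Node import path: no escaping, so both backslashes survive.
node_id = parse_first_field(RAW_ID + ",blah", escapechar=None)

# Relationship import path: escaping collapses the two backslashes into one.
rel_start_id = parse_first_field(RAW_ID + ",2,friends_with", escapechar="\\")

# Workaround from step 4: four backslashes in the relationship file.
workaround_id = parse_first_field(
    r"806^04^150\\\\^123456" + ",2,friends_with", escapechar="\\"
)

print(node_id == rel_start_id)    # the lookup key differs from the stored id
print(node_id == workaround_id)   # doubling the backslashes makes them match
```

Under these assumptions the relationship lookup searches for an ID with a single backslash while the stored node ID has two, which matches the "Node for id space blah_code ... not found" error in step 3.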

Screenshots (where it's possible)

[Two screenshots, taken 2024-05-31 at 8:51:51 PM and 8:54:11 PM; images not available]

Specifications (Mandatory)

Currently used versions

Versions

quan-xue commented 2 months ago

I'm not sure whether this is caused by the recently merged fix that changes how the escape character is handled: https://github.com/neo4j/apoc/pull/572

gem-neo4j commented 2 months ago

Hi! Thanks for reporting, this does seem strange 🤔 Will take a look and get back to you :)

gem-neo4j commented 2 months ago

Fixed here, the fix will be released in the next major release which should be 5.22 :)

quan-xue commented 2 months ago

@gem-neo4j awesome, thanks so much for the quick fix! Might you know when 5.22 will be released?

gem-neo4j commented 2 months ago

We release roughly on a monthly cadence, and the 5.21 cutoff was last week, so 5.22 will be released in a little over a month I would guess :)