rdkit / neo4j-rdkit


change node creation behaviour ("luri") #13

Closed pi-at-git closed 3 years ago

pi-at-git commented 3 years ago

`CREATE (n:Entity:Chemical:Compound:Structure { luri: 'test1', preferred_name: 'chloro benzene', smiles: 'ClC1=CC=CC=C1'})` creates a node with the chemical structure for chloro benzene.

The property "luri" is supposed to be a unique resource identifier for the node (originally: legacy URI). I believe it makes sense to have such a resource identifier on nodes bearing structures. Suggestion: if the luri property is not explicitly set by the user during data ingestion, the plugin should create it and assign a UUID to it.
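The suggested fallback could look roughly like this. It is only a Python sketch of the proposed behaviour, not the plugin's code; the helper name `ensure_luri` is hypothetical, and treating an empty luri as "not set" is an assumption.

```python
import uuid


def ensure_luri(properties):
    """Return a copy of the node properties that always carries a 'luri'.

    Hypothetical helper: a user-supplied luri is kept as-is; a missing or
    empty one is replaced by a random (version 4) UUID string.
    """
    result = dict(properties)  # copy so the caller's dict is not modified
    if not result.get("luri"):
        result["luri"] = str(uuid.uuid4())
    return result


# luri set deliberately by the user: left untouched
explicit = ensure_luri({"luri": "test1", "smiles": "ClC1=CC=CC=C1"})

# luri omitted at ingestion: a UUID is generated
generated = ensure_luri({"preferred_name": "chloro benzene",
                         "smiles": "ClC1=CC=CC=C1"})

print(explicit["luri"])
print(generated["luri"])
```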

The query `CALL org.rdkit.search.exact.smiles(['Chemical', 'Structure'], 'ClC1=CC=CC=C1')` yields:

{ "columns" : [ "name", "luri", "canonical_smiles" ], "data" : [ [ "chloro benzene", "test1", "Clc1ccccc1" ] ] }
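To make the column mismatch easier to see, the result above can be turned into one record per row. This is a small Python sketch that only reshapes the JSON quoted above; the variable names are illustrative.

```python
import json

# The result exactly as quoted above
raw = ('{ "columns" : [ "name", "luri", "canonical_smiles" ], '
       '"data" : [ [ "chloro benzene", "test1", "Clc1ccccc1" ] ] }')

result = json.loads(raw)

# Pair each column name with the value at the same position in each row
rows = [dict(zip(result["columns"], row)) for row in result["data"]]

for row in rows:
    print(row)
```

The value stored as preferred_name in the CREATE statement comes back under the key "name".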

In the CREATE statement the property "preferred_name" was set, but the query returns a column "name". Suggestion: remove "name" from the output of all search queries (org.rdkit.search.exact.smiles, org.rdkit.search.exact.mol, org.rdkit.search.substructure.smiles, org.rdkit.search.substructure.mol) and treat preferred_name in the CREATE statement like any other property.

sarmbruster commented 3 years ago

Auto assignment of UUIDs is already an existing feature of the APOC library, see https://neo4j.com/labs/apoc/4.1/overview/apoc.uuid/apoc.uuid.install/

Auto-assigning a luri to all nodes with the label Entity is as simple as running this once: `CALL apoc.uuid.install('Entity', {uuidProperty: 'luri'})`

Since we have an appropriate solution in place, I'll close this issue.