orientechnologies / orientdb-gremlin

TinkerPop3 Graph Structure Implementation for OrientDB
Apache License 2.0

Duplicate indexing Nulls #116

Open fppt opened 7 years ago

fppt commented 7 years ago

Hi Guys,

I create indices the following way:

graph.createVertexIndex("prop1", "label1", indexConfig);
graph.createVertexIndex("prop2", "label1", indexConfig);
graph.createVertexIndex("prop1", "label2", indexConfig);
graph.createVertexIndex("prop2", "label2", indexConfig);

The problem is that when I create vertices I am not guaranteed to fill every property for every vertex immediately. For example I could create the following:

Vertex v1 = graph.addVertex("label1");
Vertex v2 = graph.addVertex("label1");
Vertex v3 = graph.addVertex("label1");

v1.property("prop1", "A");
v1.property("prop2", "B");
v2.property("prop1", "C");
v3.property("prop1", "D");

When I try to commit I get the following error:

com.orientechnologies.orient.core.storage.ORecordDuplicatedException: Cannot index record V_label1: found duplicated key 'null' in index 'V_label1.prop2' previously assigned to the record #81:0

If I add the following to the above:

v2.property("prop2", "E");
v3.property("prop2", "F");

It then works.

Is there a way around this limitation? Currently I just prefill all vertex properties with garbage values but this feels odd.

Thanks guys.

luigidellaquila commented 7 years ago

Hi @fppt

Since v2.2, unique indexes also index null values. You can disable this when you create the index by setting ignoreNullValues=true; see http://orientdb.com/docs/2.2.x/SQL-Create-Index.html

Anyway, I suggest you don't use it, because indexes without null values cannot be used for some query optimizations, so you will lose performance in some corner cases.

Thanks

Luigi
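To make the behaviour described above concrete, here is a minimal toy model of a unique index with and without null keys. It is only a sketch: UniqueIndexSketch, its ignoreNullValues constructor flag, and the record ids v1..v3 are hypothetical names chosen for illustration, and none of this is OrientDB code.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of a unique index (hypothetical, not OrientDB's implementation). */
final class UniqueIndexSketch {
    private final Map<Object, String> entries = new HashMap<>();
    private final boolean ignoreNullValues;

    UniqueIndexSketch(boolean ignoreNullValues) {
        this.ignoreNullValues = ignoreNullValues;
    }

    /** Indexes recordId under key; throws if the key already belongs to another record. */
    void put(Object key, String recordId) {
        if (key == null && ignoreNullValues) {
            return; // null keys are skipped, so they can never collide
        }
        // HashMap accepts a null key, so null is treated like any other value here.
        String previous = entries.putIfAbsent(key, recordId);
        if (previous != null) {
            throw new IllegalStateException("found duplicated key '" + key
                    + "' previously assigned to the record " + previous);
        }
    }

    int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        // Index on prop2: v1 has a value, v2 and v3 leave the property unset.
        UniqueIndexSketch strict = new UniqueIndexSketch(false);
        strict.put("B", "v1");
        strict.put(null, "v2");
        try {
            strict.put(null, "v3"); // second null key collides with v2
        } catch (IllegalStateException e) {
            System.out.println("strict: " + e.getMessage());
        }

        UniqueIndexSketch lenient = new UniqueIndexSketch(true);
        lenient.put("B", "v1");
        lenient.put(null, "v2");
        lenient.put(null, "v3");
        System.out.println("lenient: indexed entries = " + lenient.size());
    }
}
```

In the strict variant the second vertex without prop2 is rejected, which mirrors the ORecordDuplicatedException in the original report. In the lenient variant only the non-null key is stored.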

luigidellaquila commented 7 years ago

I just checked that docs page; it contains some outdated info. In v2.2, null values are NOT ignored by default. I'm fixing it now.

Thanks

Luigi

fppt commented 7 years ago

Thanks @luigidellaquila. For the time being is there a way to disable indexing nulls using the available API of this plugin?

luigidellaquila commented 7 years ago

How are you creating the index? Is it a single key index?

Thanks

Luigi

fppt commented 7 years ago

I create it as follows:

BaseConfiguration indexConfig = new BaseConfiguration();
indexConfig.setProperty("keytype", OType.STRING);
indexConfig.setProperty("type", "UNIQUE");
graph.createVertexIndex("prop1", "label1", indexConfig);

luigidellaquila commented 7 years ago

I think this should be enough:

BaseConfiguration indexConfig = new BaseConfiguration();
indexConfig.setProperty("keytype", OType.STRING);
indexConfig.setProperty("type", "UNIQUE");

indexConfig.setProperty("metadata.ignoreNullValues", true);

graph.createVertexIndex("prop1", "label1", indexConfig);

Thanks

Luigi

fppt commented 7 years ago

I added indexConfig.setProperty("metadata.ignoreNullValues", true); no joy sadly. Still getting the same exception.
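For reference, the SQL page linked earlier in this thread documents a METADATA {ignoreNullValues: true} clause on CREATE INDEX. The sketch below only builds such a statement as a string. It is untested against this plugin and assumes the property already exists in the schema. The helper IgnoreNullIndexSql and the index name label1_prop2_idx are hypothetical; V_label1 and prop2 are taken from the error message above. How the statement is executed (console, Studio, or a database handle) is left out, and an existing index would have to be dropped first.

```java
/** Hypothetical helper that formats a CREATE INDEX statement; it does not talk to a database. */
final class IgnoreNullIndexSql {

    /** Builds a UNIQUE index definition whose metadata asks the index to skip null keys. */
    static String createUniqueIgnoringNulls(String indexName, String className, String property) {
        return "CREATE INDEX " + indexName
                + " ON " + className + " (" + property + ")"
                + " UNIQUE METADATA {ignoreNullValues: true}";
    }

    public static void main(String[] args) {
        // Index name is made up for illustration; class and property come from the thread.
        System.out.println(createUniqueIgnoringNulls("label1_prop2_idx", "V_label1", "prop2"));
    }
}
```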

Eric24 commented 5 years ago

@luigidellaquila: Can you expand on your statement to not use "unique, ignored-nulls" indexes because of optimization issues in some edge cases? It's something we're doing quite a bit of (in cases where a property is often the same for 99% of the documents, but we need to quickly find the other 1%). I haven't seen any problems, but I'd like to understand what to watch out for.

luigidellaquila commented 5 years ago

Hi @Eric24

The query execution planner used to skip index usages in some conditions, when the index was configured to exclude null values. Anyway, in v 3.0 and with the new execution planner we solved this problem, so it's not a concern anymore

Thanks

Luigi