JanusGraph / janusgraph

JanusGraph: an open-source, distributed graph database
https://janusgraph.org

Cached schema not updated when using management.addProperties, management.addConnection #1319

Open batliner opened 6 years ago

batliner commented 6 years ago

We're currently building out a schema and testing how to update that schema in the future. We want to use management.addProperties() to map properties to vertices and management.addConnection() to map which vertices can be connected via which edges as seen in the docs. This works fine for the initial schema commit. We can see the changes immediately. However, if we do another schema commit where we create a new PropertyKey and map it to an existing VertexLabel via addProperties (after data has been added to the graph), then the changes to the schema are not immediately seen.

As an example, let's say we have an "animal" VertexLabel initially mapped to a "name" PropertyKey. This is committed and working well. We create an animal vertex. Then we create a new PropertyKey called "color" and map it to the "animal" VertexLabel with a management.addProperties(animal, color); call. Once this has been committed, we can see the new "color" PropertyKey immediately, but it is not mapped to the "animal" VertexLabel until we restart the JanusGraph server.

We're currently running JanusGraph in server mode and connecting to it remotely from a Java app. We run the schema updates when the Java app starts, via a call to client.submit() on a client obtained from a Cluster object.

Error received when trying to update the animal vertex with new property: org.apache.tinkerpop.gremlin.driver.exception.ResponseException: Property Key constraint does not exist for given Vertex Label [animal] and property key [color].

The error above goes away once the JanusGraph server is restarted.

The same behavior is seen when adding new connections via management.addConnection().

The problem was originally discussed on janusgraph-users.

Please let me know if any additional information is needed.

ryanpstauffer commented 5 years ago

I've observed similar issues on inmemory JanusGraph instances inside the Gremlin Console, for example created with:

graph = JanusGraphFactory.build().
  set('storage.backend', 'inmemory').
  set('schema.default', 'none').
  set('schema.constraints', true).open()

Here is a failing and passing example (https://gist.github.com/ryanpstauffer/ce28c02c99653c356db6eb4a1c33a211) - with the only difference being the ordering of the schema definition / data insertion ops. The issue appears to stem from changing the graph schema (defining a new Edge Label, etc.) after data has already been added to it.

The issue seems to occur if:

  1. An initial schema has been created and committed
  2. Data is inserted into the graph
  3. The management API is opened again with mgmt = graph.openManagement()
  4. A new Edge Label is created and connected to 2 Vertex Labels, and we mgmt.commit()
  5. An attempt is made to create a new edge with the new Edge Label

This produces an error like: Connection constraint does not exist for given Edge Label [PERFORMED], outgoing Vertex Label [Artist] and incoming Vertex Label [Performance]

The error is produced even after confirming that the Connection is defined correctly.

I'm going to take a deeper look later and see if I can pinpoint the bug.

m0rsch1 commented 5 years ago

We have observed the same behavior on JanusGraph 0.4.0 with a Cassandra cluster as the backend. A restart of the server fixes the missing addProperties() constraint. However, this renders JanusGraph unusable for us.

bishnuagrawal-zeotap commented 4 years ago

Has anyone found a solution for this? I can see the property when printing the graph schema, but when writing starts it throws the error "Property Key with the given name does not exist:dpts_123". We have been adding properties dynamically and then writing data. This worked fine until now, but suddenly we have started receiving this error.

allenhadden commented 4 years ago

I've also encountered this issue. This hack seems to work around it:

mgmt.addProperties(edgeLabel, keysToAdd);  // same for addConnection

String oldName = edgeLabel.name();

mgmt.changeName(edgeLabel, UUID.randomUUID().toString());
mgmt.changeName(edgeLabel, oldName);

The UUID part is probably not necessary (you could perhaps just use oldName + "-temp" or something that is guaranteed to not conflict with anything).

If this workaround works generally, the real solution may be to make ManagementSystem's addConnection and addProperties do something like this:

    @Override
    public VertexLabel addProperties(VertexLabel vertexLabel, PropertyKey... keys) {
        VertexLabel returnLabel = transaction.addProperties(vertexLabel, keys);

        // Look up the schema vertex backing the label that was just modified
        JanusGraphSchemaVertex schemaVertex = getSchemaVertex(vertexLabel);

        // Drop the cached state and mark the type as updated
        schemaVertex.resetCache();
        updatedTypes.add(schemaVertex);

        return returnLabel;
    }

The same change could be made to addConnection (and the other flavor of addProperties).
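To illustrate why resetting the cache on write might help, here is a minimal, self-contained sketch of the pattern. It is a toy model only, not JanusGraph code: ToySchemaVertex, ToyManagement and their methods are hypothetical names made up for this example, and the assumption that the stale state lives in a lazily built per-type snapshot is mine, not something confirmed in the JanusGraph source.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a schema vertex; NOT JanusGraph's JanusGraphSchemaVertex.
final class ToySchemaVertex {
    private final Set<String> storedKeys = new HashSet<>(); // "persisted" constraints
    private Set<String> cachedKeys;                          // snapshot, built on first read

    void storeKey(String key) {
        storedKeys.add(key);
    }

    // Reads go through the snapshot, which is built once and then reused.
    boolean allows(String key) {
        if (cachedKeys == null) {
            cachedKeys = new HashSet<>(storedKeys);
        }
        return cachedKeys.contains(key);
    }

    void resetCache() {
        cachedKeys = null;
    }
}

// Hypothetical stand-in for the management API.
final class ToyManagement {
    private final Map<String, ToySchemaVertex> labels = new HashMap<>();

    ToySchemaVertex makeLabel(String name) {
        return labels.computeIfAbsent(name, n -> new ToySchemaVertex());
    }

    // Models the reported behaviour: the constraint is stored, the snapshot is untouched.
    void addPropertiesStale(String label, String... keys) {
        ToySchemaVertex vertex = makeLabel(label);
        for (String key : keys) {
            vertex.storeKey(key);
        }
    }

    // Models the proposed change: store the constraint, then drop the snapshot.
    void addPropertiesWithReset(String label, String... keys) {
        addPropertiesStale(label, keys);
        makeLabel(label).resetCache();
    }
}

final class StaleSchemaCacheDemo {
    public static void main(String[] args) {
        ToyManagement mgmt = new ToyManagement();
        mgmt.addPropertiesStale("animal", "name");
        ToySchemaVertex animal = mgmt.makeLabel("animal");

        System.out.println(animal.allows("name"));  // true, and the snapshot is now built
        mgmt.addPropertiesStale("animal", "color");
        System.out.println(animal.allows("color")); // false, the snapshot is stale
        mgmt.addPropertiesWithReset("animal", "color");
        System.out.println(animal.allows("color")); // true, the snapshot was rebuilt
    }
}
```

In this toy model the "restart" that people report as fixing the problem corresponds to throwing away the snapshot; the reset-on-write variant just does that at the moment the constraint is added.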

FWIW, another thing that seemed to work for us was to close and reopen the graph using ConfiguredGraphFactory.close(graphName) and ConfiguredGraphFactory.open(graphName). This assumes that you're using ConfiguredGraphFactory, obviously. I didn't do any additional testing on this though; it worked once through the Gremlin Console.

Hope this helps.

farodin91 commented 4 years ago

We should think about sending a cache eviction event to all JanusGraph instances.

The ManagementLogger should be able to send these events: janusgraph-core/src/main/java/org/janusgraph/graphdb/database/management/ManagementLogger.java
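As a rough sketch of what "send an eviction event to all instances" could look like, here is a small self-contained model. Everything in it is hypothetical (ToyBackend, ToyInstance, ToyEvictionLog); it does not use or describe the actual ManagementLogger API, and it only assumes that each instance keeps its own cache of constraints per label.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical shared store standing in for the storage backend.
final class ToyBackend {
    private final Map<String, Set<String>> constraints = new HashMap<>();

    void addConstraint(String label, String key) {
        constraints.computeIfAbsent(label, l -> new HashSet<>()).add(key);
    }

    // Returns a copy so that callers hold a snapshot, not a live view.
    Set<String> load(String label) {
        return new HashSet<>(constraints.getOrDefault(label, new HashSet<>()));
    }
}

// Hypothetical graph instance with its own schema cache.
final class ToyInstance {
    private final ToyBackend backend;
    private final Map<String, Set<String>> cache = new HashMap<>();

    ToyInstance(ToyBackend backend) {
        this.backend = backend;
    }

    boolean allows(String label, String key) {
        return cache.computeIfAbsent(label, backend::load).contains(key);
    }

    void evict(String label) {
        cache.remove(label);
    }
}

// Hypothetical message channel that reaches every subscribed instance.
final class ToyEvictionLog {
    private final List<ToyInstance> subscribers = new ArrayList<>();

    void subscribe(ToyInstance instance) {
        subscribers.add(instance);
    }

    void broadcastEviction(String label) {
        for (ToyInstance instance : subscribers) {
            instance.evict(label);
        }
    }
}

final class EvictionBroadcastDemo {
    public static void main(String[] args) {
        ToyBackend backend = new ToyBackend();
        ToyEvictionLog log = new ToyEvictionLog();
        ToyInstance a = new ToyInstance(backend);
        ToyInstance b = new ToyInstance(backend);
        log.subscribe(a);
        log.subscribe(b);

        backend.addConstraint("animal", "name");
        a.allows("animal", "name"); // warm both caches
        b.allows("animal", "name");

        backend.addConstraint("animal", "color");
        System.out.println(a.allows("animal", "color")); // false, cached snapshot is stale

        log.broadcastEviction("animal");
        System.out.println(a.allows("animal", "color")); // true after eviction
        System.out.println(b.allows("animal", "color")); // true after eviction
    }
}
```

The point of broadcasting rather than resetting locally is that every instance sharing the backend drops its stale entry, not only the one that ran the schema change.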

farodin91 commented 4 years ago

If someone would like to work on this issue, I would be happy to help.

shivamc7y commented 1 year ago

We are facing the same issue: additional properties mapped to an existing edgeLabel are not reflected until we do a restart. A similar issue occurs when changing a newly registered index state from INSTALLED to REGISTERED.

We cannot rely on restarting the servers, as JanusGraph is deployed on Kubernetes pods and one restart costs us 206 billion edge IDs.

Has anyone found a way to avoid the restart? We are using v0.6 of JanusGraph.

rhinoman commented 1 year ago

Seeing the same issue with 1.0.0-rc2 and ScyllaDB. I've found that closing the graph with ConfiguredGraphFactory.close() causes the addProperties call to finally 'take'. This is a terrible workaround, but probably better than restarting Janus. @allenhadden's renaming workaround above looks like the best bet, and seems to work more reliably than closing and re-opening the graph.