graphaware / neo4j-uuid

GraphAware Runtime Module that assigns a UUID to all nodes (and relationships) in the graph transparently
103 stars 22 forks source link

Specifying the Generator Through Configuration #31

Closed espiegelberg closed 8 years ago

espiegelberg commented 8 years ago

I recently needed to have fine grained control over a generated ID in the Neo4j database and, rather than create the entire module myself, I instead decided to extend/modify the UUID module.

This pull request modifies UUID's configuration, bootstrapper, and module logic to allow the UuidGenerator implementation to be specified through configuration. Two UuidGenerator examples are included, one simply making use of the java.util.UUID and another demonstrating the use of the GraphDatabaseService and Cypher to generate a numerically increasing id. The README.md has been updated with configuration examples.

While serving my own needs, this PR also addresses https://github.com/graphaware/neo4j-uuid/issues/21 and facilitates end users for https://github.com/graphaware/neo4j-uuid/issues/16.

Should this PR be accepted and merged, there are two potential long term next steps:

1) Someday it would be nice if the UuidGenerator interface returned Object rather than String, giving implementors more flexibility in what they can return. 2) Given that UuidGenerator implementations may now return values other than a UUID, perhaps this module could/should/would be renamed to something else, such as the GUID module.

Both of the above next steps are probably not worth the time and effort to implement and are instead merely noted for completeness.

espiegelberg commented 8 years ago

Hi @ikwattro. Thanks for your feedback on my PR.

I have incorporated three out of the four points of feedback you provided. For the multithread environment test you're looking for (major point #2), can you please provide some more details on what you'd like to see? I'm guessing you're looking for a test that spins up multiple threads, each that makes use of the GraphDatabaseService to create a node or relationship, and then makes some sort of assertions that the id's are as to be expected. Am I on the right track? That said, I'm not sure how you envision the validation that ensures the sequence values are to be as expected. If you can give more detail as to what you'd like to see from this particular test I'll be happy to put it together.

ikwattro commented 8 years ago

@espiegelberg

Sorry for the delay.

I checked out your PR and ran a test with the following code :

@Test
    public void testSequenceGeneratorInMultithreadedEnv() throws Exception {
        registerModuleWithSequenceGenerator();

        final Runnable runner = () -> {
            try (final Transaction tx = database.beginTx()) {
                database.createNode(Label.label("SequenceTest"));
                tx.success();
            }
        };

        final ExecutorService service = Executors.newCachedThreadPool();
        List<Future> futures = new ArrayList<>();
        IntStream.range(0, 100).forEach(i -> {
            futures.add(service.submit(runner));
        });

        try {
            for (Future future : futures) {
                future.get();
            }
        } catch (Throwable ignore) {}

        service.shutdown();

        Thread.sleep(1000);

        try (Transaction tx = database.beginTx()) {
            Result result = database.execute("MATCH (n:SequenceTest) RETURN n.sequence AS uuid");
            while (result.hasNext()) {
                Map<String, Object> record = result.next();
                System.out.println(record.get("uuid"));
            }
            tx.success();
        }

    }

All sequence properties have a value of 1.

ikwattro commented 8 years ago

Moreover, adding the following at the end of the test :

try (Transaction tx = database.beginTx()) {
            Result result = database.execute("MATCH (n:SequenceMetadata) RETURN count(n) AS c");
            System.out.println(result.next().get("c"));
        }
ikwattro commented 8 years ago

@espiegelberg narrowed down the issue.

a) There is no constraint that prevents multiple SequenceMetadata nodes.

b) If you set one, for example on :SequenceMetadata(id), you'll encounter deadlock exceptions (which is expected). Applying a retry strategy slows down a LOT the process.

Test with the following code to see the behavior :

private static final int MAX_RETRIES = 25;

    public static final String cypher = "MERGE (sequenceMetadata:SequenceMetadata {id: {id} }) SET sequenceMetadata.sequence = coalesce(sequenceMetadata.sequence, {initialSequennceValue}) + {sequenceIncrementAmount} return sequenceMetadata.sequence";

    protected GraphDatabaseService database;

    private String generateUuid(int currentRetry) {
        Map<String, Object> parameters = new HashMap<String, Object>();
        parameters.put("initialSequennceValue", 0);
        parameters.put("sequenceIncrementAmount", 1);
        parameters.put("id", 0);

        try {
            Result result = database.execute(cypher, parameters);

            long sequence = (long) result.next().values().iterator().next();
            return String.valueOf(sequence);
        } catch (Exception e) {
            if (e instanceof DeadlockDetectedException && currentRetry < MAX_RETRIES) {
                return generateUuid(++currentRetry);
            }

            throw new RuntimeException(e);
        }
    }

    @Override
    public String generateUuid() {
        return generateUuid(0);
    }

c) AFAICS, there is nothing that prevents the deletion of the SequenceMetadata node.

ikwattro commented 8 years ago

@espiegelberg All the changes are available in the branch sequence-generator, will be easier for you to test

https://github.com/graphaware/neo4j-uuid/tree/sequence-generator

espiegelberg commented 8 years ago

Hi @ikwattro. Once again, thanks for your feedback.

I don’t see any of my changes on your https://github.com/graphaware/neo4j-uuid/tree/sequence-generator. Please let me know if I’m doing something wrong. In the mean time, I’ve incorporated your changes into my branch/PR.

Your test clearly demonstrates that my original approach to generate the sequence isn’t viable. Using your test as a baseline, I’ve revised my SequenceGenerator from using Cypher to instead making use of the Transaction's acquireWriteLock() and programmatically incrementing the sequence value. As you can see in the updated code, this approach now addresses your points A and B and by defending against multiple SequenceMetadata nodes and avoiding both the deadlock exceptions and the significant performance degradation associated with adding a constraint. Using your test, I am able to validate that all generated UUIDs are as to be expected (i.e.: unique and incrementing) and after the test execution only assert that only a single SequenceMetadata node exists.

As for your point C, while you make a perfectly valid point, I think a case could be made that it is outside the scope of the SequenceGenerator’s responsibilities to prevent the deletion of the SequenceMetadata node. That’s not to say preventing the deletion is unimportant, just that as the UUIDModule does not prevent the deletion of the UUID property from nodes the SequenceGenerator does not necessarily need to prevent the deletion of the SequenceMetadata node. That comparison may not be 100% apples to apples, but hopefully illustrates my point. Additionally, I’d also say that the SequenceGenerator is really intended merely as a demonstration of my PR changes, which allow end users to specify the generator through configuration. While I would advocate that it does not, please give it some thought as to if the SequenceGenerator needs to protected against deleted SequenceMetadata nodes and, if you still feel it is required, I’ll update the code to do so.

Please review my revised approach, particularly the use of the transaction's acquireWriteLock() and programmatically incrementing the sequence value, and let me know of any way this PR can be improved.

ikwattro commented 8 years ago

Thanks @espiegelberg This will ship in the next release today or tomorrow.

ikwattro commented 8 years ago

Released in 3.0.7.44.13

espiegelberg commented 8 years ago

Thanks @ikwattro for your time and effort to see this PR through to acceptance.

Seeing as how this PR has been accepted, I think a case could be made that issue #21 (https://github.com/graphaware/neo4j-uuid/issues/21) could now be closed/resolved as my JavaUtilUUIDGenerator makes use of the java.util.UUID.randomUUID() that @Joshfindit was asking for.

ikwattro commented 8 years ago

Yes thanks