apache / incubator-hugegraph-computer

HugeGraph Computer - A distributed graph processing system for hugegraph (OLAP)
https://hugegraph.apache.org/docs/quickstart/hugegraph-computer/
Apache License 2.0
42 stars 41 forks source link

[Bug] RingsDetection test did not meet expectations #275

Closed diaohancai closed 10 months ago

diaohancai commented 10 months ago

Bug Type (问题类型)

data inconsistency (计算结果不合预期)

Before submit

Environment (环境信息)

Expected & Actual behavior (期望与实际表现)

Run org.apache.hugegraph.computer.algorithm.path.rings.RingsDetectionTest#testRunAlgorithm, Vertex A expected result is:

[[A, C, A], [A, D, A], [A, B, C, A], [A, D, C, A], [A, C, E, D, A], [A, B, C, E, D, A]]

But got:

[[A, D, A], [A, D, A], [A, D, C, A], [A, D, C, A], [A, C, E, D, A], [A, B, C, E, D, A]]

I modified some code and it works.

    @Override
    public void compute(ComputationContext context, Vertex vertex,
                        Iterator<IdList> messages) {
        Id id = vertex.id();
        boolean halt = true;
        while (messages.hasNext()) {
            halt = false;
            // IdList sequence = messages.next();  // Something is wrong here
            IdList sequence = messages.next().copy(); // copy() is good
            ...

Further debug, I found messages.next() always return the same address Object(Field value may be different)!!!, it's a reference Object.

Semantically, I think messages.next() should return a different address Object.

diaohancai commented 10 months ago

Possible solutions: org.apache.hugegraph.computer.core.compute.input.MessageInput

        @Override
        public boolean hasNext() {
            if (this.valueValid) {
                return true;
            }
            if (MessageInput.this.messages.hasNext()) {
                KvEntry entry = MessageInput.this.messages.peek();
                Pointer key = entry.key();
                int status = this.vidPointer.compareTo(key);
                if (status == 0) {
                    MessageInput.this.messages.next();
                    this.valueValid = true;
                    try {
                        BytesInput in = IOFactory.createBytesInput(
                                        entry.value().bytes());
                        // create a new value
                        MessageInput.this.value = MessageInput.this.config
                                .createObject(ComputerOptions.ALGORITHM_MESSAGE_CLASS);
                        MessageInput.this.value.read(in);

messages.next() return a different address Object.

corgiboygsj commented 10 months ago

Possible solutions: org.apache.hugegraph.computer.core.compute.input.MessageInput

        @Override
        public boolean hasNext() {
            if (this.valueValid) {
                return true;
            }
            if (MessageInput.this.messages.hasNext()) {
                KvEntry entry = MessageInput.this.messages.peek();
                Pointer key = entry.key();
                int status = this.vidPointer.compareTo(key);
                if (status == 0) {
                    MessageInput.this.messages.next();
                    this.valueValid = true;
                    try {
                        BytesInput in = IOFactory.createBytesInput(
                                        entry.value().bytes());
                        // create a new value
                        MessageInput.this.value = MessageInput.this.config
                                .createObject(ComputerOptions.ALGORITHM_MESSAGE_CLASS);
                        MessageInput.this.value.read(in);

messages.next() return a different address Object.

always return the same Object is reduce memory pressure and avoid GC

diaohancai commented 10 months ago

always return the same Object is reduce memory pressure and avoid GC

Yep~ But semantically, messages.next() returns a different address object more intuitive. Otherwise, someone who call this function downstream need to clearly know that the same object is returned here, which is not logically intuitive and easy to make mistakes.

Do we have some more elegant solutions?

corgiboygsj commented 10 months ago

@javeme please take a look for this question. This does make it confusing for users.

javeme commented 10 months ago

I agree this is a problem, it sacrifices user-friendliness for performance. We hope that after the memory management module is added, this confusion can be solved. it's tracked by https://github.com/apache/incubator-hugegraph-computer/issues/277