apache / incubator-hugegraph-computer

HugeGraph Computer - A distributed graph processing system for hugegraph (OLAP)
https://hugegraph.apache.org/docs/quickstart/hugegraph-computer/
Apache License 2.0

improve(core): vertices and edges loading optimization #288

Open diaohancai opened 9 months ago

diaohancai commented 9 months ago

Feature Description

The default value of ComputerOptions.INPUT_FILTER_CLASS is DefaultInputFilter (org.apache.hugegraph.computer.core.input.filter.DefaultInputFilter):

public class DefaultInputFilter implements InputFilter {

    @Override
    public Vertex filter(Vertex vertex) {
        vertex.properties().clear();
        return vertex;
    }

    @Override
    public Edge filter(Edge edge) {
        edge.properties().clear();
        return edge;
    }
}

In other words, all properties are loaded first and then cleared.

Could we specify which properties to load when loading vertices or edges? That might improve performance. It is like writing this in MySQL:

select a, b, c

instead of

select *

Vertices are currently fetched in org.apache.hugegraph.computer.core.input.hg.HugeVertexFetcher:

    @Override
    public Iterator<Vertex> fetch(InputSplit split) {
        Shard shard = toShard(split);
        return this.client().traverser().iteratorVertices(shard,
                                                          this.pageSize());
    }

However, it seems that the traverser API does not yet support specifying, on the server side, which properties to load for vertices or edges.
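To make the "select a, b, c" idea concrete, here is a minimal sketch of the projection itself. It is not hugegraph-computer or hugegraph-client code: `PropertyProjection` and `project` are hypothetical names, and a plain `Map<String, Object>` stands in for a vertex's or edge's properties. It only illustrates keeping a chosen subset of keys instead of clearing everything. Doing this in an InputFilter would still transfer every property from the server first, so the actual saving would depend on the server applying an equivalent projection.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for illustration; not part of hugegraph-computer.
final class PropertyProjection {

    private PropertyProjection() {
    }

    // Returns a copy of `properties` that holds only the keys listed in
    // `wanted`. The input map is left unchanged.
    static Map<String, Object> project(Map<String, Object> properties,
                                       Set<String> wanted) {
        Map<String, Object> projected = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : properties.entrySet()) {
            if (wanted.contains(entry.getKey())) {
                projected.put(entry.getKey(), entry.getValue());
            }
        }
        return projected;
    }

    public static void main(String[] args) {
        // A plain map stands in for the properties of one vertex.
        Map<String, Object> properties = new LinkedHashMap<>();
        properties.put("name", "marko");
        properties.put("age", 29);
        properties.put("city", "Beijing");

        // Keep only the properties the algorithm needs ("select name, age").
        System.out.println(project(properties, Set.of("name", "age")));
    }
}
```

An empty `wanted` set reproduces the current DefaultInputFilter behaviour of dropping every property, so a default like that would keep existing jobs unchanged.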