alibaba / GraphScope

🔨 🍇 💻 🚀 GraphScope: A One-Stop Large-Scale Graph Computing System from Alibaba | 一站式图计算系统
https://graphscope.io
Apache License 2.0
3.24k stars 438 forks source link

feat(interactive): Automatically generate configuration yaml for stored procedure. #3170

Open zhanglei1949 opened 1 year ago

zhanglei1949 commented 1 year ago

Stored procedure is an important component in Graphscope Interactive. Previously, we needed users to manually write the input and output formats in YAML files. However, this file should actually be automatically generated.

zhanglei1949 commented 3 months ago

For example, for the following c++ procedure,

#include "flex/engines/hqps_db/app/interactive_app_base.h"
#include "flex/engines/hqps_db/core/sync_engine.h"
#include "flex/utils/app_utils.h"

namespace gs {
class ExampleQuery : public CypherReadAppBase<int32_t> {
 public:
  using Engine = SyncEngine<gs::MutableCSRInterface>;
  using label_id_t = typename gs::MutableCSRInterface::label_id_t;
  using vertex_id_t = typename gs::MutableCSRInterface::vertex_id_t;
  ExampleQuery() {}
  // Query function for query class
  results::CollectiveResults Query(const gs::GraphDBSession& sess,
                                   int32_t param1) override {
    LOG(INFO) << "param1: " << param1;
    gs::MutableCSRInterface graph(sess);
    auto ctx0 = Engine::template ScanVertex<gs::AppendOpt::Persist>(
        graph, 0, Filter<TruePredicate>());

    auto ctx1 = Engine::Project<PROJ_TO_NEW>(
        graph, std::move(ctx0),
        std::tuple{gs::make_mapper_with_variable<INPUT_COL_ID(0)>(
            gs::PropertySelector<int64_t>("id"))});
    auto ctx2 = Engine::Limit(std::move(ctx1), 0, 5);
    auto res = Engine::Sink(graph, ctx2, std::array<int32_t, 1>{0});
    LOG(INFO) << "res: " << res.DebugString();
    return res;
  }
};
}  // namespace gs

extern "C" {
void* CreateApp(gs::GraphDBSession& db) {
  gs::ExampleQuery* app = new gs::ExampleQuery();
  return static_cast<void*>(app);
}

void DeleteApp(void* app) {
  gs::ExampleQuery* casted = static_cast<gs::ExampleQuery*>(app);
  delete casted;
}
}

We should be able to generate the following yaml containing the metadata of the procedure

name: count_vertex_num
description: A stored procedure that does something else
library: /tmp/temp_workspace/data/modern_graph/plugins/libsample_app.so
type: cpp
params:
  - name: var1
  - type: {primitive_type: DT_SIGNED_INT64}
returns:
  - name: res
    type: {primitive_type: DT_SIGNED_INT64}
jieefeng commented 1 month ago

Hi.Can I do it in java?

zhanglei1949 commented 1 month ago

Sure, you can do it in Java or Python.