alibaba / GraphScope

🔨 🍇 💻 🚀 GraphScope: A One-Stop Large-Scale Graph Computing System from Alibaba | 一站式图计算系统
https://graphscope.io
Apache License 2.0

[BUG] Can not load graph from k8s_volumes on k8s #4315

Closed: JackyYangPassion closed this issue 1 week ago

JackyYangPassion commented 1 week ago

Describe the bug Loading data fails when the data directory is mounted into the pods as a shared directory through K8s. The log reports:

No such file or directory

Log Details

Loading vertex labeled user and:   0%|          | 0/10 [00:00<?, ?it/s]
I20241108 20:51:20.199137 13430 /tmp/gs-local-deps/v6d-0.24.2/modules/io/io/io_factory.cc:96] Warning: failed to resolve realpath of /testingdata/1405/1406.csv
I20241108 20:51:20.199146 13430 /tmp/gs-local-deps/v6d-0.24.2/modules/io/io/io_factory.cc:96] Warning: failed to resolve realpath of /testingdata/1405
I20241108 20:51:20.199151 13430 /tmp/gs-local-deps/v6d-0.24.2/modules/io/io/io_factory.cc:96] Warning: failed to resolve realpath of /testingdata
E20241108 20:51:20.202028 13430 /work/analytical_engine/core/server/dispatcher.cc:153] Worker 0: VineyardError occurred on worker 0: VineyardError occurred on worker 0: /opt/graphscope/include/graphscope/core/loader/arrow_fragment_loader.h:493: operator() -> Arrow error: IOError: Failed to open local file '/testingdata/1405/1406.csv'. Detail: [errno 2] No such file or directory
gs::ArrowFragmentLoader<std::string, unsigned long, vineyard::ArrowVertexMap>::loadVertexTables(std::vector<std::shared_ptr<gs::detail::Vertex>, std::allocator<std::shared_ptr<gs::detail::Vertex> > > const&, int, int)::{lambda()#3}::operator()() const + 0x58F

status = StatusCode.INTERNAL
    details = "VineyardError occurred on worker 0: VineyardError occurred on worker 0: /opt/graphscope/include/graphscope/core/loader/arrow_fragment_loader.h:493: operator() -> Arrow error: IOError: Failed to open local file '/testingdata/1405/1406.csv'. Detail: [errno 2] No such file or directory
gs::ArrowFragmentLoader<std::string, unsigned long, vineyard::ArrowVertexMap>::loadVertexTables(std::vector<std::shared_ptr<gs::detail::Vertex>, std::allocator<std::shared_ptr<gs::detail::Vertex> > > const&, int, int)::{lambda()#3}::operator()() const + 0x58F

To Reproduce Steps to reproduce the behavior:

Mount the directory

import graphscope
graphscope.set_option(log_level='DEBUG')
graphscope.set_option(show_log=True)

k8s_volumes = {
    "data": {
        "type": "hostPath",
        "field": {"path": "/home/data/datasets", "type": "Directory"},
        "mounts": {
            "mountPath": "/testingdata"
        }
    }
}

# Create GraphScope client session, the 'cluster_type' is k8s by default.
session = graphscope.session(
    k8s_coordinator_cpu=1,
    k8s_coordinator_mem="1Gi",
    k8s_vineyard_cpu=3,
    k8s_vineyard_mem="2Gi",
    vineyard_shared_mem="2Gi",
    k8s_engine_cpu=2,
    k8s_namespace='graph-k8s-jacky1',
    k8s_engine_mem="2Gi",
    num_workers=2,
    enabled_engines="analytical,interactive",
    k8s_client_config='/etc/config',
    k8s_volumes=k8s_volumes,
)
print('========= Session created. ==========')

Load the data

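from graphscope.framework.loader import Loader

# `graph` is assumed to have been created earlier (e.g. graph = session.g()); that step is not shown in this report.
graph = graph.add_vertices(
    Loader('/testingdata/1405/1406.csv', delimiter=','),
    label="user",
    vid_field='ud_id',
)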

JackyYangPassion commented 1 week ago

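I have confirmed that the directory is mounted successfully inside both pods, gs-engine-vchukb-0 and gs-engine-vchukb-1, but loading the data still reports that the path does not exist.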
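
One way to double-check a mount like this is to walk the failing path one component at a time, which mirrors the three "failed to resolve realpath" warnings in the log. The following is only a minimal sketch using the Python standard library; the helper name `first_missing_component` is hypothetical, and it assumes the script is run inside the same container as the worker process that raises the error.

```python
import os


def first_missing_component(path):
    """Return the shortest prefix of `path` that does not exist, or None if every prefix exists."""
    current = os.sep
    for part in os.path.abspath(path).split(os.sep):
        if not part:
            continue  # skip the empty string produced by the leading separator
        current = os.path.join(current, part)
        if not os.path.exists(current):
            return current
    return None


if __name__ == "__main__":
    # Path taken from the error log in this issue.
    missing = first_missing_component("/testingdata/1405/1406.csv")
    print("all components exist" if missing is None else "first missing: " + missing)
```

If the helper reports `/testingdata` itself as missing, the mount is not visible to that process; if it reports a deeper component, the mount exists but the file layout under it differs from what the loader expects.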

JackyYangPassion commented 1 week ago

From inside the gs-engine-vchukb-1 pod, the mounted directory can be accessed via v6d:

import fsspec

# List the entries directly under the mount point using the local filesystem backend.
fs = fsspec.filesystem("file")
files = fs.glob('file:///testingdata/*')
print(files)

The result is as expected, but loading the data as follows still raises the error:

from graphscope.framework.loader import Loader
graph = graph.add_vertices(
    Loader('/testingdata/1405/1406.csv', delimiter=','),
    label="user",
    vid_field='ud_id',
)

JackyYangPassion commented 1 week ago

Fixed. This was not a GraphScope bug.