apache / doris

Apache Doris is an easy-to-use, high performance and unified analytics database.
https://doris.apache.org
Apache License 2.0

[Bug] Could not find subsystem memory in /proc/self/cgroup error stacktrace #32842

Closed larshelge closed 5 months ago

larshelge commented 7 months ago

Version

Doris: 2.0.6 and 2.1.0.

OS: Pop!_OS 22.04 Linux (Pop!_OS is based on Ubuntu)

What's Wrong?

When starting the BE process, I get the following stack traces in the logs.

W20240326 09:55:46.395356  8212 status.h:380] meet error status: [NOT_FOUND]Could not find subsystem memory in /proc/self/cgroup

    0#  doris::CGroupUtil::find_global_cgroup(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*) at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187
    1#  doris::CGroupUtil::find_abs_cgroup_path(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:449
    2#  doris::CGroupUtil::find_cgroup_mem_limit(long*) at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187
    3#  doris::MemInfo::init() at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:449
    4#  main at /home/zcp/repo_center/doris_release/doris/be/src/service/doris_main.cpp:473
    5#  ?
    6#  __libc_start_main
    7#  _start
W20240326 09:55:46.397221  8212 status.h:380] meet error status: [NOT_FOUND]Could not find subsystem memory in /proc/self/cgroup

    0#  doris::CGroupUtil::find_global_cgroup(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*) at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187
    1#  doris::CGroupUtil::find_abs_cgroup_path(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:449
    2#  doris::CGroupUtil::find_cgroup_mem_limit(long*) at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187
    3#  doris::CGroupUtil::debug_string[abi:cxx11]() at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:449
    4#  doris::MemInfo::debug_string[abi:cxx11]() at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187
    5#  main at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187
    6#  ?
    7#  __libc_start_main
    8#  _start
W20240326 09:55:46.398967  8212 status.h:380] meet error status: [NOT_FOUND]Could not find subsystem cpu in /proc/self/cgroup

    0#  doris::CGroupUtil::find_global_cgroup(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*) at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187
    1#  doris::CGroupUtil::find_abs_cgroup_path(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:449
    2#  doris::CGroupUtil::find_cgroup_cpu_limit(float*) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:449
    3#  doris::CGroupUtil::debug_string[abi:cxx11]() at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:345
    4#  doris::MemInfo::debug_string[abi:cxx11]() at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187
    5#  main at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187
    6#  ?
    7#  __libc_start_main
    8#  _start
W20240326 09:55:46.400684  8212 status.h:380] meet error status: [NOT_FOUND]Could not find subsystem cpuacct in /proc/self/cgroup

    0#  doris::CGroupUtil::find_global_cgroup(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*) at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187
    1#  doris::CGroupUtil::find_abs_cgroup_path(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*) at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:449
    2#  doris::CGroupUtil::find_cgroup_cpu_limit(float*) at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187
    3#  doris::CGroupUtil::debug_string[abi:cxx11]() at /home/zcp/repo_center/doris_release/doris/be/src/common/status.h:345
    4#  doris::MemInfo::debug_string[abi:cxx11]() at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187
    5#  main at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187
    6#  ?
    7#  __libc_start_main
    8#  _start

Contents of my /proc/self/cgroup file:

lars@pop-os:~$ cat /proc/self/cgroup 
0::/user.slice/user-1000.slice/user@1000.service/app.slice/app-org.gnome.Terminal.slice/vte-spawn-64c83880-a5ae-4c5b-acec-a925a23cd62d.scope

What You Expected?

I expect the stacktrace not to appear.

How to Reproduce?

Start the BE process in a Pop!_OS/Ubuntu 22.04 desktop environment.

Anything Else?

No response

geoffreytran commented 6 months ago

Also running into this issue. It appears to be due to the fact that Doris is looking for cgroups v1 rather than cgroups v2.
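
To illustrate the v1 versus v2 difference, here is a minimal Python sketch of a lookup by subsystem name over the contents of /proc/self/cgroup. It is not Doris's implementation (the BE is C++); the names parse_proc_cgroup, find_subsystem_path and is_pure_cgroup_v2 are hypothetical and exist only for this example. It assumes the documented line format hierarchy-ID:controller-list:cgroup-path, in which a cgroups v2 entry has hierarchy ID 0 and an empty controller list.

```python
from typing import List, NamedTuple, Optional


class CgroupEntry(NamedTuple):
    hierarchy_id: int
    controllers: List[str]
    path: str


def parse_proc_cgroup(text: str) -> List[CgroupEntry]:
    """Parse the contents of /proc/<pid>/cgroup into entries."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each line is "hierarchy-ID:controller-list:cgroup-path".
        # Split at most twice because the path itself may contain ':'.
        hierarchy_id, controllers, path = line.split(":", 2)
        names = [c for c in controllers.split(",") if c]
        entries.append(CgroupEntry(int(hierarchy_id), names, path))
    return entries


def find_subsystem_path(entries: List[CgroupEntry], subsystem: str) -> Optional[str]:
    """Return the cgroup path of a v1 subsystem, or None if it is not listed."""
    for entry in entries:
        if subsystem in entry.controllers:
            return entry.path
    return None


def is_pure_cgroup_v2(entries: List[CgroupEntry]) -> bool:
    """True when the only entry is the unified v2 hierarchy ("0::<path>")."""
    return (
        len(entries) == 1
        and entries[0].hierarchy_id == 0
        and not entries[0].controllers
    )


if __name__ == "__main__":
    with open("/proc/self/cgroup") as f:
        parsed = parse_proc_cgroup(f.read())
    print("pure cgroups v2:", is_pure_cgroup_v2(parsed))
    print("memory subsystem path:", find_subsystem_path(parsed, "memory"))
```

On the single line reported in this issue (0::/user.slice/...), no controller is listed, so a lookup for memory, cpu or cpuacct finds nothing, which matches the NOT_FOUND messages in the log.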

michael1991 commented 5 months ago

I hit the same issue on 2.1.3.

769344359 commented 5 months ago

I hit the same issue on Debian.

imbrin1122 commented 5 months ago

I hit the same issue on doris-2.1.3-rc09.

xinyiZzz commented 5 months ago

Upgrade to Doris 2.1.4, which fixes the BE failing to find the cgroup memory subsystem when the system is not configured with cgroups; see https://github.com/apache/doris/pull/35412.

In Doris 2.1.4, you can also add enable_use_cgroup_memory_info=false to be.conf to skip the unnecessary check.

Doris memory management does not support cgroups v2 yet; this is a TODO. @larshelge @imbrin1122 @769344359 @michael1991
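
For context on what cgroups v2 support would involve, the following is a rough Python sketch of reading a v2 memory limit. It is not Doris code and not the planned implementation. It assumes the unified hierarchy is mounted at /sys/fs/cgroup (the conventional location, but still an assumption) and relies on the kernel's memory.max interface file, which holds either a byte count or the literal string max. The function names memory_max_file and parse_memory_max are hypothetical.

```python
import posixpath
from typing import Optional


def memory_max_file(cgroup_path: str, mount_point: str = "/sys/fs/cgroup") -> str:
    """Build the memory.max path for a cgroup path taken from a "0::<path>" line.

    mount_point is an assumption; it is the conventional cgroup2 mount location.
    """
    return posixpath.join(mount_point, cgroup_path.strip("/"), "memory.max")


def parse_memory_max(text: str) -> Optional[int]:
    """Return the limit in bytes, or None when the file says "max" (no limit)."""
    value = text.strip()
    if value == "max":
        return None
    return int(value)


def read_memory_limit(cgroup_path: str) -> Optional[int]:
    """Read the limit for one cgroup; returns None if unlimited or unavailable."""
    try:
        with open(memory_max_file(cgroup_path)) as f:
            return parse_memory_max(f.read())
    except OSError:
        # The memory controller may not be enabled for this cgroup.
        return None
```

This sketch looks at a single cgroup only. On a real system the effective limit is also constrained by the memory.max values of ancestor cgroups, so a complete implementation would have to walk up the hierarchy and take the smallest limit.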

xinyiZzz commented 5 months ago

Closing the issue. If you have any questions, feel free to keep commenting.