Closed: MidsummerNight closed this issue 5 months ago
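We have developed a one-click automated deployment tool for cluster installation. It has been fully tested on CentOS and Ubuntu and will be released soon. With the tool, you specify the IP addresses of the control node and compute nodes along with common configuration parameters, and it carries out the entire cluster configuration and installation.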
Tried adding -CRANE_USE_GITEE_SOURCE=OFF to the command; it reported an error in which CRANE_USE_GITEE_SOURCE was, for some reason, recognized as RANE_USE_GITEE_SOURCE.
Changing -CRANE_USE_GITEE_SOURCE=OFF to -DCRANE_USE_GITEE_SOURCE=OFF should solve your problem.
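A note on why the leading D matters: CMake's -D<var>=<value> option defines a cache entry, while -C <initial-cache> pre-loads a script, so -CRANE_USE_GITEE_SOURCE=OFF is read as -C followed by a file named RANE_USE_GITEE_SOURCE=OFF. Exporting a shell variable of the same name has no effect either, since CMake cache options are not read from the environment. The sketch below is a hypothetical pre-flight helper (check_cmake_args is not part of CMake or CraneSched) that flags arguments which look like NAME=VALUE definitions but lack the -D prefix.

```shell
#!/usr/bin/env bash
# Hypothetical helper: warn about arguments that look like cache definitions
# (NAME=VALUE) but are missing the -D prefix.
check_cmake_args() {
  local arg
  for arg in "$@"; do
    case "$arg" in
      -D*=*) ;;  # well-formed cache definition, nothing to report
      -[A-Z]*=*) echo "suspicious: $arg (did you mean -D${arg#-}?)" ;;
    esac
  done
}

check_cmake_args -G Ninja -CRANE_USE_GITEE_SOURCE=OFF ..
# prints: suspicious: -CRANE_USE_GITEE_SOURCE=OFF (did you mean -DCRANE_USE_GITEE_SOURCE=OFF?)
```

The check is deliberately narrow: it only looks at single-dash arguments that start with an uppercase letter and contain an equals sign.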
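The installation documentation has indeed been poorly maintained; for example, the project no longer depends on Boost. The dependency packages on Gitee have not been updated recently, and the dependency versions have since been bumped, so we added an Error for the parts that Gitee has not caught up with and turned the Gitee option off by default. #238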
Hello, changing -CRANE_USE_GITEE_SOURCE=OFF to -DCRANE_USE_GITEE_SOURCE=OFF did resolve the previous problem, but the following error is now reported:
CMake Error at /usr/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
Could NOT find LibAIO (missing: LIBAIO_LIBRARY LIBAIO_INCLUDE_DIR)
Call Stack (most recent call first):
/usr/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:600 (_FPHSA_FAILURE_MESSAGE)
CMakeModule/FindLibAIO.cmake:8 (FIND_PACKAGE_HANDLE_STANDARD_ARGS)
CMakeLists.txt:223 (find_package)
-- Configuring incomplete, errors occurred!
[sysadmin@el8 build]$
The complete terminal output is attached: terminal_output.log
Installing the libaio library in your environment, either with dnf or from source, should resolve this.
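The LIBAIO_INCLUDE_DIR named in the error refers to the libaio header, which on EL8-family systems ships in the -devel package rather than in the runtime libaio package. A minimal sketch for checking this before re-running cmake, assuming the header is installed as /usr/include/libaio.h (the usual location; have_header is a hypothetical helper):

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if <header> exists under an include directory.
# The default directory /usr/include is an assumption about the system layout.
have_header() {  # usage: have_header <header> [include_dir]
  [ -f "${2:-/usr/include}/$1" ]
}

if have_header libaio.h; then
  echo "libaio headers found"
else
  echo "libaio headers missing; try: sudo dnf install libaio-devel"
fi
```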
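Hello, installing libaio-devel resolved the problem above, but a large number of errors like the following now appear (they differ only in the location of the CMake Error):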
CMake Error at src/CraneCtld/CMakeLists.txt:1 (add_executable):
The install of the cranectld target requires changing an RPATH from the
build tree, but this is not supported with the Ninja generator unless on an
ELF-based or XCOFF-based platform. The CMAKE_BUILD_WITH_INSTALL_RPATH
variable may be set to avoid this relinking step.
For the complete terminal output, see terminal_output_2.log
You can build with make instead, or skip install.
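Hello, thanks for the reply. There is no Makefile under either CraneSched/ or CraneSched/build, so make cannot be run. As for "skip install", do you mean commenting out the entire "install binaries" and "Install configuration files" sections at the end of CraneSched/CMakeLists.txt and then re-running cmake?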
Remove -G Ninja from
cmake -G Ninja -DCMAKE_C_COMPILER=/opt/rh/gcc-toolset-11/root/usr/bin/gcc -DCMAKE_CXX_COMPILER=/opt/rh/gcc-toolset-11/root/usr/bin/g++ -DBoost_INCLUDE_DIR=/usr/include/boost169/ -DBoost_LIBRARY_DIR=/usr/lib64/boost169/ ..
then clean up the files CMake generated and re-run cmake; a Makefile will then be generated.
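The cleanup step matters because CMake records the generator in CMakeCache.txt and refuses to reconfigure an existing build directory with a different one. A minimal sketch of the cleanup, assuming the build directory is CraneSched/build as in this thread (clean_cmake_cache is a hypothetical helper, not a CMake command):

```shell
#!/usr/bin/env bash
# Hypothetical helper: remove the files in which CMake records the previous generator.
clean_cmake_cache() {  # usage: clean_cmake_cache <build_dir>
  local dir="${1:?usage: clean_cmake_cache <build_dir>}"
  rm -rf -- "$dir/CMakeCache.txt" "$dir/CMakeFiles"
}

# Typical use from the source root (left commented out). The options are the ones
# from this thread minus -G Ninja, so CMake uses its default generator, which is
# Unix Makefiles on Linux:
#   clean_cmake_cache build
#   (cd build && cmake -DCMAKE_C_COMPILER=/opt/rh/gcc-toolset-11/root/usr/bin/gcc \
#       -DCMAKE_CXX_COMPILER=/opt/rh/gcc-toolset-11/root/usr/bin/g++ ..)
```

Other generated files (for example build.ninja) are left in place by this sketch; deleting the whole build directory is the more thorough option.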
Hello, running cmake as described above produced Could NOT find SASL2, Could NOT find libbfd, Could NOT find libdwarf, and Could NOT find Gnuplot errors, ending with the failure message Configuring incomplete, errors occurred!. Running
sudo dnf install cyrus-sasl-devel binutils-devel libdwarf libdwarf-devel gnuplot
resolved them.
With those resolved, one of two things now happens:
1. The terminal prints this error:
CMake Error at /home/sysadmin/CraneSched-master/build/_deps/backward-subbuild/backward-populate-prefix/tmp/backward-populate-gitupdate.cmake:97 (execute_process):
execute_process failed command indexes:
1: "Child return code: 128"
This error appears to be intermittent: sometimes it occurs and sometimes it does not.
2. When the error above does not occur, the same errors reported in my earlier comment (the one after installing libaio-devel) appear again, in large numbers and differing only in the location of the CMake Error:
CMake Error at src/CraneCtld/CMakeLists.txt:1 (add_executable):
The install of the cranectld target requires changing an RPATH from the
build tree, but this is not supported with the Ninja generator unless on an
ELF-based or XCOFF-based platform. The CMAKE_BUILD_WITH_INSTALL_RPATH
variable may be set to avoid this relinking step.
For the complete output of the two cases, see terminal_output_3_case_1.log and terminal_output_3_case_2.log respectively.
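The "Could NOT find" names and the packages that resolved them in this thread can be collected into a small lookup. The sketch below reads CMake output on standard input and prints the package that worked here for each missing dependency. The mapping reflects only what was reported on AlmaLinux 8.9 in this thread, package names on other distributions will differ, and suggest_packages is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map "Could NOT find <Name>" lines to the dnf packages
# that resolved them in this thread (AlmaLinux 8.9).
suggest_packages() {  # reads CMake output on stdin
  sed -n 's/.*Could NOT find \([A-Za-z0-9_]*\).*/\1/p' | sort -u |
    while read -r name; do
      case "$name" in
        LibAIO)   echo "$name: libaio-devel" ;;
        SASL2)    echo "$name: cyrus-sasl-devel" ;;
        libbfd)   echo "$name: binutils-devel" ;;
        libdwarf) echo "$name: libdwarf libdwarf-devel" ;;
        Gnuplot)  echo "$name: gnuplot" ;;
        *)        echo "$name: (no mapping known)" ;;
      esac
    done
}

printf '%s\n' 'Could NOT find LibAIO (missing: LIBAIO_LIBRARY LIBAIO_INCLUDE_DIR)' |
  suggest_packages
# prints: LibAIO: libaio-devel
```

With a saved log, the call would look like: suggest_packages < terminal_output.log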
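The first case is caused by your network. We suggest deleting the build directory and re-running cmake. If possible, we also recommend using CentOS 7.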
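The "Child return code: 128" in the first case comes from the git update step for the backward dependency, and 128 is the status git usually exits with on a fatal error such as a failed fetch, which fits the network explanation. For flaky connections, a generic retry wrapper is one option. The sketch below is an assumption-laden illustration: retry and RETRY_DELAY are hypothetical names, and whether re-running cmake in the same build directory recovers cleanly is not confirmed in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command up to <max_attempts> times until it succeeds.
# RETRY_DELAY (seconds between attempts, default 5) is an illustrative knob.
retry() {  # usage: retry <max_attempts> <command> [args...]
  local max="$1" attempt=1
  shift
  until "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying..." >&2
    attempt=$((attempt + 1))
    sleep "${RETRY_DELAY:-5}"
  done
}

# Example (commented out), run from a freshly created build directory:
#   retry 3 cmake -DCRANE_USE_GITEE_SOURCE=OFF ..
```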
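Hello, after deleting and recreating the build directory and re-running cmake, a Makefile appeared in the build directory, but make reported the following error while running: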
[ 70%] Running gRPC C++ protocol buffer compiler on PublicDefs.proto
/home/sysadmin/CraneSched-master/generated/protos/: No such file or directory
make[2]: *** [protos/CMakeFiles/crane_proto_lib.dir/build.make:74: /home/sysadmin/CraneSched-master/generated/protos/PublicDefs.grpc.pb.cc] Error 1
make[1]: *** [CMakeFiles/Makefile2:10227: protos/CMakeFiles/crane_proto_lib.dir/all] Error 2
make: *** [Makefile:166: all] Error 2
For the complete cmake and make output, see the attachments terminal_output_4_cmake.log and terminal_output_4_make.log
Just mkdir the generated/protos/ directory. The CMakeLists file is indeed missing a line: ninja creates the directory automatically, but make does not, which causes the error.
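As a concrete form of that workaround, the directory can be created relative to the source root before running make. A minimal sketch, assuming the source root is the directory that contains the top-level CMakeLists.txt (ensure_proto_out_dir is a hypothetical helper):

```shell
#!/usr/bin/env bash
# Hypothetical helper: create the protobuf output directory that make expects.
ensure_proto_out_dir() {  # usage: ensure_proto_out_dir <source_root>
  local root="${1:?usage: ensure_proto_out_dir <source_root>}"
  mkdir -p -- "$root/generated/protos"  # -p: no error if it already exists
}

# Example (commented out), using the path from the error message above:
#   ensure_proto_out_dir /home/sysadmin/CraneSched-master
#   (cd /home/sysadmin/CraneSched-master/build && make)
```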
Hello, after running make as you advised, the following message appeared:
[100%] Building CXX object src/Craned/CMakeFiles/craned.dir/CranedServer.cpp.o
/home/sysadmin/CraneSched-master/src/Craned/CranedServer.cpp: In member function ‘virtual grpc::Status Craned::CranedServiceImpl::SrunXStream(grpc::ServerContext*, grpc::ServerReaderWriter<crane::grpc::SrunXStreamReply, crane::grpc::SrunXStreamRequest>*)’:
/home/sysadmin/CraneSched-master/src/Craned/CranedServer.cpp:215:56: warning: ‘CraneErr Craned::TaskManager::SpawnInteractiveTaskAsync(uint32_t, std::string, std::__cxx11::list<std::__cxx11::basic_string<char> >, std::function<void(std::__cxx11::basic_string<char>&&, void*)>, std::function<void(bool, int, void*)>)’ is deprecated [-Wdeprecated-declarations]
215 | err = g_task_mgr->SpawnInteractiveTaskAsync(
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
216 | task_id, request.exec_info().executive_path(), std::move(args),
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
217 | std::move(output_callback), std::move(finish_callback));
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /home/sysadmin/CraneSched-master/src/Craned/CranedServer.h:24,
from /home/sysadmin/CraneSched-master/src/Craned/CranedServer.cpp:17:
/home/sysadmin/CraneSched-master/src/Craned/TaskManager.h:190:27: note: declared here
190 | [[deprecated]] CraneErr SpawnInteractiveTaskAsync(
| ^~~~~~~~~~~~~~~~~~~~~~~~~
[100%] Building CXX object src/Craned/CMakeFiles/craned.dir/Craned.cpp.o
[100%] Linking CXX executable craned
[100%] Built target craned
All other targets finished building as well. Does this mean the CraneSched back end is now fully compiled? The complete output is here: terminal_output_5_make.log
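I am now compiling the scheduler front end. Having followed the documentation up to step 4, the "generate proto files" step under Crane-FrontEnd/protos fails with the following error: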
[root@el8 protos]# protoc --go_out=../generated --go-grpc_out=../generated ./*
protoc-gen-go-grpc: program not found or is not executable
Please specify a program using absolute path or make sure the program is available in your PATH system variable
--go-grpc_out: protoc-gen-go-grpc: Plugin failed with status code 1.
When installing the plugins, I set the Alibaba Cloud goproxy mirror first and then installed them:
export GOPROXY=https://mirrors.aliyun.com/goproxy/
go install google.golang.org/protobuf/cmd/protoc-gen-go@latest
go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@latest
The earlier steps, installing protoc and pulling the code, all completed without errors.
The back end can be considered compiled. For the front-end problem, check your PATH and add the directory containing that plugin's binary to it.
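For context, go install places binaries in GOBIN if it is set, otherwise in the bin directory under GOPATH, which defaults to $HOME/go. A minimal sketch for putting that directory on PATH, assuming GOPATH holds a single entry (go_bin_dir and add_to_path are hypothetical helpers):

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the directory where `go install` puts binaries.
# Assumes GOPATH has a single entry; falls back to Go's default of $HOME/go.
go_bin_dir() {
  local gobin
  if command -v go >/dev/null 2>&1; then
    gobin="$(go env GOBIN)"
    if [ -n "$gobin" ]; then
      echo "$gobin"
    else
      echo "$(go env GOPATH)/bin"
    fi
  else
    echo "$HOME/go/bin"
  fi
}

# Hypothetical helper: prepend a directory to PATH unless it is already there.
add_to_path() {  # usage: add_to_path <dir>
  case ":$PATH:" in
    *":$1:"*) ;;
    *) PATH="$1:$PATH" ;;
  esac
}

add_to_path "$(go_bin_dir)"
export PATH
command -v protoc-gen-go-grpc >/dev/null 2>&1 ||
  echo "protoc-gen-go-grpc is still not on PATH" >&2
```

Note that the plugins land in the home directory of the user who ran go install, so running protoc as a different user (for example root) needs that user's PATH adjusted as well.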
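Hello, after uninstalling Go and its modules and redoing the setup from scratch according to the front-end installation document, the problem above was resolved. However, at step 4, compiling the binaries, the builds of cbatch, ccancel, ccontrol, and cinfo all reported that Go 1.20 or later is required. So I uninstalled the Go 1.17.3 installed per the tutorial and installed Go 1.20.12 from the AlmaLinux PowerTools repository with dnf install golang. The remaining steps followed the installation document (GOROOT and GOPATH must be set to the values reported by go env), and the front end finally compiled.
Next I will try to finish configuring the back end (starting from step 5, configuring PAM), which I forgot to continue after compiling it.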
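The Go version requirement can be checked up front rather than discovered at build time. A minimal sketch, assuming a sort that supports -V (GNU coreutils does) and taking the 1.20 minimum from the build messages reported above (version_ge is a hypothetical helper):

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if version <have> is greater than or equal to <need>.
# Relies on `sort -V` (version sort), which GNU coreutils provides.
version_ge() {  # usage: version_ge <have> <need>
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

if command -v go >/dev/null 2>&1; then
  # `go version` prints e.g. "go version go1.20.12 linux/amd64"
  have="$(go version | awk '{print $3}')"
  have="${have#go}"
  if version_ge "$have" 1.20; then
    echo "Go $have is new enough"
  else
    echo "Go $have is too old; 1.20 or later is required"
  fi
fi
```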
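Hello, in your PAM configuration section it is not possible to tell which lines of /etc/pam.d/sshd are the red ones, so I changed the sshd file to match your tutorial exactly.
At the end of the MongoDB section, db.auth("admin","123456") is written with Chinese (full-width) parentheses. After changing them to ASCII parentheses, the following message was reported: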
test> db.auth("admin","123456")
MongoServerError[AuthenticationFailed]: Authentication failed.
Yet the earlier user-creation step had succeeded:
test> use admin
switched to db admin
admin> db.createUser({
... user:'admin', pwd:'123456', roles:[{ role:'root',db:'admin'}]
... })
{ ok: 1 }
admin>
The server was then shut down like this:
admin> db.shutdownServer()
MongoNetworkError: connection 5 to 127.0.0.1:27017 closed
admin> quit
(The password for admin really was set to 123456.)
We have not worked with MongoDB before. Did we get something wrong?
Please look these configuration steps up on Google; this is fairly basic material.
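For readers who hit the same Authentication failed message: one likely cause, judging from the test> prompt in the transcript, is that db.auth was run while the current database was test, whereas the user was created in admin. db.auth authenticates against the current database, so switching with use admin first, or naming the authentication database at login, is worth trying. The sketch below only prints a login command instead of running it. It assumes mongosh is the client in use, that the server listens on 127.0.0.1:27017 as in the transcript, and mongo_login_cmd is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (not run) a mongosh login command that authenticates
# against the database in which the user was created.
# Assumptions: mongosh client, server on 127.0.0.1:27017, auth database "admin".
mongo_login_cmd() {  # usage: mongo_login_cmd <user> [auth_db]
  printf 'mongosh --host 127.0.0.1 --port 27017 --authenticationDatabase %s -u %s -p\n' \
    "${2:-admin}" "${1:?usage: mongo_login_cmd <user> [auth_db]}"
}

mongo_login_cmd admin
# prints: mongosh --host 127.0.0.1 --port 27017 --authenticationDatabase admin -u admin -p
```

Leaving -p as the last option without a value makes mongosh prompt for the password, which keeps it out of the shell history.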
Deployment environment: AlmaLinux 8.9
Items in the documentation that need adjusting:
- chrony has replaced ntp, so the clock-synchronization step should be changed to dnf install chrony.
- For libcgroup-devel boost169-devel boost169-static zlib-devel zlib-static: dnf config-manager --set-enabled powertools, dnf install almalinux-release-devel, and dnf install epel-release (see the official AlmaLinux Wiki).
- devtoolset-11: use gcc-toolset from the AlmaLinux Appstream repository instead: dnf install gcc-toolset-11 (see the official Red Hat documentation).
- Git is version 2.39 and CMake is version 3.26, both of which meet the scheduler's minimum requirements, so there is no need to download and build CMake separately or to install rh-git218.
- With gcc-toolset in place of devtoolset, use scl enable gcc-toolset-11 bash to start a shell session that uses GCC 11 (rather than source scl_source enable devtoolset-11).
- With gcc-toolset in place of devtoolset, the CMake and Ninja configuration options for the first build must also be changed accordingly, to
cmake -G Ninja -DCMAKE_C_COMPILER=/opt/rh/gcc-toolset-11/root/usr/bin/gcc -DCMAKE_CXX_COMPILER=/opt/rh/gcc-toolset-11/root/usr/bin/g++ -DBoost_INCLUDE_DIR=/usr/include/boost169/ -DBoost_LIBRARY_DIR=/usr/lib64/boost169/ ..
Problems encountered during deployment:
We downloaded the CraneSched source archive from GitHub and compiled it. Running
cmake -G Ninja -DCMAKE_C_COMPILER=/opt/rh/gcc-toolset-11/root/usr/bin/gcc -DCMAKE_CXX_COMPILER=/opt/rh/gcc-toolset-11/root/usr/bin/g++ -DBoost_INCLUDE_DIR=/usr/include/boost169/ -DBoost_LIBRARY_DIR=/usr/lib64/boost169/ ..
produced an error.
Running export CRANE_USE_GITEE_SOURCE=OFF in the shell first gave the same error.
Adding -CRANE_USE_GITEE_SOURCE=OFF to the command produced an error in which CRANE_USE_GITEE_SOURCE was, for some reason, recognized as RANE_USE_GITEE_SOURCE.
Where does this problem come from, and how should it be solved?