WeBankFinTech / Prophecis

Prophecis is a one-stop cloud native machine learning platform.
https://github.com/WeBankFinTech/Prophecis
Apache License 2.0

Prophecis-0.3.0 deployment steps #57

Open wallyell opened 2 years ago

Install Helm 3.2.1:

wget https://get.helm.sh/helm-v3.2.1-linux-amd64.tar.gz

tar -xzvf helm-v3.2.1-linux-amd64.tar.gz

cd linux-amd64/

mv helm /usr/bin/

helm version

helm repo list

helm repo add aliyuncs https://apphub.aliyuncs.com
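
Before continuing, it may be worth confirming that the installed Helm is at least the version used in this guide. The following is a minimal sketch, assuming GNU `sort -V` is available; `version_ge` is a hypothetical helper introduced here for illustration, not part of Helm.

```shell
# version_ge A B: succeed when version A is greater than or equal to version B.
# Relies on GNU sort -V (version sort) to order the two strings.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example usage (requires helm on PATH):
#   installed=$(helm version --template '{{.Version}}' | sed 's/^v//')
#   version_ge "$installed" 3.2.1 || echo "Helm $installed is older than 3.2.1" >&2
```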

Install Istio 1.8.2

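wget https://github.com/istio/istio/releases/download/1.8.2/istio-1.8.2-linux-amd64.tar.gz

# Extract to /opt so that the path below resolves (unpacks to /opt/istio-1.8.2)
tar -xzvf istio-1.8.2-linux-amd64.tar.gz -C /opt

# Add istioctl to PATH
export PATH=$PATH:/opt/istio-1.8.2/bin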

# Deploy
istioctl install
# Verify that the Istio pods are Running
kubectl -n istio-system get pods
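
The verification step above can be scripted rather than checked by eye. This is a rough sketch: `not_ready_pods` is a hypothetical helper that assumes the default `kubectl get pods --no-headers` column layout (NAME, READY, STATUS, RESTARTS, AGE).

```shell
# not_ready_pods: read `kubectl get pods --no-headers` output on stdin and
# print the name of every pod whose STATUS is neither Running nor Completed.
not_ready_pods() {
    awk '$3 != "Running" && $3 != "Completed" { print $1 }'
}

# Example usage (requires cluster access):
#   kubectl -n istio-system get pods --no-headers | not_ready_pods
```

Empty output means every pod has reached Running or Completed. `kubectl -n istio-system wait --for=condition=Ready pods --all --timeout=300s` is an alternative that blocks until the pods are ready.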

Install Seldon Core 1.13.0

wget https://github.com/SeldonIO/seldon-core/archive/refs/tags/v1.13.0.tar.gz

# Extract the source archive (unpacks to seldon-core-1.13.0/)
tar -xzvf v1.13.0.tar.gz

cd seldon-core-1.13.0/helm-charts

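# Create the target namespace first if it does not already exist
kubectl create namespace seldon-system

helm install seldon-core seldon-core-operator --set usageMetrics.enabled=true --namespace seldon-system --set istio.enabled=true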

# If the image pull fails, pull it from a mirror and retag it under the expected name
docker pull registry.cn-shenzhen.aliyuncs.com/shikanon/google_containers.spartakus-amd64:v1.1.0
docker tag registry.cn-shenzhen.aliyuncs.com/shikanon/google_containers.spartakus-amd64:v1.1.0 gcr.io/google_containers/spartakus-amd64:v1.1.0

# List the release
helm list -n seldon-system

# Uninstall (only needed if you want to reinstall)
helm del seldon-core -n seldon-system

Install NFS:

yum install -y nfs-utils rpcbind

systemctl start rpcbind
systemctl enable rpcbind

systemctl start nfs-server
systemctl enable nfs-server

Environment preparation

1. Add the Docker configuration file
vim /root/.docker/config.json
# Add the following configuration
{
        "auths": {
                "": {
                        "auth": ""
                }
        },
        "HttpHeaders": {
                "User-Agent": "Docker-Client/20.10.8-ce (linux)"
        }
}
2. Export the shared directories on the NFS server
mkdir -p /data/bdap-ss/mlss-data/tmp
mkdir -p /mlss/di/jobs/prophecis
mkdir -p /cosdata/mlss-test

vim /etc/exports
/data/bdap-ss/mlss-data/tmp xx.xx.xx.0/24(rw,sync,no_root_squash)
/mlss/di/jobs/prophecis xx.xx.xx.0/24(rw,sync,no_root_squash)
/cosdata/mlss-test xx.xx.xx.0/24(rw,sync,no_root_squash)

exportfs -arv
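
The three export entries differ only in their path, so they can be generated instead of typed by hand. This is a minimal sketch; `gen_exports` is a hypothetical helper, and the subnet in the usage line is an example value standing in for the masked `xx.xx.xx.0/24`.

```shell
# gen_exports SUBNET: print one /etc/exports entry per shared directory.
# SUBNET is the client network in CIDR form.
gen_exports() {
    for dir in /data/bdap-ss/mlss-data/tmp /mlss/di/jobs/prophecis /cosdata/mlss-test; do
        printf '%s %s(rw,sync,no_root_squash)\n' "$dir" "$1"
    done
}

# Example usage (as root on the NFS server; 192.168.1.0/24 is a placeholder):
#   gen_exports 192.168.1.0/24 >> /etc/exports && exportfs -arv
```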
3. Mount the shared directories on the NFS clients
showmount -e xx.xx.xx.xx

mkdir -p /data/bdap-ss/mlss-data/tmp
mkdir -p /mlss/di/jobs/prophecis
mkdir -p /cosdata/mlss-test

mount xx.xx.xx.xx:/data/bdap-ss/mlss-data/tmp /data/bdap-ss/mlss-data/tmp
mount xx.xx.xx.xx:/mlss/di/jobs/prophecis /mlss/di/jobs/prophecis
mount xx.xx.xx.xx:/cosdata/mlss-test /cosdata/mlss-test
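
Mounts made with `mount` do not survive a reboot. One common way to make them persistent is through `/etc/fstab`. This is a sketch under that assumption, not part of the original steps; `gen_fstab` is a hypothetical helper and the server address in the usage line is a placeholder.

```shell
# gen_fstab SERVER: print /etc/fstab entries that mount each export from SERVER
# at the same path locally. _netdev defers mounting until the network is up.
gen_fstab() {
    for dir in /data/bdap-ss/mlss-data/tmp /mlss/di/jobs/prophecis /cosdata/mlss-test; do
        printf '%s:%s %s nfs defaults,_netdev 0 0\n' "$1" "$dir" "$dir"
    done
}

# Example usage (as root on each client; 192.168.1.10 is a placeholder):
#   gen_fstab 192.168.1.10 >> /etc/fstab && mount -a
```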
4. Fix known issues

(1) Duplicate files:

In /install/Prophecis/templates/di, move learner-configmap.yml and learner-rsa-keys.yml to /install/Prophecis/templates/services, then delete the /install/Prophecis/templates/di folder.
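
The move-then-delete step could be scripted roughly as follows. `merge_di_templates` is a hypothetical helper; it takes the chart directory as its argument and removes everything left in `templates/di`, so check that directory first.

```shell
# merge_di_templates CHART_DIR: move the two learner manifests from
# templates/di into templates/services, then remove the di directory.
# The directory is removed only if the move succeeded.
merge_di_templates() {
    chart=$1
    mv "$chart/templates/di/learner-configmap.yml" \
       "$chart/templates/di/learner-rsa-keys.yml" \
       "$chart/templates/services/" &&
    rm -r "$chart/templates/di"
}

# Example usage:
#   merge_di_templates /install/Prophecis
```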

(2) Image registry address

In the installation configuration files, replace every image address that starts with uat.sf.dockerhub.stgwebank/webank/prophecis with wedatasphere/prophecis.
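
A bulk replacement like this can be done with `grep` and `sed`. The sketch below assumes GNU grep, xargs, and sed; `replace_registry` is a hypothetical helper, and the directory passed to it is whichever directory holds your installation files.

```shell
# replace_registry DIR: in every file under DIR that mentions the internal
# registry prefix, rewrite it in place to the public wedatasphere/prophecis
# prefix (GNU sed -i). Image tags after the prefix are left untouched.
replace_registry() {
    grep -rlZF 'uat.sf.dockerhub.stgwebank/webank/prophecis' "$1" |
        xargs -0 -r sed -i 's#uat\.sf\.dockerhub\.stgwebank/webank/prophecis#wedatasphere/prophecis#g'
}

# Example usage:
#   replace_registry /install
#   grep -rn 'uat.sf.dockerhub' /install   # should print nothing afterwards
```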

(3) Update the SQL database settings

The database creation files are under install/sql.

In both prophecis.sql and prophecis-data.sql, the first two lines name the database:

CREATE DATABASE IF NOT EXISTS `mlss_gzpc_bdap_uat_01` /*!40100 DEFAULT CHARACTER SET utf8 */;
USE `mlss_gzpc_bdap_uat_01`;

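Change this to your own database name; it must match db -> name in the MySQL settings in step 5.

Then paste the contents of prophecis.sql and prophecis-data.sql, in that order, into a SQL editor connected to the database and run them to create the tables and data.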
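
For those who prefer the command line to a SQL editor, the rename and import might look like the sketch below. `set_db_name` is a hypothetical helper that assumes GNU sed and a database name without `/` characters. The host, port, user, and database name in the usage lines are the sample values from values.yaml, and the MySQL user is assumed to have permission to create databases.

```shell
# set_db_name FILE NAME: replace the sample database name on the first two
# lines (CREATE DATABASE / USE) of FILE with NAME, in place (GNU sed -i).
# Later lines are not touched.
set_db_name() {
    sed -i "1,2s/mlss_gzpc_bdap_uat_01/$2/" "$1"
}

# Example usage (run the schema file before the data file):
#   set_db_name install/sql/prophecis.sql prophecis_db
#   set_db_name install/sql/prophecis-data.sql prophecis_db
#   mysql -h 127.0.0.1 -P 3306 -u prophecis -p < install/sql/prophecis.sql
#   mysql -h 127.0.0.1 -P 3306 -u prophecis -p < install/sql/prophecis-data.sql
```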

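5. Update the configuration

The following parts of /install/Prophecis/values.yaml need to be changed:

# Change to your own MySQL connection settings, user name, and password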
db:
  server: 127.0.0.1
  port: 3306
  name: prophecis_db
  user: prophecis
  pwd: prophecis@wedatasphere

# Address users open in the browser; change it to the IP of the host node
gateway:
  address: 127.0.0.1
  port: 30778

# Super administrator user name and password; change as needed, but they must match the database table t_superadmin
admin:
    user: hadoop
    password: hadoop

Install the components

kubectl create namespace prophecis

kubectl label nodes xx.xx.xx.xx mlss-node-role=platform
# If there are GPU compute nodes, label them NVIDIAGPU
kubectl label nodes xx.xx.xx.xx hardware-type=NVIDIAGPU

## Install the Notebook Controller component
helm install notebook-controller ./notebook-controller
## Install the MinIO component
helm install minio-prophecis --namespace prophecis ./MinioDeployment
## Install the Prophecis component
helm install prophecis ./Prophecis

# List and uninstall releases
helm list --all
helm del prophecis --namespace default
helm del notebook-controller --namespace default
helm del minio-prophecis --namespace prophecis