shikanon / kubeflow-manifests

One-click Kubeflow installation files for users in mainland China
GNU General Public License v3.0
338 stars 117 forks

Unknown database 'mlpipeline' #29

Closed gogogwwb closed 3 years ago

gogogwwb commented 3 years ago

Following the guide, creating a new pipeline fails with this error:

{"error":"Failed to list runs.: InternalServerError: Failed to list runs: Error 1049: Unknown database 'mlpipeline': Error 1049: Unknown database 'mlpipeline'","code":13,"message":"Failed to list runs.: InternalServerError: Failed to list runs: Error 1049: Unknown database 'mlpipeline': Error 1049: Unknown database 'mlpipeline'","details":[{"@type":"type.googleapis.com/api.Error","error_message":"Internal Server Error","error_details":"Failed to list runs.: InternalServerError: Failed to list runs: Error 1049: Unknown database 'mlpipeline': Error 1049: Unknown database 'mlpipeline'"}]}

The ml-pipeline pod log shows the problem:

I0531 12:44:55.893864 6 interceptor.go:29] /api.RunService/ListRuns handler starting
I0531 12:44:55.894043 6 util.go:313] Getting user identity...
I0531 12:44:55.894057 6 util.go:323] User: admin@example.com, ResourceAttributes: &ResourceAttributes{Namespace:kubeflow-user-example-com,Verb:list,Group:pipelines.kubeflow.org,Version:v1beta1,Resource:runs,Subresource:,Name:,}
I0531 12:44:55.894073 6 util.go:324] Authorizing request...
I0531 12:44:55.896315 6 util.go:331] Authorized user 'admin@example.com': &ResourceAttributes{Namespace:kubeflow-user-example-com,Verb:list,Group:pipelines.kubeflow.org,Version:v1beta1,Resource:runs,Subresource:,Name:,}
E0531 12:44:55.900792 6 run_store.go:98] Failed to start transaction to list runs
I0531 12:44:55.900884 6 error.go:247] Error 1049: Unknown database 'mlpipeline'
InternalServerError: Failed to list runs: Error 1049: Unknown database 'mlpipeline'
github.com/kubeflow/pipelines/backend/src/common/util.NewInternalServerError
    /go/src/github.com/kubeflow/pipelines/backend/src/common/util/error.go:142
github.com/kubeflow/pipelines/backend/src/apiserver/storage.(*RunStore).ListRuns.func1
    /go/src/github.com/kubeflow/pipelines/backend/src/apiserver/storage/run_store.go:82
github.com/kubeflow/pipelines/backend/src/apiserver/storage.(*RunStore).ListRuns
    /go/src/github.com/kubeflow/pipelines/backend/src/apiserver/storage/run_store.go:99
github.com/kubeflow/pipelines/backend/src/apiserver/resource.(*ResourceManager).ListRuns
    /go/src/github.com/kubeflow/pipelines/backend/src/apiserver/resource/resource_manager.go:441
github.com/kubeflow/pipelines/backend/src/apiserver/server.(*RunServer).ListRuns
    /go/src/github.com/kubeflow/pipelines/backend/src/apiserver/server/run_server.go:219
github.com/kubeflow/pipelines/backend/api/go_client._RunService_ListRuns_Handler.func1
    /go/src/github.com/kubeflow/pipelines/backend/api/go_client/run.pb.go:1537
main.apiServerInterceptor
    /go/src/github.com/kubeflow/pipelines/backend/src/apiserver/interceptor.go:30
github.com/kubeflow/pipelines/backend/api/go_client._RunService_ListRuns_Handler
    /go/src/github.com/kubeflow/pipelines/backend/api/go_client/run.pb.go:1539
google.golang.org/grpc.(*Server).processUnaryRPC
    /go/pkg/mod/google.golang.org/grpc@v1.34.0/server.go:1210
google.golang.org/grpc.(*Server).handleStream
    /go/pkg/mod/google.golang.org/grpc@v1.34.0/server.go:1533
google.golang.org/grpc.(*Server).serveStreams.func1.2
    /go/pkg/mod/google.golang.org/grpc@v1.34.0/server.go:871
runtime.goexit
    /usr/local/go/src/runtime/asm_amd64.s:1357
Failed to list runs.
github.com/kubeflow/pipelines/backend/src/common/util.(*UserError).wrap
    /go/src/github.com/kubeflow/pipelines/backend/src/common/util/error.go:240
github.com/kubeflow/pipelines/backend/src/common/util.Wrap
    /go/src/github.com/kubeflow/pipelines/backend/src/common/util/error.go:273
github.com/kubeflow/pipelines/backend/src/apiserver/server.(*RunServer).ListRuns
    /go/src/github.com/kubeflow/pipelines/backend/src/apiserver/server/run_server.go:221
github.com/kubeflow/pipelines/backend/api/go_client._RunService_ListRuns_Handler.func1
    /go/src/github.com/kubeflow/pipelines/backend/api/go_client/run.pb.go:1537

What is causing this?

shikanon commented 3 years ago

@gogogwwb The main YAML file for mlpipeline is 017-pipeline-env-platform-agnostic-multi-user.yaml. Try deleting its resources and creating them again:

kubectl delete -f manifest1.3/017-pipeline-env-platform-agnostic-multi-user.yaml
kubectl apply -f manifest1.3/017-pipeline-env-platform-agnostic-multi-user.yaml
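Before or after recreating the resources, it can help to confirm whether MySQL actually has the `mlpipeline` database. The following is a minimal sketch, not part of the repository: it assumes the MySQL instance runs as a Deployment named `mysql` in the `kubeflow` namespace and accepts `root` without a password. Adjust those values if your install differs. The function names are hypothetical helpers.

```shell
# Assumptions (not confirmed by this thread): namespace "kubeflow",
# Deployment "mysql", MySQL root user with an empty password.
NAMESPACE="${NAMESPACE:-kubeflow}"
MYSQL_TARGET="${MYSQL_TARGET:-deploy/mysql}"
DB_NAME="${DB_NAME:-mlpipeline}"

# Build the SQL statement that lists databases matching the given name.
db_check_sql() {
  printf "SHOW DATABASES LIKE '%s';" "$1"
}

# Run the statement inside the MySQL container.
# -N suppresses the column header, so the output is either the
# database name or nothing at all.
check_db() {
  kubectl -n "$NAMESPACE" exec "$MYSQL_TARGET" -- \
    mysql -u root -N -e "$(db_check_sql "$DB_NAME")"
}

# Usage: check_db
```

If `check_db` prints nothing, the database is missing, which matches the `Error 1049` in the log above.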

gogogwwb commented 3 years ago

Solved, thanks.

gogogwwb commented 3 years ago

Following the guide, running an experiment hits an image pull problem:

This step is in Pending state with this message: ImagePullBackOff: Back-off pulling image "gcr.io/ml-pipeline/argoexec:v2.12.9-license-compliance"

Is there a fix?

Also, running AutoML as described in the guide fails with:

Internal error occurred: failed calling webhook "mutating.experiment.katib.kubeflow.org": Post https://katib-controller.kubeflow.svc:443/mutate-experiments?timeout=30s: x509: certificate signed by unknown authority

shikanon commented 3 years ago

@gogogwwb You can use pipeline-env-platform-agnostic-multi-user.yaml in the patch directory, which swaps the images for replacements hosted on Alibaba Cloud.

kubectl delete -f patch/pipeline-env-platform-agnostic-multi-user.yaml
kubectl apply -f patch/pipeline-env-platform-agnostic-multi-user.yaml
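To see whether any pods still reference `gcr.io` images after applying the patch, something like the following sketch may be useful. It assumes the run pods live in the `kubeflow-user-example-com` namespace seen in the log earlier in this thread. The helper names are hypothetical.

```shell
# Assumption: pipeline run pods are created in the user namespace
# that appears in the ml-pipeline log above.
NAMESPACE="${NAMESPACE:-kubeflow-user-example-com}"

# Print one line per pod: pod name, a tab, then its container images.
list_pod_images() {
  kubectl -n "$NAMESPACE" get pods \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{range .spec.containers[*]}{.image}{" "}{end}{"\n"}{end}'
}

# Keep only the lines that mention an image hosted on gcr.io.
filter_gcr() {
  grep 'gcr\.io/'
}

# Usage: list_pod_images | filter_gcr
```

Any line that survives the filter names a pod that would still try to pull from `gcr.io`.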

ly574605863 commented 3 years ago

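I ran into the same problem. Deleting and recreating the pipeline-related resources fixed it, and pipelines worked and displayed data again. Later, though, after things like logging in and opening a notebook, the problem came back. Does anyone know how to handle this?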

ly574605863 commented 3 years ago

I manually created this database inside the mysql container, and it seems to be working now...
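The comment does not show the exact command used. A minimal sketch of one way to do it, assuming a Deployment named `mysql` in the `kubeflow` namespace and a `root` user with no password (all assumptions, as are the helper names):

```shell
# Assumptions (not stated in the thread): namespace "kubeflow",
# Deployment "mysql", MySQL root user with an empty password.
NAMESPACE="${NAMESPACE:-kubeflow}"
MYSQL_TARGET="${MYSQL_TARGET:-deploy/mysql}"
DB_NAME="${DB_NAME:-mlpipeline}"

# Build an idempotent statement: it does nothing if the database exists.
create_db_sql() {
  printf 'CREATE DATABASE IF NOT EXISTS `%s`;' "$1"
}

# Run the statement inside the MySQL container.
create_db() {
  kubectl -n "$NAMESPACE" exec "$MYSQL_TARGET" -- \
    mysql -u root -e "$(create_db_sql "$DB_NAME")"
}

# Usage: create_db
```

`IF NOT EXISTS` makes the command safe to rerun if the database disappears again.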