TankLabTJU / INFless

The source code of INFless, a native serverless platform for AI inference.
GNU General Public License v3.0

Pod called gatewaydev is stuck in CrashLoopBackOff mode #2

Open linhl1218 opened 1 year ago

linhl1218 commented 1 year ago

After deploying INFless into a Kubernetes cluster with kubectl apply -f yaml/inuse, I ran kubectl get pods -n openfaasdev and found the following problem.

NAME                                    READY   STATUS             RESTARTS   AGE
basic-auth-plugindev-7648576d9d-sc96j   1/1     Running            3          3d23h
gatewaydev-64566bc769-g9jvw             1/2     CrashLoopBackOff   1076       3d23h
prometheusdev-557d96c5c5-qzz9s          1/1     Running            3          3d23h

I then pulled the log of the 'faas-netesdev' container in the failing pod:

09:25:37.874958 repository: read clusterCapConfig.yml successfully
09:25:37.874972 open ./yaml/profiler/resnet-50-profile-results.txt: no such file or directory
panic: runtime error: index out of range [1] with length 1

goroutine 1 [running]:
github.com/openfaas/faas-netes/gpu/controller.initModel(0xc00023cdb0, 0x1357790, 0x9, 0x137a7b6, 0x2d, 0xc00023cd80)
    /go/src/github.com/openfaas/faas-netes/gpu/controller/estimator.go:55 +0x6c6
github.com/openfaas/faas-netes/gpu/controller.InitProfiler()
    /go/src/github.com/openfaas/faas-netes/gpu/controller/estimator.go:35 +0x132
main.main()
    /go/src/github.com/openfaas/faas-netes/main.go:171 +0x1b9

How can I generate resnet-50-profile-results.txt?

linhl1218 commented 1 year ago

faas-netes also seems to have many other errors, such as being unable to listen or to respond to a login from faas-cli.