fission / fission

Fast and Simple Serverless Functions for Kubernetes
https://fission.io
Apache License 2.0

storagesvc start failed #3050

Open cyxr001 opened 4 weeks ago

cyxr001 commented 4 weeks ago

Fission/Kubernetes version

$ fission version
client:
  fission/core:
    BuildDate: "2024-10-04T07:43:36Z"
    GitCommit: 352090d0
    Version: v1.20.5
server:
  fission/core:
    BuildDate: "2024-10-04T07:43:36Z"
    GitCommit: 352090d0
    Version: v1.20.5

$ kubectl version
Client Version: v1.28.0
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.28.0

Kubernetes platform (e.g. Google Kubernetes Engine)

Describe the bug

I tried to remove the installation many times.

The installation document is https://fission.io/docs/installation, and the commands I ran were:

export FISSION_NAMESPACE="fission"
kubectl create namespace $FISSION_NAMESPACE
kubectl create -k "github.com/fission/fission/crds/v1?ref=v1.20.5"
helm repo add fission-charts https://fission.github.io/fission-charts/
helm repo update
helm install --version v1.20.5 --namespace $FISSION_NAMESPACE fission fission-charts/fission-all

The output of the last command (helm install) is:

W1028 14:14:58.552648  261800 warnings.go:70] metadata.name: this is used in Pod names and hostnames, which can result in surprising behavior; a DNS label is recommended: [must not contain dots]
NAME: fission
LAST DEPLOYED: Mon Oct 28 14:14:55 2024
NAMESPACE: fission
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
1. Install the client CLI.

Mac:
  $ curl -Lo fission https://github.com/fission/fission/releases/download/v1.20.5/fission-v1.20.5-darwin-amd64 && chmod +x fission && sudo mv fission /usr/local/bin/

Linux:
  $ curl -Lo fission https://github.com/fission/fission/releases/download/v1.20.5/fission-v1.20.5-linux-amd64 && chmod +x fission && sudo mv fission /usr/local/bin/

Windows:
  For Windows, you can use the linux binary on WSL. Or you can download this windows executable: https://github.com/fission/fission/releases/download/v1.20.5/fission-v1.20.5-windows-amd64.exe

2. You're ready to use Fission!
  You can create fission resources in the namespace "default"

  # Create an environment
  $ fission env create --name nodejs --image fission/node-env --namespace default

  # Get a hello world
  $ curl https://raw.githubusercontent.com/fission/examples/master/nodejs/hello.js > hello.js

  # Register this function with Fission
  $ fission function create --name hello --env nodejs --code hello.js --namespace default

  # Run this function
  $ fission function test --name hello --namespace default
  Hello, world!

Then I tested the nodejs function and it succeeded, but when I used Go, it failed.

I suspected that storagesvc was the cause, so I ran fission check:

fission-services
--------------------
√ executor is running fine
√ router is running fine
× storagesvc deployment is not running
√ webhook is running fine

fission-version
--------------------
√ fission is up-to-date

Then I ran kubectl describe on the storagesvc pod. The error is:

 Warning  FailedScheduling  4m26s (x3 over 14m)  default-scheduler  0/2 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/2 nodes are available: 2 No preemption victims found for incoming pod..

This is a PV/PVC problem. Running kubectl get pvc -n fission shows that fission-storage-pvc is Pending:

 Normal  FailedBinding  3m22s (x62 over 18m)  persistentvolume-controller  no persistent volumes available for this claim and no storage class is set

Then I created a PV, and the PVC bound to it.

Expected result

How can I change the PV to dvp when I deploy?

Additional context

Another problem: every time I install with helm, the release name in the output is fission. But when I try to uninstall, helm uninstall fission fails with Error: uninstall: Release not loaded: fission: release: not found, and helm list prints nothing.

So every time I want to uninstall, I delete the fission namespace directly and delete the CRDs with the suffix 'fission.io'. Is that OK?

soharab-ic commented 4 weeks ago
cyxr001 commented 4 weeks ago

@soharab-ic Thanks for your response. After adding -n fission, I can see the installed fission release. I think this explanation could be added to the document https://fission.io/docs/installation/uninstallation/

After testing it, I have some other questions.

package main

import (
	"net/http"
)

// Handler is the entry point for this fission function
func Handler(w http.ResponseWriter, r *http.Request) {
	msg := "Hello, world!\n"
	w.Write([]byte(msg))
}


I have a function like this. When I create it with `fission function create --name aago --env go  --code ee.go  --entrypoint Handler`, it fails with `function request timeout`, although I can get the source code with `fission fn get --name aago` or `fission pkg getsrc --name aago-xxxx`.
When I create it with `--src ee.go` instead, it runs successfully, but I cannot get the source with the two commands above; `fission pkg getsrc` only returns unreadable output.
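
Before looking at the cluster, it can help to rule out the handler itself. The following is a minimal local sketch, assuming the handler shown above; `callLocally` is a hypothetical helper introduced here for illustration and is not part of Fission.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Handler mirrors the function from the comment above.
func Handler(w http.ResponseWriter, r *http.Request) {
	msg := "Hello, world!\n"
	w.Write([]byte(msg))
}

// callLocally (hypothetical helper) invokes a handler in-process,
// without any cluster, and returns the status code and body.
func callLocally(h http.HandlerFunc) (int, string) {
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	rec := httptest.NewRecorder()
	h(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := callLocally(Handler)
	fmt.Printf("%d %q\n", code, body) // prints: 200 "Hello, world!\n"
}
```

If this prints the expected response, the timeout is more likely to come from the package build or the runtime pods than from the function code.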
soharab-ic commented 4 weeks ago

Sure @cyxr001, will update the document https://fission.io/docs/installation/uninstallation/

cyxr001 commented 3 weeks ago

@soharab-ic OK, pkg getsrc works for getting the .zip file. Thanks. The new env YAML file can also build the new env.

fission-version

√ fission is up-to-date


How can I debug the function request timeout problem?
soharab-ic commented 3 weeks ago

Debugging steps:

cyxr001 commented 3 weeks ago

@soharab-ic

$ fission fn create --name helloworldgo --env go2 --src hello.go --entrypoint Handler
Package 'helloworldgo-7fa3eff4-8d98-4815-a981-11b6c1159490' created
function 'helloworldgo' created

$ fission fn test --name helloworldgo
error executing HTTP request: Get "http://127.0.0.1:29363/fission-function/helloworldgo": function request timeout (60000000000)s exceeded

It still fails with `function request timeout`, and the executor has a new log entry identical to the one above.
Then I ran `kubectl get pods` in the default namespace. There is only one pod, and its name does not start with `poolmgr`:

go2-7701179-74d9d5f86d-xgdmp 2/2 Running 0 13m

and the kubectl logs output for the go2 pod:

{"level":"info","ts":"2024-10-30T06:17:37.783Z","logger":"builder","caller":"builder/builder.go:97","msg":"build request complete","elapsed_time":9.033195914}
{"level":"error","ts":"2024-10-30T06:17:37.931Z","logger":"builder","caller":"builder/builder.go:89","msg":"method not allowed","http_method":"DELETE","stacktrace":"github.com/fission/fission/pkg/builder.(*Builder).Handler\n\t/go/src/github.com/fission/fission/pkg/builder/builder.go:89\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2042\nnet/http.(*ServeMux).ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2417\nnet/http.serverHandler.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2843\nnet/http.(*conn).serve\n\t/usr/local/go/src/net/http/server.go:1925"}


I cannot find any poolmgr pod in the default or fission namespace, except for the environment I created the day before yesterday (it was created the same way as go2 above, but the go2 pod name does not start with poolmgr and there is only one pod).
soharab-ic commented 3 weeks ago

@cyxr001 I am not able to reproduce the issue. For me, the commands you mentioned work fine. Here, pods starting with go2-* are builder pods for the go2 environment, and poolmgr-go2-* pods are runtime pods.

$ kubectl get po
NAME                                                         READY   STATUS    RESTARTS         AGE
go2-7996531-576f4c6b7d-4cvnd                                 2/2     Running   0                3m21s
poolmgr-go2-default-7996531-64557ccb48-276l7                 2/2     Running   0                2m35s
poolmgr-go2-default-7996531-64557ccb48-574xl                 2/2     Running   0                3m21s
poolmgr-go2-default-7996531-64557ccb48-hmwxk                 2/2     Running   0                3m21s

$ fission fn test --name helloworldgo
Hello, world!
cyxr001 commented 3 weeks ago

@soharab-ic OK. On my side I cannot find the poolmgr-go2-* pods. When I run fission fn test --name helloworldgo, the router log shows:

2024/10/30 07:06:38 [DEBUG] POST http://executor.fission/v2/getServiceForFunction
2024/10/30 07:07:38 [ERR] POST http://executor.fission/v2/getServiceForFunction request failed: Post "http://executor.fission/v2/getServiceForFunction": function service entry timeout (60.000000)s exceeded
{"level":"error","ts":"2024-10-30T07:07:38.649Z","logger":"triggerset.http_trigger_set.helloworldgo","caller":"router/functionHandler.go:653","msg":"error from GetServiceForFunction","trace_id":"ef577b1069811d57e34e7c1a81fb4e8f","error":"error posting to getting service for function: POST http://executor.fission/v2/getServiceForFunction giving up after 1 attempt(s): context deadline exceeded","errorVerbose":"POST http://executor.fission/v2/getServiceForFunction giving up after 1 attempt(s): context deadline exceeded\nerror posting to getting service for function\ngithub.com/fission/fission/pkg/executor/client.(*client).GetServiceForFunction\n\tpkg/executor/client/client.go:94\ngithub.com/fission/fission/pkg/router.functionHandler.getServiceEntryFromExecutor\n\tpkg/router/functionHandler.go:650\ngithub.com/fission/fission/pkg/router.functionHandler.getServiceEntry\n\tpkg/router/functionHandler.go:674\ngithub.com/fission/fission/pkg/router.(*RetryingRoundTripper).RoundTrip\n\tpkg/router/functionHandler.go:222\nnet/http/httputil.(*ReverseProxy).ServeHTTP\n\t/opt/hostedtoolcache/go/1.23.1/x64/src/net/http/httputil/reverseproxy.go:481\ngithub.com/fission/fission/pkg/router.functionHandler.handler\n\tpkg/router/functionHandler.go:503\nnet/http.HandlerFunc.ServeHTTP\n\t/opt/hostedtoolcache/go/1.23.1/x64/src/net/http/server.go:2220\ngithub.com/fission/fission/pkg/utils/metrics.HTTPMetricMiddleware.func1\n\tpkg/utils/metrics/http_metrics.go:99\nnet/http.HandlerFunc.ServeHTTP\n\t/opt/hostedtoolcache/go/1.23.1/x64/src/net/http/server.go:2220\ngithub.com/gorilla/mux.(*Router).ServeHTTP\n\t/home/runner/go/pkg/mod/github.com/gorilla/mux@v1.8.1/mux.go:212\ngithub.com/fission/fission/pkg/router.(*mutableRouter).ServeHTTP\n\tpkg/router/mutablemux.go:52\ngo.opentelemetry.io/contrib/instrumentation/net/http/otelhttp.(*middleware).serveHTTP\n\t/home/runner/go/pkg/mod/go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp@v0.55.0/handler.go:177\ngo.opentelemetry.io/contrib/instrumentation/net/http/otelhttp.NewMiddleware.func1.1\n\t/home/runner/go/pkg/mod/go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp@v0.55.0/handler.go:65\nnet/http.HandlerFunc.ServeHTTP\n\t/opt/hostedtoolcache/go/1.23.1/x64/src/net/http/server.go:2220\nnet/http.serverHandler.ServeHTTP\n\t/opt/hostedtoolcache/go/1.23.1/x64/src/net/http/server.go:3210\nnet/http.(*conn).serve\n\t/opt/hostedtoolcache/go/1.23.1/x64/src/net/http/server.go:2092\nruntime.goexit\n\t/opt/hostedtoolcache/go/1.23.1/x64/src/runtime/asm_amd64.s:1700","error_message":"error posting to getting service for function: POST http://executor.fission/v2/getServiceForFunction giving up after 1 attempt(s): context deadline exceeded","function":{"namespace":"default","name":"helloworldgo"},"status_code":500,"stacktrace":"github.com/fission/fission/pkg/router.functionHandler.getServiceEntryFromExecutor\n\tpkg/router/functionHandler.go:653\ngithub.com/fission/fission/pkg/router.functionHandler.getServiceEntry\n\tpkg/router/functionHandler.go:674\ngithub.com/fission/fission/pkg/router.(*RetryingRoundTripper).RoundTrip\n\tpkg/router/functionHandler.go:222\nnet/http/httputil.(*ReverseProxy).ServeHTTP\n\t/opt/hostedtoolcache/go/1.23.1/x64/src/net/http/httputil/reverseproxy.go:481\ngithub.com/fission/fission/pkg/router.functionHandler.handler\n\tpkg/router/functionHandler.go:503\nnet/http.HandlerFunc.ServeHTTP\n\t/opt/hostedtoolcache/go/1.23.1/x64/src/net/http/server.go:2220\ngithub.com/fission/fission/pkg/utils/metrics.HTTPMetricMiddleware.func1\n\tpkg/utils/metrics/http_metrics.go:99\nnet/http.HandlerFunc.ServeHTTP\n\t/opt/hostedtoolcache/go/1.23.1/x64/src/net/http/server.go:2220\ngithub.com/gorilla/mux.(*Router).ServeHTTP\n\t/home/runner/go/pkg/mod/github.com/gorilla/mux@v1.8.1/mux.go:212\ngithub.com/fission/fission/pkg/router.(*mutableRouter).ServeHTTP\n\tpkg/router/mutablemux.go:52\ngo.opentelemetry.io/contrib/instrumentation/net/http/otelhttp.(*middleware).serveHTTP\n\t/home/runner/go/pkg/mod/go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp@v0.55.0/handler.go:177\ngo.opentelemetry.io/contrib/instrumentation/net/http/otelhttp.NewMiddleware.func1.1\n\t/home/runner/go/pkg/mod/go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp@v0.55.0/handler.go:65\nnet/http.HandlerFunc.ServeHTTP\n\t/opt/hostedtoolcache/go/1.23.1/x64/src/net/http/server.go:2220\nnet/http.serverHandler.ServeHTTP\n\t/opt/hostedtoolcache/go/1.23.1/x64/src/net/http/server.go:3210\nnet/http.(*conn).serve\n\t/opt/hostedtoolcache/go/1.23.1/x64/src/net/http/server.go:2092"}
{"level":"error","ts":"2024-10-30T07:07:38.649Z","logger":"triggerset.http_trigger_set.helloworldgo","caller":"router/functionHandler.go:743","msg":"error sending request to function","trace_id":"ef577b1069811d57e34e7c1a81fb4e8f","error":" - error posting to getting service for function: POST http://executor.fission/v2/getServiceForFunction giving up after 1 attempt(s): context deadline exceeded","function":{"namespace":"default","name":"helloworldgo"},"status":"Internal Server Error","code":500,"stacktrace":"github.com/fission/fission/pkg/router.functionHandler.handler.functionHandler.getProxyErrorHandler.func4\n\tpkg/router/functionHandler.go:743\nnet/http/httputil.(*ReverseProxy).ServeHTTP\n\t/opt/hostedtoolcache/go/1.23.1/x64/src/net/http/httputil/reverseproxy.go:486\ngithub.com/fission/fission/pkg/router.functionHandler.handler\n\tpkg/router/functionHandler.go:503\nnet/http.HandlerFunc.ServeHTTP\n\t/opt/hostedtoolcache/go/1.23.1/x64/src/net/http/server.go:2220\ngithub.com/fission/fission/pkg/utils/metrics.HTTPMetricMiddleware.func1\n\tpkg/utils/metrics/http_metrics.go:99\nnet/http.HandlerFunc.ServeHTTP\n\t/opt/hostedtoolcache/go/1.23.1/x64/src/net/http/server.go:2220\ngithub.com/gorilla/mux.(*Router).ServeHTTP\n\t/home/runner/go/pkg/mod/github.com/gorilla/mux@v1.8.1/mux.go:212\ngithub.com/fission/fission/pkg/router.(*mutableRouter).ServeHTTP\n\tpkg/router/mutablemux.go:52\ngo.opentelemetry.io/contrib/instrumentation/net/http/otelhttp.(*middleware).serveHTTP\n\t/home/runner/go/pkg/mod/go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp@v0.55.0/handler.go:177\ngo.opentelemetry.io/contrib/instrumentation/net/http/otelhttp.NewMiddleware.func1.1\n\t/home/runner/go/pkg/mod/go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp@v0.55.0/handler.go:65\nnet/http.HandlerFunc.ServeHTTP\n\t/opt/hostedtoolcache/go/1.23.1/x64/src/net/http/server.go:2220\nnet/http.serverHandler.ServeHTTP\n\t/opt/hostedtoolcache/go/1.23.1/x64/src/net/http/server.go:3210\nnet/http.(*conn).serve\n\t/opt/hostedtoolcache/go/1.23.1/x64/src/net/http/server.go:2092"}

This is the executor log:

{"level":"info","ts":"2024-10-30T07:06:38.650Z","logger":"generic_pool_manager.function_service_cache","caller":"fscache/functionServiceCache.go:214","msg":"Not found in Cache"}

Everything was fine the day before yesterday: when I created an env and a function, I could also see the poolmgr-* pods.

Yesterday, after I tried to use KEDA with a Redis list and the podspec build failed, nothing works anymore.

Is there any other information I can provide to help you debug?

I will try to uninstall and reinstall fission again.

cyxr001 commented 3 weeks ago

@soharab-ic
I have reinstalled fission, and using the mq now works without problems. But I have some questions about it.

The redis list consumer does not process messages concurrently: https://github.com/fission/keda-connectors/blob/main/redis-http-connector/main.go#L48 But the rabbitmq one does: https://github.com/fission/keda-connectors/blob/main/rabbitmq-http-connector/main.go#L63

soharab-ic commented 3 weeks ago

@cyxr001

cyxr001 commented 3 weeks ago

@soharab-ic

By the way, which command did you use to get this log file information?

logger-fm7bb fluentbit [2024/11/04 06:29:01] [ info] [input:tail:tail.0] inotify_fs_add(): inode=20670721 watch_fd=1 name=/var/log/fission/poolmgr-go-default-8210283-d68c9cbc-sfx8t_default_fetcher-ac59ab0b731fcf4744c13b06e08a5f730cd08db17b89397d370cd8dc7baf04a3.log                                                                                                                                                  
logger-fm7bb fluentbit [2024/11/04 06:29:01] [ info] [input:tail:tail.0] inotify_fs_add(): inode=20670405 watch_fd=2 name=/var/log/fission/poolmgr-go-default-8210283-d68c9cbc-sfx8t_default_go-d055cf466adb91b3627ea361ebcdda3f1ae7fc6aa7bd56987aaeb6941b513b9f.log

From `fission fn logs --dbtype='influxdb'`, I can only get logs like:

[2024-11-04 08:40:03.049190939 +0000 UTC] 2024-11-04T16:39:59.543952438+08:00 stderr F 2024/11/04 08:39:59 abc1

Every message is marked stderr, even though I log to stdout.

soharab-ic commented 3 weeks ago

$ fission fn test --name hello
Hello, world!
$ fission fn logs --name hello --detail

=== Function=hello Environment=go Namespace=default Pod=poolmgr-go-default-8210283-d68c9cbc-m5zgb Container=go Node=kind-worker
2024/11/04 10:53:34 listening on 8888 ...
2024/11/04 11:06:30 specializing ...
2024/11/04 11:06:30 loading plugin from /userfunc/deployarchive/hello-d6eb3b96-94ba-4b8f-915d-371dcd64b6ae-e8i7qj-fl5kpp
2024/11/04 11:06:30 done

$ fission fn test --name hello
Hello, world!
$ fission fn logs --name hello --detail

=== Function=hello Environment=go Namespace=default Pod=poolmgr-go-default-8210283-d68c9cbc-58n5j Container=go Node=kind-worker2
2024/11/04 10:58:47 listening on 8888 ...
2024/11/04 11:06:36 specializing ...
2024/11/04 11:06:36 loading plugin from /userfunc/deployarchive/hello-d6eb3b96-94ba-4b8f-915d-371dcd64b6ae-e8i7qj-fl5kpp
2024/11/04 11:06:36 done

- `BLPop` retrieves only one message at a time.
- Logger is deployed as a daemonset. These logs belong to the logger pods.

logger-fm7bb fluentbit [2024/11/04 06:29:01] [ info] [input:tail:tail.0] inotify_fs_add(): inode=20670721 watch_fd=1 name=/var/log/fission/poolmgr-go-default-8210283-d68c9cbc-sfx8t_default_fetcher-ac59ab0b731fcf4744c13b06e08a5f730cd08db17b89397d370cd8dc7baf04a3.log
logger-fm7bb fluentbit [2024/11/04 06:29:01] [ info] [input:tail:tail.0] inotify_fs_add(): inode=20670405 watch_fd=2 name=/var/log/fission/poolmgr-go-default-8210283-d68c9cbc-sfx8t_default_go-d055cf466adb91b3627ea361ebcdda3f1ae7fc6aa7bd56987aaeb6941b513b9f.log

cyxr001 commented 3 weeks ago

@soharab-ic OK. I need to trigger the same function many times in an instant, generating a large number of executions. But their logs are all mixed together and cannot be distinguished. Do you have any ideas?

package main

import (
	"log"
	"net/http"
)

// Handler is the entry point for this fission function
func Handler(w http.ResponseWriter, r *http.Request) {
	msg := "Hello, world!\n"
	for i := 0; i < 10; i++ {
		log.Print("abc", i)
	}

	w.Write([]byte(msg))
}
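
Until executions can be told apart by the platform, one workaround is to tag every line with an ID chosen inside the function. The following is a minimal sketch of that idea, not a Fission feature: `nextID` and the `exec-N` scheme are hypothetical, and the counter is only unique within one process, so lines from different pods would still need the pod name to be distinguished.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
	"os"
	"sync/atomic"
)

// counter is a process-local sequence; it is NOT unique across pods.
var counter uint64

// nextID (hypothetical scheme) returns a new per-execution ID.
func nextID() string {
	return fmt.Sprintf("exec-%d", atomic.AddUint64(&counter, 1))
}

// Handler gives each invocation its own logger, so every line written
// during one execution carries the same "[exec-N]" prefix.
func Handler(w http.ResponseWriter, r *http.Request) {
	logger := log.New(os.Stderr, "["+nextID()+"] ", log.LstdFlags)
	for i := 0; i < 10; i++ {
		logger.Print("abc", i)
	}
	w.Write([]byte("Hello, world!\n"))
}

func main() {
	// Local smoke test only; in Fission the runtime calls Handler.
	rec := httptest.NewRecorder()
	Handler(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	fmt.Print(rec.Body.String())
}
```

With the prefix in place, interleaved lines from concurrent executions in the same pod can be grouped by filtering on the ID.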

soharab-ic commented 3 weeks ago

@cyxr001

cyxr001 commented 1 week ago

@soharab-ic OK, thanks for your help. I think it is quite normal for users to want to use the logger to trace every execution. If executions cannot be distinguished, they are hard to debug. The async queue trigger has the same problem. Is it possible to support this?

If not, this issue can be closed for now.