grycap / oscar

Open Source Serverless Computing for Data-Processing Applications
https://oscar.grycap.net/
Apache License 2.0

Project setup guide | Documentation #101

Closed rajitha1998 closed 2 years ago

rajitha1998 commented 3 years ago

Description

Hello, Grycap community! I have been trying to set up this project on my computer. However, the documentation found here only explains the OSCAR architecture and how to get started using the OSCAR framework. It would be nice to have a guide to the project structure from a developer's perspective for contributing to this project. Is there anything I can refer to in order to understand the project structure and code? Thanks :)

srisco commented 3 years ago

Hello @rajitha1998, thanks for the suggestion, we will add a section with some basic information to contribute to OSCAR's development.

For the moment, I can tell you that the backend is a Go project whose main package is in charge of reading the configuration and initialising the server. In https://pkg.go.dev/github.com/grycap/oscar#section-directories you can find some information about the different functions and structs of the packages under the pkg directory.

Moreover, the web interface is a Vue.js project whose source is in the ui directory. If you need more information about it, please leave another comment mentioning @dianamariand92.

If you are interested in collaborating in the development of a new component or you have found a bug, please let us know so we can help you.

Cheers.

rajitha1998 commented 3 years ago


Thanks :)

rajitha1998 commented 3 years ago

In the OSCAR framework, are the functions purely managed by the OpenFaaS framework?

ofv1 "github.com/openfaas/faas-netes/pkg/apis/openfaas/v1"
ofclientset "github.com/openfaas/faas-netes/pkg/client/clientset/versioned"

I understand that here you have imported some parts relevant to OpenFaaS but not the entire package. Is there a reason for that? I would like to know more about how the backends are connected with the Kubernetes cluster. @srisco

rajitha1998 commented 3 years ago

Do we need to deploy OpenFaaS separately on the Kubernetes cluster if we want to invoke a service synchronously, @srisco? I'm still finding it hard to set up a function 😟.

srisco commented 3 years ago

Hi @rajitha1998, sorry for the late reply.

OSCAR currently supports OpenFaaS as its serverless backend. By default, OSCAR stores the services as PodSpecs directly using the Kubernetes API, but if OpenFaaS is installed and configured in the OSCAR server, the services are defined as OpenFaaS functions; that's why its clientset is imported in our backend.

About synchronous invocations: if OpenFaaS is installed in the cluster and its parameters are configured in OSCAR (SERVERLESS_BACKEND, OPENFAAS_PORT and OPENFAAS_NAMESPACE), the /run/<SERVICE_NAME> path will be available, which redirects the requests to the OpenFaaS gateway. However, if OpenFaaS is not available, you can process services asynchronously (as K8s jobs) using the /job/<SERVICE_NAME> path and check the results from /system/logs/<SERVICE_NAME>. More info about this is available in the API Documentation.
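The two invocation paths can be sketched in Go against a mock server. This is only an illustration of the request shapes: the real OSCAR API requires authentication, and the handler behaviour, response bodies and event payload below are invented for the example.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// newMockOSCAR builds a tiny stand-in for the OSCAR API with the two
// paths discussed above. The responses are illustrative only.
func newMockOSCAR() *httptest.Server {
	mux := http.NewServeMux()
	// /run/<SERVICE_NAME>: synchronous, proxied to the OpenFaaS gateway,
	// so the caller gets the function's output in the response body.
	mux.HandleFunc("/run/", func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, "sync result")
	})
	// /job/<SERVICE_NAME>: asynchronous, accepted immediately while a
	// Kubernetes job does the processing in the background.
	mux.HandleFunc("/job/", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusCreated)
	})
	return httptest.NewServer(mux)
}

// Invoke posts an event to the given path and returns status code and body.
func Invoke(base, path, event string) (int, string) {
	resp, err := http.Post(base+path, "application/json", strings.NewReader(event))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	body, _ := io.ReadAll(resp.Body)
	return resp.StatusCode, string(body)
}

func main() {
	srv := newMockOSCAR()
	defer srv.Close()
	code, body := Invoke(srv.URL, "/run/plants", `{"file":"leaf.jpg"}`)
	fmt.Println("sync:", code, body)
	code, _ = Invoke(srv.URL, "/job/plants", `{"file":"leaf.jpg"}`)
	fmt.Println("async:", code)
}
```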

I hope this clarifies some of your doubts, I will try to answer faster in the future.

rajitha1998 commented 3 years ago

Thanks, yes this helps a lot :)

rajitha1998 commented 3 years ago

Hello @srisco, I ran into this issue while deploying this into a GKE cluster. Any idea on how to resolve this: https://stackoverflow.com/questions/66410514/creating-a-kubernetes-ingress-pointing-two-services ?

rajitha1998 commented 3 years ago


Got this problem solved :)

rajitha1998 commented 3 years ago

I successfully deployed openfaas, minio, and oscar. But I get this error when creating a service with openfaas. Creating minio storage buckets works perfectly fine :)

@srisco any idea about this?

Screenshot from 2021-03-01 11-19-05

srisco commented 3 years ago

Yes, you should deploy openfaas enabling the operator (--set operator.create=true).

rajitha1998 commented 3 years ago


Thanks :). After adding that flag it sent a request but returned "The OpenFaaS Operator is not creating the service deployment". Maybe something is wrong with my cluster.

For further clarification, is there a specific OpenFaaS version that I should stick with? Also, if you don't mind, could you share the command you use to install OpenFaaS with the relevant flags, just to make sure I'm setting this up the right way, @srisco?

Really sorry for troubling you so much.

rajitha1998 commented 3 years ago

This is the entire process that I have been following for now.

or

helm upgrade --install openfaas openfaas/openfaas --namespace openfaas --values /tmp/charts/openfaas/values.yaml --set gateway.directFunctions=false --set openfaasImagePullPolicy=IfNotPresent --set queueWorker.maxInflight=1 --set basic_auth=true --set serviceType=LoadBalancer --set queueWorker.replicas=1 --set clusterRole=false --set operator.create=true --set faasnetes.imagePullPolicy=Always --set basicAuthPlugin.replicas=1 --set gateway.replicas=1 --set ingressOperator.create=false

helm install --namespace=oscar oscar oscar --set authPass=password --set service.type=ClusterIP --set createIngress=true --set volume.storageClassName=nfs --set minIO.endpoint=https://xxx.xxx.com --set minIO.TLSVerify=false --set minIO.accessKey=minio --set minIO.secretKey=password --set serverlessBackend=openfaas

Updated Ingress:

{{- if .Values.createIngress }}
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: oscar
  annotations:
    # nginx.ingress.kubernetes.io/rewrite-target: /
    kubernetes.io/ingress.class: nginx
spec:
  rules:
  - http:
      paths:
      - backend:
          serviceName: oscar
          servicePort: 8080
  - host: xxx.xxx.com
    http:
      paths:
      - backend:
          serviceName: minio
          servicePort: 9000
{{- end }}

@srisco Once I get this correctly deployed, I would like to document the process, as it will help the community and the grycap organization as well :)

srisco commented 3 years ago

Thank you very much for the effort.

We usually deploy openfaas through our ansible role.

Maybe something has changed in a recent version of openfaas; however, I think the error you get from the API is related to RBAC policies, due to OSCAR being deployed before the openfaas resource CRDs were created. Please try to remove the OSCAR chart and reinstall it after openfaas to check if you still get the error. If that doesn't fix it, please let me know which version of openfaas you are using so I can reproduce it and look for a solution.

rajitha1998 commented 3 years ago


Thank you for your response. I'll try this and give you an update :)

rajitha1998 commented 3 years ago

Update: I created a fresh cluster and installed openfaas first; oscar was the last thing I deployed, but I'm facing the same issue @srisco. The Ansible role you shared looks a bit unfamiliar to me; since there are no instructions, I'm not able to follow it.

If I create an OpenFaaS function through the OpenFaaS UI, should it show up in the OSCAR services?

rajitha1998 commented 3 years ago

I added debug logs in the Go package.

In the openfaas.go file, it seems the flow does not enter this for loop @srisco:

for event := range ch {
    fmt.Println("Executed 1") // NO
    deploy, ok := event.Object.(*appsv1.Deployment)
    fmt.Println("hujja deploy", deploy)
    if ok {
        if event.Type == watch.Added && deploy.Name == service.Name {
            deploymentCreated = true
            break
        }
    }
}
srisco commented 3 years ago

It is normal not to enter the loop, as no events are received on the channel.

However, I have already found the error. I didn't realise that in the openfaas deployment you are not setting the namespace of the functions, which should be the same as the one used in the OSCAR services. Sorry for not having this documented, as we mainly deploy clusters with our ansible roles.

Setting the OpenFaaS functions namespace to oscar-svc should solve your problem (in the openfaas chart, --set functionNamespace=oscar-svc). I seem to remember that you need to create the namespace before deploying openfaas.

rajitha1998 commented 3 years ago


Thanks a lot @srisco for looking into this. I will do a new deployment and let you know an update πŸ‘πŸ».

rajitha1998 commented 3 years ago

Yes, that worked πŸŽ‰, Functions are now getting deployed :) Thank you so much @srisco.

I was testing the plant classification example function. When I upload an image it creates a Kubernetes job; however, the job keeps getting stuck in the waiting state. I suspect this is something related to the FaaS supervisor.

helm repo add kvaps https://kvaps.github.io/charts
helm install nfs-server-provisioner kvaps/nfs-server-provisioner

This is how I installed it, so it's in the default namespace. I'll try changing its namespace to oscar or oscar-svc. Do you have any suggestions regarding this, @srisco?

Screenshot from 2021-03-04 10-29-25

Screenshot from 2021-03-04 10-30-03

Screenshot from 2021-03-04 10-30-38 Screenshot from 2021-03-04 10-31-04

rajitha1998 commented 3 years ago

Do we need to install the worker separately? I see Kubernetes jobs getting created but don't understand from where and how. Maybe the containers are waiting because the worker is missing? @srisco

srisco commented 3 years ago

The OSCAR worker is deprecated and not needed in the new version. It seems your cluster doesn't have enough resources to run the job's container. Could you provide more information about your cluster, please? Instance type used, number of nodes, etc. Remember that the Kubernetes scheduler takes the job's resource limits into account.

EDIT: the nfs-server-provisioner seems to work fine, as the populate-volume-job also makes use of the OSCAR PVC and it finished with no problems.

rajitha1998 commented 3 years ago

The cluster had only a single node. I have attached some images related to my cluster configuration. I will retry with a new cluster with more resources πŸ‘πŸ». Thank you very much for your prompt response @srisco. Screenshot from 2021-03-04 20-11-59 Screenshot from 2021-03-04 20-13-01

srisco commented 3 years ago

There is no need to use very large instances for testing purposes. Running kubectl describe pod -n oscar-svc <NAME_OF_JOB'S_POD> should show the reason why the job remains in pending status (I'm not very familiar with the GKE interface).

rajitha1998 commented 3 years ago

Cool, I'll try some more and give you an update ✌️ @srisco

rajitha1998 commented 3 years ago

After making some changes in the cluster configuration (changing the OS from COS to Ubuntu), the function is now getting triggered :). But here we go with another issue. This is from the FaaS supervisor; it seems it has no permission to access the MinIO server. But in the OSCAR UI I can access the buckets (remove/add) perfectly.

Screenshot from 2021-03-05 02-40-00

Packages used in the FaaS supervisor:

boto3
requests
setuptools>=40.8.0
wheel
pyyaml

Related links: https://github.com/nteract/papermill/issues/413 (since boto3 is used in the packages) and https://stackoverflow.com/questions/36144757/aws-cli-s3-a-client-error-403-occurred-when-calling-the-headobject-operation

My cluster is located in us-central1. Even though I tried to change it, it always has the value us-east-1.

Screenshot from 2021-03-05 02-54-49 Screenshot from 2021-03-05 02-54-02

Since the configuring part is almost done, I'll start with the documentation @srisco.

rajitha1998 commented 3 years ago

The medium article: https://blog.usejournal.com/how-to-configure-oscar-in-the-google-kubernetes-engine-a-step-by-step-guide-9bf612b3fb17 @srisco.

srisco commented 3 years ago

It seems to be a problem with the MinIO API exposed on a different host through ingress, the weird thing is that the OSCAR backend does communicate well to create the buckets and enable notifications, as well as the outside access from OSCAR-ui. In the past I had problems publishing its API behind an ingress path, that's why we currently publish it directly on a nodePort, but I had never tried using a different DNS name for the host.

To fix it, I can think of configuring OSCAR as in the kind instructions: setting the MinIO endpoint field to the internal DNS name within the Kubernetes cluster (--set minIO.endpoint=http://minio.default:9000). With that, the jobs should run, although the MinIO section within OSCAR-ui wouldn't work and you would have to use the MinIO web interface.

Thank you very much for the article, I really appreciate your effort in testing OSCAR and contributing to the project.

rajitha1998 commented 3 years ago

Thank you @srisco. I'll make the changes as you suggested and do a round of testing. Will give an update soon :)

rajitha1998 commented 3 years ago

Yes, that worked. Thanks a lot, @srisco :)

In the logs, it shows that the image got downloaded. However, in my Linux environment (Ubuntu 20.04) the FaaS supervisor is throwing an error for the shell script. It seems to be a bug, from what I understand from the Stack Overflow questions.

Screenshot from 2021-03-10 14-41-21 Screenshot from 2021-03-11 08-41-38 Screenshot from 2021-03-11 08-42-08

Related question: https://stackoverflow.com/questions/57249913/running-bash-command-with-redirection-in-python @srisco

rajitha1998 commented 3 years ago

From my understanding, I need to create a binary file from the faas-supervisor project in order to test this change. Is there a specific procedure that we need to follow when creating binaries, @srisco?

rajitha1998 commented 3 years ago

This file had everything I needed to create the binary file :). After changing to bash, the function execution works as expected @srisco πŸ‘πŸ». If you want I can create a PR for it :) (though I believe this issue was only in my environment).

I have a small theoretical question that I still couldn't figure out. Would you mind helping me understand it? Once we deploy the function, it gets created as a Deployment in Kubernetes. When we trigger the function, a new Kubernetes Job gets created with an ID. I don't understand where this Job gets created and where the ID is assigned. It would be a great help if you could give a small explanation :)

srisco commented 3 years ago

Hi @rajitha1998,

Are you using the script we provide in the plants example? It is well tested and we haven't had that issue; however, I'm going to check it using an Ubuntu 20.04 AMI. Nevertheless, the script is executed in the container, so the OS of the host machine shouldn't matter. We use sh because it is generally more available in lightweight container images.

About your question: OpenFaaS stores functions as Kubernetes deployments, when OSCAR receives an async invocation (/job/SERVICE_NAME) it reads the service definition from its configMap:

https://github.com/grycap/oscar/blob/e63063d07b7143c15bcb0ecae0a0b3fa85cfb694/pkg/handlers/job.go#L49

https://github.com/grycap/oscar/blob/e63063d07b7143c15bcb0ecae0a0b3fa85cfb694/pkg/backends/openfaas.go#L216

And creates the job:

https://github.com/grycap/oscar/blob/e63063d07b7143c15bcb0ecae0a0b3fa85cfb694/pkg/handlers/job.go#L89

OpenFaaS functions are only used to process sync invocations (/run/SERVICE_NAME), redirecting events to its gateway (which in turn redirects them to the k8s service associated with the deployment 🀯). Indeed, the event-driven creation of Kubernetes jobs for file processing (thanks to the MinIO integration) is the main advantage of using OSCAR, as jobs are more appropriate for long-running tasks.
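To make the ID part of this concrete, here is a hedged sketch of how a unique Job name can be derived per invocation, so concurrent triggers of the same service never collide. The naming scheme below (service name plus a random hex suffix) is only an illustration; OSCAR's actual logic lives in pkg/handlers/job.go and may differ.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// newJobName derives a unique Kubernetes Job name for one async invocation
// of a service: the service name plus a random 8-character hex ID. This is
// an illustrative scheme, not necessarily the one OSCAR uses.
func newJobName(service string) string {
	b := make([]byte, 4)
	if _, err := rand.Read(b); err != nil {
		panic(err) // crypto/rand should not fail on a healthy system
	}
	return service + "-" + hex.EncodeToString(b)
}

func main() {
	// Each trigger of the hypothetical "plants" service would get its own
	// Job object (e.g. something like plants-a1b2c3d4) submitted to the
	// Kubernetes batch API.
	fmt.Println(newJobName("plants"))
}
```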

rajitha1998 commented 3 years ago


Yes, I was using that same script via the raw GitHub URL, as you mentioned above. One thing to mention is that due to some circumstances I switched to my local environment (Minikube "none" driver on Ubuntu 20.04). I was surprised by that error too. Thank you very much for the detailed explanation @srisco, now I clearly understand the process of creating Kubernetes Jobs :)

srisco commented 2 years ago

Closing this. Feel free to re-open if you have any other question.