openfaas / faas-cli

Official CLI for OpenFaaS
https://www.openfaas.com/

Failing to use downloaded templates in Google Cloud Build #804

Closed Anderssorby closed 4 years ago

Anderssorby commented 4 years ago

I'm trying to build my function with a custom template on Google Cloud Build, using this config based on https://www.openfaas.com/blog/openfaas-cloudrun/

- name: 'gcr.io/$PROJECT_ID/faas-cli:0.12.3'
  dir: 'openfaas'
  args: ['faas-cli', 'template', 'pull', 'https://github.com/flexibility-org/python-flask-template'] # My custom template
- name: 'gcr.io/$PROJECT_ID/faas-cli:0.12.3'
  dir: 'openfaas'
  args: ['faas-cli', 'build', '-f', './ml-functions.yaml', '--shrinkwrap']

## Build Docker image
- name: 'gcr.io/cloud-builders/docker'
  args: ['build', '-t', 'gcr.io/$PROJECT_ID/ml-functions:$REVISION_ID', '-t', 'gcr.io/$PROJECT_ID/ml-functions:latest', '-f' ,'./build/ml-functions/Dockerfile', './build/ml-functions/']

- name: 'gcr.io/$PROJECT_ID/faas-cli:0.12.3'
  dir: 'openfaas'
  args: ['faas-cli', 'deploy', '-f', './ml-functions.yaml']

images:
- 'gcr.io/$PROJECT_ID/ml-functions'

Expected Behaviour

The function should be built, pushed to GCR and published on my Kubernetes cluster.

Current Behaviour

I get this error

Starting Step #1
Step #1: Already have image (with digest): gcr.io/.../faas-cli:0.12.3
Step #1: 2020/04/27 12:49:46 No templates found in current directory.
Step #1: 2020/04/27 12:49:46 Attempting to expand templates from https://github.com/openfaas/templates.git
Step #1: 2020/04/27 12:49:46 Fetched 19 template(s) : [csharp csharp-armhf dockerfile go go-armhf java11 java11-vert-x java8 node node-arm64 node-armhf node12 php7 python python-armhf python3 python3-armhf python3-debian ruby] from https://github.com/openfaas/templates.git
Step #1: Pulling template: python3-http-debian from configuration file: ./ml-functions.yaml
Step #1: Fetch templates from repository: https://github.com/..../python-flask-template at master
Step #1: 2020/04/27 12:49:46 Attempting to expand templates from https://github.com/flexibility-org/python-flask-template
Step #1: 2020/04/27 12:49:46 Fetched 6 template(s) : [python27-flask python3-flask python3-flask-armhf python3-http python3-http-armhf python3-http-debian] from https://github.com/flexibility-org/python-flask-template
Step #1: [0] > Building ml-functions.
Step #1: [0] < Building ml-functions done in 0.00s.
Step #1: [0] Worker done.
Step #1: 
Step #1: Total build time: 0.00s
Step #1: Errors received during build:
Step #1: - language template: python3-http-debian not supported, build a custom Dockerfile
Step #1: 
Finished Step #1
ERROR
ERROR: build step 1 "gcr.io/$PROJECT/faas-cli:0.12.3" failed: step exited with non-zero status: 1
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

ERROR: (gcloud.builds.submit) build .... completed with status "FAILURE"

Possible Solution

It seems to me that the templates are not downloaded to the current directory and are therefore not available on the local system.

Steps to Reproduce (for bugs)

  1. Create a function that uses a custom template
  2. Run gcloud builds submit with that cloudbuild.yaml

Context

I was hoping to speed up my development process by using Cloud Build to build my function (it is quite large).

Your Environment

utsavanand2 commented 4 years ago

Thanks @Anderssorby, I'll try to reproduce the issue and get it fixed ASAP.

utsavanand2 commented 4 years ago

/assign: me

alexellis commented 4 years ago

Have you been able to reproduce the issue? Any updates @utsavanand2 ?

utsavanand2 commented 4 years ago

@alexellis Hopefully soon! But not yet. I'll get on it after fixing the issue with the tests on another issue.

alexellis commented 4 years ago

@Anderssorby could you try the exact example from the blog post with the link you provided and report back? @LucasRoesler since @utsavanand2 seems to be blocked on something else, do you have any suggestions?

LucasRoesler commented 4 years ago

@Anderssorby I am not deeply familiar with Google Cloud Build, but it seems odd that all of the stages except the Docker build specify the directory. Should this

## Build Docker image
- name: 'gcr.io/cloud-builders/docker'
  args: ['build', '-t', 'gcr.io/$PROJECT_ID/ml-functions:$REVISION_ID', '-t', 'gcr.io/$PROJECT_ID/ml-functions:latest', '-f' ,'./build/ml-functions/Dockerfile', './build/ml-functions/']

be


## Build Docker image
- name: 'gcr.io/cloud-builders/docker'
  dir: 'openfaas'
  args: ['build', '-t', 'gcr.io/$PROJECT_ID/ml-functions:$REVISION_ID', '-t', 'gcr.io/$PROJECT_ID/ml-functions:latest', '-f' ,'./build/ml-functions/Dockerfile', './build/ml-functions/']

Anderssorby commented 4 years ago

I tried adding the dir: 'openfaas', but it didn't help. I don't have time to work on this at the moment however.

utsavanand2 commented 4 years ago

@Anderssorby Can you point me to the ml-functions.yaml? From the logs, it seems to me that the faas-cli build -f ./ml-functions.yaml step is the one failing. Also, is this the complete cloudbuild.yaml file you're using? These answers would help me narrow down the problem.

alexellis commented 4 years ago

@utsavanand2 please can you verify my blog post still works as expected? If it does we'll close out the issue as support, and take this via Slack.

Anderssorby commented 4 years ago

The ml-functions.yaml file:

version: 1.0
provider:
  name: openfaas
  gateway: http://***:8080
functions:
  ml-functions:
    lang: python3-http-debian
    handler: ./ml-functions
    image: gcr.io/****799914/ml-functions:latest
configuration:
  templates:
    - name: python3-http-debian
      source: https://github.com/openfaas-incubator/python-flask-template
  copy:
    - ./data
    - ./models
    - ./sklearn_pipelines

If I remember correctly I managed to do the build when not using a custom template like python3-http-debian.

utsavanand2 commented 4 years ago

I've invested a lot of time in this issue now, and will continue to do so if I get more feedback from all of you. I have a Raspberry Pi K3s cluster of my own, exposed via inlets, so I tried to build a function using the same steps as @Anderssorby but with the python3-http-armhf template. On a successful run, Cloud Build should deploy that function to the Pi cluster.

There are a few things that you need to change even if we get past the first step in the build (the --shrinkwrap one), including logging in with faas-cli using credentials stored in Google Cloud KMS. You can see how I'm decrypting the creds in this repo: https://github.com/utsavanand2/gcb

I first wanted to verify that my build configuration was valid before testing it on Cloud Build, so I tested locally with cloud-build-local. Here are the problems I ran into:

1.> Creating a new function with faas-cli new --lang python3-http-armhf hello-world created a new dir called hello-world, but cloud-build-local did not have permission to access it. I had to manually chmod -R the dir and its contents so that cloud-build-local could access it.

2.> Similarly, faas-cli build -f hello-world.yml --shrinkwrap created a build directory that didn't grant rights to other users, so cloud-build-local could not access it. I had to chmod both the build and hello-world dirs.

3.> The next problem was faas-cli login: it tried to create a .openfaas folder in the home dir, which is not accessible in cloud-build-local and apparently not in the Cloud Build env either, causing the build to fail.

Screenshot 2020-05-16 at 6 43 55 PM

4.> The local build at least got 5 steps in successfully, but doing the same in Cloud Build was no fun, because I started getting the same errors as mentioned by @Anderssorby.

Screenshot 2020-05-16 at 6 53 44 PM

Would love your feedback on it @alexellis @martindekov

alexellis commented 4 years ago

Please can you try my blog post as written and let us know if it works or not @utsavanand2? That was the main ask. Thanks

utsavanand2 commented 4 years ago

Please can you try my blog post as written and let us know if it works or not @utsavanand2?

@alexellis After changing the provider in stack.yaml from faas to openfaas, I got the same error.

Screenshot 2020-05-16 at 7 12 15 PM

alexellis commented 4 years ago

So do we think that Google made breaking changes to the way cloud build works re: access permissions?

alexellis commented 4 years ago

Or did our CLI's code change since the original post? Git blame and the post's date might help you narrow it down.

utsavanand2 commented 4 years ago

Update: using echo $$OPENFAAS_PASSWD | faas-cli login --password-stdin got rid of the error mentioned in point 3.> of my comment. We can also avoid the errors about creating a build dir if we run the build step in a specific dir, like:

- id: "Do a shrinkwrap build"
  name: 'gcr.io/$PROJECT_ID/faas-cli:0.12.4'
  entrypoint: 'sh'
  dir: 'openfaas'
  args: ['-c', 'faas-cli build -f ../hello.yml --shrinkwrap']

But as soon as that error is gone, the previous error comes back: Step #1 - "Do a shrinkwrap build": - language template: python3-http-armhf not supported, build a custom Dockerfile
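For anyone following along, the login workaround could be written as its own Cloud Build step, roughly as below. This is only a sketch: it assumes the password reaches the step as an environment variable named OPENFAAS_PASSWD through Cloud Build's secretEnv mechanism (declared in a top-level secrets: block that is not shown), and the OPENFAAS_URL value is a placeholder rather than a gateway from this thread.

```yaml
# Sketch of a login step; anything marked "assumed" is not from this thread.
- id: "Log in to the gateway"
  name: 'gcr.io/$PROJECT_ID/faas-cli:0.12.4'
  entrypoint: 'sh'
  dir: 'openfaas'
  # Assumed: OPENFAAS_PASSWD is declared under a top-level `secrets:` block.
  secretEnv: ['OPENFAAS_PASSWD']
  env:
    # Assumed placeholder; replace with your own gateway URL.
    - 'OPENFAAS_URL=https://gateway.example.com'
  # `$$` keeps Cloud Build from treating the name as a substitution,
  # so the shell expands the variable at run time.
  args: ['-c', 'echo $$OPENFAAS_PASSWD | faas-cli login --password-stdin']
```

Piping the password on stdin keeps it out of the step's argument list, which is why --password-stdin is used here instead of passing the value as a flag.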

utsavanand2 commented 4 years ago

Okay, so I got the build to run successfully! Here's my configuration. I am using a volume to persist the changes between steps:

options:
  volumes:
    - name: 'buildvol'
      path: '/openfaas'

I thought using a volume might fix the permission problems, but it was no good.

So I focused on the Dockerfile of faas-cli, and noticed a change back in January where we took away faas-cli's root rights and ran it as a normal user. I reverted that change on my fork of the faas-cli repo to restore root, built the new Docker image, and pushed it to the container registry.

Dockerfile's release stage changes:

# Release stage
FROM alpine:3.11 as release
RUN apk --no-cache add ca-certificates git
WORKDIR /root/
COPY --from=builder /go/src/github.com/openfaas/faas-cli/faas-cli               /usr/bin/
ENV PATH=$PATH:/usr/bin/
CMD ["faas-cli"]

And this fixed the failing build issue.

Screenshot 2020-05-19 at 12 41 18 PM

But I see that @Anderssorby wants to deploy the function to a GKE OpenFaaS deployment. I have the login and deployment steps in my build as well, but for some reason the function doesn't get deployed, even though the login succeeds and we don't get any errors.

Screenshot 2020-05-19 at 12 51 42 PM

alexellis commented 4 years ago

Thank you for looking into this @utsavanand2 - it looks like you’ve traced this down to the root cause. We moved to an image that runs as a non-root user, and the flow with Cloud Build doesn’t seem to support that.

The suggestion we came up with here (cc @LucasRoesler) is to release an additional Docker image via the faas-cli build. The build currently publishes the :latest tag, as used here, which runs as non-root and therefore doesn’t play well with Cloud Build. The change will also publish :latest-root, which Cloud Build users can use without any messing about.

@utsavanand2 can you submit a PR to the publishing / Travis code to push an image with another Dockerfile that derives from the openfaas/faas-cli:latest tag and simply adds a USER root?

utsavanand2 commented 4 years ago

@Anderssorby @LucasRoesler @alexellis I've been able to get Cloud Build working with faas-cli, and also ended up spending a lot on GKE in the process. I'll now put together a PR that adds a Dockerfile.root, which should be useful for CI/CD environments. If anyone wants to look at the repo and the cloudbuild.yaml configuration, they are here: https://github.com/utsavanand2/gcb

Screenshot 2020-06-04 at 5 21 56 PM

LucasRoesler commented 4 years ago

@utsavanand2 we might be able to use just a single Dockerfile if we use a build arg to set the user at the end of the Dockerfile. This would allow us to build the same image twice, the second time with --build-arg user=root

utsavanand2 commented 4 years ago

@LucasRoesler this is a really awesome idea! I'll have to check whether Dockerfile commands like WORKDIR and RUN can be made conditional, but this would really be the best solution IMHO. I would also love to know @alexellis's opinion on this.

utsavanand2 commented 4 years ago

@Anderssorby Did you get to try out the new faas-cli root image? I would recommend checking out this repo for the configuration needed in cloudbuild.yaml, as we're using ENTRYPOINT instead of CMD.

I'm assuming this fixed your issue; feel free to reopen it if the problem persists.
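To illustrate what the ENTRYPOINT change means for cloudbuild.yaml, the first two steps from the original report might be rewritten roughly like this. It is a sketch under assumptions: that the root image is published as openfaas/faas-cli:latest-root (the tag proposed earlier in this thread) and that its ENTRYPOINT is faas-cli, so args no longer start with 'faas-cli'.

```yaml
# Sketch only. Assumed: the image tag below exists and its ENTRYPOINT is faas-cli.
steps:
# Pull the custom template into ./openfaas/template
- name: 'openfaas/faas-cli:latest-root'
  dir: 'openfaas'
  args: ['template', 'pull', 'https://github.com/flexibility-org/python-flask-template']
# Generate the build context without invoking Docker
- name: 'openfaas/faas-cli:latest-root'
  dir: 'openfaas'
  args: ['build', '-f', './ml-functions.yaml', '--shrinkwrap']
```

Cloud Build passes args to the image's entrypoint when one is defined, which is why the leading 'faas-cli' element used with the older CMD-based image is dropped here.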

utsavanand2 commented 4 years ago

Derek close