Closed alexellis closed 4 years ago
Initial thought is that faas-cli build
could look for a standardised location ../common/
and pull the assets into the temporary build folder before heading into the bread and butter of build
.
I think this is a great idea. A standard folder at the project root would be simple and still pretty flexible.
Alternatively, we could promise to build the function from the project root so that the docker context is the entire project and they can then use the COPY commands in the dockerfile to control what is copied.
I agree with @rgee0 that a "pre-build" step could be used to copy a desired directory's files into the requested function's directory
Just throwing out some thoughts...
The tricky part might be what to do with the function's Dockerfile. If it's a user-built function, they could expect those files to be there and add the COPY
command, but I'd assume this should also work for templates.
There could be an empty directory with the template so that the template could put files in a function/fn1/common
directory and the Dockerfile would then just COPY ./common ./
to include the files if they're there, but wouldn't break if they weren't
Would the cli just always copy the common files from a /common/
directory? I'd think you'd want to control that to avoid sharing common files that shouldn't be shared in all the functions. Perhaps a flag on the build
command?
faas-cli build -f fn1.yml --include-common
Or, maybe a property in the function's yml?
provider:
name: faas
gateway: http://localhost:8080
functions:
tester:
lang: node
handler: ./tester
image: tester:0.1
common_files: true
Yes, the provision of a flag is a must.
On the Dockerfile question, I like the empty folder idea. We could also add a Dockerfile.common to the templates and switch which to copy based on the flag provision.
I would be against a flag. It is far simpler to implement and use if there is a single strong default (i.e. always include common files in the build context). It is easier for developers (of templates and functions) to predict and debug what is going to happen and therefore more likely to be a successful build.
The dockerfile author can easily choose to copy or ignore files from the context, we are not forcing them to copy the files in to the resulting image, so it is somewhat harmless to always include the common files in all build contexts.
The biggest downside to always including files in the context is that the context is bigger and the build slower. But I don't think we can control this.
I would say that this should be an opt-in feature, so that no backward incompatible changes are introduced. Lets say users are already having a common
folder and managing copy operations by themselfes, their build artifacts would change if this would become the default behaviour.
Opt-in feature is greate to maintain compatibility as you said, but to have many posibilities make it hard to maintain and provide support from maintainers' side. I would prefer the suggestion of @LucasRoesler
I guess there could possibly be a naming conflict when copying a folder into the build context, but I am not sure I see another incompatibility with the current build process. Maybe we can allow them to provide an alternative name?
If we are creating the build context in a tmp folder and use that tmp folder as the Docker build context with the function folder as a subfolder, this would avoid naming collisions because the folder starts empty. Of course, this would not be compatible with the current process. Roughly it would look like
build_context
|-- awesomefunc
| |-- Dockerfile
| `-- handler
`-- common
|-- __init__.py
`-- privateutil.py
Where build_context
is a tmp folder that we create and copy awesomefunc
and common
into. The build process would basically be docker build -f ./build_context/awesomefunc/Dockerfile ./build_context
. Doing this, we own the build context going forward and can avoid conflicts.
On the other hand, if we are going to build in a context that is not the function folder, then the simplest thing is to just build from the project root and let the developer manage their own files without needing to worry about files being moved or named certain things by faas-cli
.
Perhaps the most flexible thing would be to allow the developer to just specify the build context for docker? provide a path relative to the project root or null, if null then we use the function folder. This is very flexible but incredibly hard to predict for templates.
It seems to be that the trade-off is flexibility versus predictability. For a build process, I think I would prefer to see consistent and predictable builds.
We can also introduce a folder common
when a new function is created, in which we put a OPENFAAS_RESERVED.md
file to tell user how this folder works, this folder will be filled with content of common folder at root dir of yaml file, if there is one.
During build process, if we see a common
directory with OPENFAAS_RESERVED.md
in func dir and there is a common
at root, we then copy common files
For current functions having this common
dir without the MD file, nothing changes for them
Or we also just keep the actual process and just check if there is already a common directory inside the function dir, if there is, we skip copying and show a warning to user, this is much simpler, but user need to read carefully outputs of build process
function_stack
|-- fn1
| |-- common
| | `-- OPENFAAS_RESERVED.md
| `-- handler
|-- fn2
| |-- common
| | `-- OPENFAAS_RESERVED.md
| `-- handler
|-- common
| |-- __init__.py
| `-- privateutil.py
`-- stack.yml
In addition, I think it would be better to have a configuration field in yaml, to choose which sub dir in common
to be copied:
provider:
name: faas
gateway: http://localhost:8080
functions:
fn1:
lang: python3
copy: #common dir which is at the same level as yaml file
- debug
- privateutil.py
fn2:
lang: python3
copy: #common dir which is at the same level as yaml file
- . # all common dir
fn3:
lang: python3
copy: #no copy
Heading back to the flag/default discussion. The issue I have with including common
by default and without a flag is that the user is not fully in control of their actions. There is an assumption on the part of the CLI that anything in the chosen standard location is intended for inclusion in the function image. This to me doesn't feel right.
Is the only objection to this being that existing users would be surprised by seeing a common
folder suddenly show up in functions created with faas-cli new
? I think more people might be happy to see a new feature.
Derek add label: discussion
One other crazy idea I just had, we can utilize multi-stage builds to allow people to copy from a "faas-common" image. I got this from https://docs.docker.com/develop/develop-images/multistage-build/#use-an-external-image-as-a-stage
Basically, the cli can build an image "faas-commons" that is just SCRATCH plus a COPY
of the common folder, then in the function image they can
COPY --from=faas-commons /common/foo.txt /foo.txt
We could use build args to make the image name dynamic. This would look like
ARG FAAS_COMMONS_IMAGE
...
COPY --from=$FAAS_COMMONS_IMAGE /common/foo.txt /foo.txt
One interesting thing about this, we could even allow the developer to specify a Dockerfile to use for the faas-commons
image. This gives them a lot of control.
The benefits I see from this:
A couple of negatives from this:
I Really like the idea of @LucasRoesler.
For concern 2:
we can build the common image from fass-cli with tag <function-image-name>-common
and pass it as FAAS_COMMONS_IMAGE
.
So as I understand the idea propose to add the below part in their Dockerfile
to copy the common context:
# in the base image
ARG FAAS_COMMONS_IMAGE=faas-common
FROM $FAAS_COMMONS_IMAGE as common
...
# in the end stage
COPY --from=common /common /common
faas-common
can be a fixed image with blank /common
dir; so that the build doesn't fail in case FAAS_COMMONS_IMAGE
isn't provided. In case user doesn't want to include the common FAAS_COMMONS_IMAGE
wont be provided and /common
will be copied from default
Considering @rgee0 points,
I personally like the idea of providing --include-common
flag as it can allow user to quickly include/exclude common items without changing the Dockerfile
. but user need to be aware of the fact that /common
dir can be empty based on the flag.
We can add above part in templates as well, considering the above part goes into all the template file before the last stage, i.e. for go
:
FROM golang:1.8.3-alpine3.6 as builder
RUN apk --no-cache add curl \
&& echo "Pulling watchdog binary from Github." \
&& curl -sSL https://github.com/openfaas/faas/releases/download/0.7.6/fwatchdog > /usr/bin/fwatchdog \
&& chmod +x /usr/bin/fwatchdog \
&& apk del curl --no-cache
WORKDIR /go/src/handler
COPY . .
# Run a gofmt and exclude all vendored code.
RUN test -z "$(gofmt -l $(find . -type f -name '*.go' -not -path "./vendor/*" -not -path "./function/vendor/*"))" || { echo "Run \"gofmt -s -w\" on your Golang code"; exit 1; }
RUN CGO_ENABLED=0 GOOS=linux \
go build --ldflags "-s -w" -a -installsuffix cgo -o handler . && \
go test $(go list ./... | grep -v /vendor/) -cover
## *****************************************
ARG FAAS_COMMONS_IMAGE=faas-common
FROM $FAAS_COMMONS_IMAGE as common
##*******************************************
FROM alpine:3.6
RUN apk --no-cache add ca-certificates
## *****************************************
COPY --from=common /common /common
## *****************************************
# Add non root user
RUN addgroup -S app && adduser -S -g app app
RUN mkdir -p /home/app
RUN chown app /home/app
WORKDIR /home/app
COPY --from=builder /go/src/handler/handler .
COPY --from=builder /usr/bin/fwatchdog .
USER app
ENV fprocess="./handler"
CMD ["./fwatchdog"]
or for dockerfile
:
## *****************************************
ARG FAAS_COMMONS_IMAGE=faas-common
FROM $FAAS_COMMONS_IMAGE as common
##*******************************************
FROM alpine:3.6
# 1. Use any image as your base image, or "scratch"
# 2. Add fwatchdog binary via https://github.com/openfaas/faas/releases/
# 3. Then set fprocess to the process you want to invoke per request - i.e. "cat" or "my_binary"
#ADD https://github.com/openfaas/faas/releases/download/0.7.1/fwatchdog /usr/bin
#RUN chmod +x /usr/bin/fwatchdog
## *****************************************
COPY --from=common /common /common
## *****************************************
RUN apk --no-cache add curl \
&& echo "Pulling watchdog binary from Github." \
&& curl -sSL https://github.com/openfaas/faas/releases/download/0.7.6/fwatchdog > /usr/bin/fwatchdog \
&& chmod +x /usr/bin/fwatchdog \
&& apk del curl --no-cache
# Populate example here - i.e. "cat", "sha512sum" or "node index.js"
ENV fprocess="cat"
# Set to true to see request in function logs
ENV write_debug="false"
HEALTHCHECK --interval=5s CMD [ -e /tmp/.lock ] || exit 1
CMD [ "fwatchdog" ]
Personally I would like to work on this implementation if it is fixed :)
We've actually hit this issue ourselves. The chosen solution was to write a shell wrapper to copy common
into the function directory before calling build & push and then deleting common
from the function directory in order to keep things clean.
@rgee0 when you copy common directory into the function directory it requires changing the relative path in the handler, for example from (../common) to (./common) which creates problems when you want to locally debug the function.
How do you handle that?
@kobynet it depends how you perform the separation of the common parts. We kept the handler within each function, extracted the common code into common
then had the handler code expect to find a common
directory from which to import. This means that during development / debugging the common
directory exists within the function dir and be moved out when you're happy.
A possible solution would be to have a structure like this one:
functions:
hello-python:
lang: python
context: ./hello-python
image: hello-python
handlers:
- fn1.py
- fn2.py
The handlers
list would be optional and be set to [ handler.py ]
by default. Generated docker images would be hello-python_function1
and hello-python_function2
.
The drawback is that you would have multiple functions in the same folder which may not be the vision of OpenFaaS.
EDIT
Or maybe,
functions:
fn1:
lang: python
context: ./hello-python
image: fn1
handler: fn1.py
fn2:
lang: python
context: ./hello-python
image: fn2
handler: fn2.py
There would always be 2 handlers in the same folder, but functions would be treated separately in .yml
file.
I am going to poke at this design again because this is something one of our teams really want.
My proposal is to add a new property/flag to the build copy_additional_paths
. This is a list of strings that are relative paths within the project that will be copied into the top level of the functions docker build context, which then allows the function/template author to decide how to copy them into the final docker image.
It is important to note that:
/
or with ../
. This protects the build process from copying system filesI have created a prototype implementation here https://github.com/openfaas/faas-cli/compare/master...LucasRoesler:feature-copy-common
Start with a function project with this layout
.
├── Gopkg.lock
├── Gopkg.toml
├── README.md
├── common
│ └── foo.txt
├── go-echo
│ └── handler.go
├── one_more_thing.txt
├── stack.yml
└── vendor
└── ....
The initial stack.yml file would look like
functions:
go-echo:
lang: golang-http
handler: ./go-echo
image: alexellis2/go-echo
environment:
write_debug: true
But if I want to include the common
folder and the one_more_thing.txt
in the build context for the go-echo
function, I would update it to look like
functions:
go-echo:
lang: golang-http
handler: ./go-echo
image: alexellis2/go-echo
environment:
write_debug: true
copy_additional_paths:
- "common"
- "one_more_thing.txt"
Then when I run faas-cli build
, the resulting build folder will look like
./build
└── go-echo
├── Dockerfile
├── Gopkg.lock
├── Gopkg.toml
├── common
│ └── foo.txt
├── function
│ └── handler.go
├── main.go
├── one_more_thing.txt
├── template.yml
└── vendor
└── ...
I am still to be convinced that this is a good idea, copying in files from a sub-directory is leaky and could lead to undefined behaviour.
@stefanprodan rightly said that this job can be achieved with a shell script wrapper, or a basic task runner like grunt or whatever the JavaScript world is using now.
https://stackoverflow.com/questions/18966485/copy-all-files-from-directory-to-another-with-grunt-js-copy https://github.com/gruntjs/grunt-contrib-copy
If at all, I think any configuration like this belongs in the function/build_options segment of YAML.
I.e. something like this:
Data-scientist who should probably put his model in an S3 bucket or similar storage.
Here he keeps it in a common folder, one level up. The copy happens during the shrinkwrap stage (preparation of the build folder).
functions:
model_reader:
handler: ./model_reader
build_options:
copy:
- from: ../model/training.dat
to: ./
model_analysis:
handler: ./model_analysis
build_options:
copy:
- from: ../model/training.dat
to: ./
Or for JavaScript programmers who should really be using an npm module:
functions:
checkout:
handler: ./checkout
build_options:
copy:
- from: ../common/data-access
to: ./
cart:
handler: ./cart
build_options:
copy:
- from: ../common/data-access
to: ./
They should publish an npm package named data-access
and then add it to package.json
.
I would like to see an alternative from @LucasRoesler for the above, who asked for this feature recently to compare the approaches:
1) Makefile 2) bash 3) grunt or similar task runner
So that we can compare the pros/cons.
I will create an example using make and one for bash, but i will not use grunt, it is not a tool I would recommend adding to a project just to copy some folders around. But bash and make tend to be very common on developer system.
Can you explain leaky?
With respect to build_options
. Unfortunately, build_options
can't be extended like this very easily , it is already defined a string slice and the current implementations in the CLI and templates are strictly for defining groups of extra packages to install. even the doc string on the stack schema hints at this limited use, // BuildOptions to determine native packages
, which when you follow the implementation and how it is used, ultimately is only used to populated the ADDITIONAL_PACKAGE
docker build arg
if len(buildOptionPackages) > 0 {
buildOptionPackages = deDuplicate(buildOptionPackages)
spaceSafeBuildFlags = append(spaceSafeBuildFlags, "--build-arg", fmt.Sprintf("%s=%s", AdditionalPackageBuildArg, strings.Join(buildOptionPackages, " ")))
}
If we decide to implement this feature, it will either require a breaking change on build_options
or a new option like I am proposing.
One immediate con I can see to using an external tool for this step would then mean that users that need this functionality could not use the faas-cli up
command, for example. It would need to be re-implemented in the scripting for those projects.
It also means that any one that wanted to build templates around this kind of functionality would not be able to easily share those templates. There would be no way to for the community at large to extend this concept and build on it. It would remain isolated to particular implementations on each team and whatever tool and flow they decided to implement.
This is why I am not convinced that "this can be done with another tool" is necessarily a great argument. faas-cli deploy
could be faas-cli build && faas-cli deploy
but it was made because the developer experience is simply better when faas-cli up
exists. If copying/sharing a folder between functions is common, then directly supporting that experience will make it much better. If we don't think that this is a common use case, then adding it might not be worth it.
I also feel that your example with javascript and npm ignores my statement about packaging being an overhead that can be (1) complex in some languages (python) or (2) significant for small utility helper packages that might not make sense as a proper package or used by other projects. For example, you might not want to split the common folder into a separate package for the same reasons you would not want split out each go sub-package defined in faas-cli
into separate repos.
@alexellis here are some alternative methods using make and bash
.PHONY: build
build:
faas-cli build --shrinkwrap
# maybe i can try to parse the stack file for the name of the functions and iterate here?
cp -R ./common ./build/<function_name>/
docker build -t <prefix>/<function_name> ./build/<function_name>
Biggest downsides:
faas-cli
.FUNCTION_FILTER=""
.PHONY: build
build:
faas-cli build --shrinkwrap --filter="${.FUNCTION_FILTER}
# maybe i can try to parse the stack file for the name of the functions, then apply the filter, and iterate?
cp -R ./common ./build/<function_name>
docker build -t <prefix>/<function_name> ./build/<function_name>
usage would look like
➜ echo-fn git:(master) ✗ make build
faas-cli build --shrinkwrap --filter=""""
[0] > Building go-echo.
Clearing temporary build folder: ./build/go-echo/
Preparing ./go-echo/ ./build/go-echo//function
Building: alexellis2/go-echo:latest with golang-http template. Please wait..
go-echo shrink-wrapped to ./build/go-echo/
[0] < Building go-echo done.
[0] worker done.
cp -R ./common ./build/go-echo
➜ echo-fn git:(master) ✗ make build .FUNCTION_FILTER=foo
faas-cli build --shrinkwrap --filter="foo"
No functions matching --filter/--regex were found in the YAML file
make: *** [build] Error 1
#! /usr/bin/env bash
faas-cli build --shrinkwrap $@
# maybe i can try to parse the stack file for the name of the functions and iterate?
cp -R ./common ./build/<function_name>
docker build -t <prefix>/<function_name> ./build/<function_name>
This script is simpler than the Makefile
because we can blanket pass extra args to faas-cli
via the $@
. However, it still has the hard coded function name
Biggest Downsides:
One more option that I came up with is create custom template with the common code. The flow would be to
Biggest downsides:
After thinking about the "Custom Template" approach, I came up with this alternative bash approach that copies the common folder into the template prior to being built.
This script is also more dynamic that the others
#! /usr/bin/env bash
for d in ./template/*;
do
cp -R ./common "$d"
done
faas-cli build --shrinkwrap $@
for d in ./build/*;
do
echo "Building ${d:8}"
docker build -t <prefix>/${d:8}:latest "$d"
done
Biggest downsides:
common
while C and D need to share models
", for example. I am fairly comfortable with Bash, so, I would probably advocate for the final option, if someone was going to pursue this direction. It is the most flexible for being directly copied into a new project
Thank you for providing this, very helpful 👍
The "Biggest downsides" is interesting, what are the biggest upsides?
The theme appears to be that the "Biggest downside" is that using a Makefile means the various arguments are hard to parse if and when needed beyond what's hard-coded in the Makefile?
An off the wall idea - I think the pain point your team had was copying a common model into each folder for their functions.
Could that model be published to npm
or as a pip
in a public or private repository?
If the strongest case was for a change to the CLI, how far do you want to allow the user to traverse outside of the build context to copy in files?
How do we protect from someone putting /etc/password
or ../../
or similar to copy system files in from build machines?
What about protection on OpenFaaS cloud where the assumption is the entire function is within the current folder for git-tar? https://github.com/openfaas/openfaas-cloud/blob/master/gitlab-push/handler.go
I'd welcome your thoughts.
In my PoC branch I restrict copying to same folder or below of the stack file. I think it is reasonable that you can only copy files that are part of the project
Publishing artifacts to pypi or some other location is especially annoying for python and it means that building requires a push and a pull of something that is essentially local. this makes for a very slow feedback loop in local development and testing. Also, during local development and testing, you will not want to actually publish that model or folder.
If the project is private, it also means that to build the project you will probably need to pass credentials to the build process, which is notoriously problematic and easy to do incorrectly and result in credentials in the final image.
The developer could delay the pull of the external resource until function start, but this makes the deployment and cluster setup more fragile because it could be easy to misconfigure the host or credentials for this external repository of stuff. This also makes local testing on a laptop much hard to setup because the cluster inherently is more complicated to deploy, it now needs access to this external repository of stuff
Just to make it easier to find, I have implemented a prototype solution in faas-cli here https://github.com/openfaas/faas-cli/compare/master...LucasRoesler:feature-copy-common
RE: the PoC - https://github.com/openfaas/faas/blob/master/CONTRIBUTING.md#setting-expectations-support-and-slas
Could the --copy-additional-paths
flag be --copy-extra
and still be workable? I feel the verbosity is descriptive, but hard to type in.
As for adding this to each function in the YAML file, I think I'd rather see it in the common build/config section we're discussing adding in #668
configuration:
templates:
- name: node10-express
# no URL means fetch from URL in the store.
- name: rust
url: https://github.com/openfaas-incubator/openfaas-rust-template
copy-common:
- ./models/
- /etc/passwd
- /users/aellis/.docker/config.json
Why? to stay true to the original ask: I want a copy of X in every folder, and also to decouple the function definition from the build configuration a little further. Does that cover the use-case for your team?
If it would work for you as above, please feel free to work on the PR, but add the unit test cases to show that we are blocking copying from higher levels of the filesystem and C:/windows/system32
and such like. Your PR blocks ../
, but does it block /etc/passwd
? We should be able to show that with a test or two.
/set milestone: 0.9.5
Some users have asked for a common folder "X" to be copied into every one of their functions at build time.
i.e.
In the scenario above - the user wants "common_folder" to be copied into or to be accessible within fn1, fn2 and fn3 and also expects the CLI to do that.
Possible workaround: have user make use of a Makefile or batch file to copy functions into a temporary folder and build out the structure the way they want.
@stefanprodan suggested using "grunt" or a JavaScript task runner just like when you build a react website from a "src" folder into a "dest" folder.