openfaas / faas

OpenFaaS - Serverless Functions Made Simple
https://www.openfaas.com
MIT License

Proposal: Adding mount options to functions. #320

Closed alshabib closed 6 years ago

alshabib commented 6 years ago

The proposed change would allow functions to mount volumes and other directories through the normal Docker configuration. This would let a function process relatively large amounts of data without having to pass it through HTTP/stdin.

Any design changes

Add a Docker mount struct to the CreateFunctionRequest struct and pass it along in the create-function handler.

Pros + Cons

Pros:

Cons:

Effort required: Little; it's a two-line change.
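
For illustration, here is a minimal sketch of what that two-line change might look like against the Swarm provider. Only Service and Image from the real struct are shown; the Mounts field, its JSON tag, and the reuse of Docker's mount type are the hypothetical part:

```go
package requests

import "github.com/docker/docker/api/types/mount"

// CreateFunctionRequest (abridged) - the gateway's request type for
// creating a function. Existing fields are elided for brevity.
type CreateFunctionRequest struct {
	Service string `json:"service"`
	Image   string `json:"image"`

	// ... existing fields (EnvVars, Labels, Secrets, ...) elided ...

	// Mounts is the proposed addition: Docker mount specs passed
	// straight through when the function's service is created.
	Mounts []mount.Mount `json:"mounts,omitempty"`
}
```

In the create-function handler, the pass-through would then be roughly spec.TaskTemplate.ContainerSpec.Mounts = request.Mounts.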

rgee0 commented 6 years ago

Derek add label: proposal

ericstoekl commented 6 years ago

Personally I think this is a fantastic idea. @alshabib , do you have any specific use-cases in mind that would necessitate this functionality? I'm thinking maybe a database (like MongoDB) as a function...?

alshabib commented 6 years ago

Thanks!

For example, for a process that cleans large amounts of data, it is simpler for the function to read the raw data from a volume and write the processed data back to the same or another volume.

The database use case is an interesting one, especially if you want to scale the number of readers to a database easily.

alexellis commented 6 years ago

We might revisit this in the future but I think it is an anti-pattern for functions which are short-lived and stateless. This will encourage stateful behavior and assumptions.

alshabib commented 6 years ago

I agree that this feature may encourage bad behavior, but then again you cannot stop people from shooting themselves in the foot.

I also agree that functions should be stateless and short-lived, but that does not mean they will not consume or emit large amounts of data, and there is no reason why the volume of this data should be limited by the HTTP session. This is simply an alternative method of providing input to a function.

Would you prefer an option in OpenFaaS that disables this feature, rather than not providing it at all?

dexterg commented 6 years ago

I am writing a Ruby function that finds an IP address in configuration files (firewall, proxy, BIG-IP, etc.). There is a nightly cron process (too long for FaaS) that downloads these files (with scp, expect, etc.).

That's a good use case for volume-mount functionality, no?

Toggi3 commented 6 years ago

Feedback (take it or leave it): This is the only missing feature that prevented me from deploying this system for functions that handle batches of file pulls/pushes, translations, and scraping. It's really an amazing framework, but I often have to deal with very large flows of files gathered over abc protocols due to xyz legal or contractual obligations. A serverless function system like this with any kind of volume support would be pretty helpful. I really want full-blown Compose functionality with regard to volumes and networking.

File access is way more reasonable for this kind of thing.

I hope you guys reconsider implementing this. I would be interested in what workarounds I can apply to achieve the effect of bind-mounting a specific fixed directory on all Swarm workers in the cluster. I might just deploy with the related patch to solve my problem. Is there any other way to solve it? A service on the same network, maybe?

alexellis commented 6 years ago

Hi, I'd like to know more about your use-case. Do you have any specifics? Functions are not supposed to be stateful, as it breaks the design; however, I think you should try Minio object storage for your storage needs. It makes sense to explore the recommended approach for the project/architecture.

Toggi3 commented 6 years ago

So, I'll give you one such function, say you have to transcode/transcribe a proprietary audio format designed for storing call center interactions. You need to extract from it the audio payload, make it into something that can be understood by a voice transcription engine, as well as extract the other data and store it in a way that it can be digested for its analytics value, and place both of those things in a place where they can be picked up by another process later. I do a lot of batch information pulling/pushing/translating/feeding/scraping across many files for a lot of clients, but occasionally patterns emerge where I'd love to be able to develop a nice parallel function that I can feed variables and a list of files to do work like decrypting these multiple batches of 50000 audio files, dropped on some FTP site on storage we own and can export to our container host machines...

I'd rather not do such a thing through a layer like S3; I need something at least slightly faster, and NFS is simple.

alshabib commented 6 years ago

I think that in general the advantage of such a feature would not be to allow a function to store state, but rather to enable a function to process larger data volumes. Of course, it would be hard to prevent a user from doing the former, but I guess you can only put up so many guard rails.

Toggi3 commented 6 years ago

I ask that you give us just enough rope to hang ourselves if we so choose. We understand the spirit of your project; we just want a way out of its limitations that is easy for us to use.

alshabib commented 6 years ago

+1

alexellis commented 6 years ago

Using object storage is simple and fast. I'd encourage anyone on this thread to try our recommended approach before pushing back and insisting on volume mounting.

Minio / Ceph are both super easy to set up on Docker or Kubernetes:

https://minio.io

Once you have the daemon running, you can access it in the cluster and create buckets and push/fetch objects via a client library.

Here's the Python library for instance:

https://docs.minio.io/docs/python-client-quickstart-guide
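
The quickstart above is Python; since the project itself is written in Go, here is the same fetch/process/push flow sketched with the minio-go client (v7 API assumed; the endpoint, credentials, and bucket/object names are placeholders):

```go
package main

import (
	"context"
	"log"

	"github.com/minio/minio-go/v7"
	"github.com/minio/minio-go/v7/pkg/credentials"
)

func main() {
	ctx := context.Background()

	// Placeholder endpoint and credentials for an in-cluster Minio daemon.
	client, err := minio.New("minio:9000", &minio.Options{
		Creds:  credentials.NewStaticV4("ACCESS_KEY", "SECRET_KEY", ""),
		Secure: false,
	})
	if err != nil {
		log.Fatal(err)
	}

	// Fetch the input object to a local scratch file...
	if err := client.FGetObject(ctx, "incoming", "batch-42.tar.gz",
		"/tmp/batch-42.tar.gz", minio.GetObjectOptions{}); err != nil {
		log.Fatal(err)
	}

	// ... do the actual processing on /tmp/batch-42.tar.gz here ...

	// ... then push the result back to another bucket.
	if _, err := client.FPutObject(ctx, "processed", "batch-42-out.tar.gz",
		"/tmp/batch-42-out.tar.gz", minio.PutObjectOptions{}); err != nil {
		log.Fatal(err)
	}
}
```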

I'm planning on providing a small sample function, but in the meantime there's our colorisebot, which you can check out for inspiration.

Toggi3 commented 6 years ago

It's an appealing solution, and I appreciate it for sure; the only problem with it is the preexisting infrastructure and scripts that depend on a volume being there. I will definitely take a look in any case, but I won't have time to refactor the many jobs that currently use volumes to use S3 instead. Many of us are trying to use newer tools like this to simplify older systems, and while I might be able to sell my boss on the philosophy of why S3 might be better, there is simply too much work involved to scale such a mountain of technical debt just to appease a design preference.

Unfortunately, it appears Fission isn't better in this regard. I might have to kludge something stupid together with Jenkins to kick off runs, I guess... Any other input is welcome, as volumes are a must, at least in the interim. Thank you for your work, even if we couldn't come together on this problem. It's a good project.

RawSanj commented 6 years ago

@alexellis So if I deploy Minio on K8s and have NFS as a PersistentVolume for Minio, and then store files in Minio from Functions, I will essentially be able to access NFS from the Functions.

Is that correct? And would it be a good idea to do this?

alexellis commented 6 years ago

@RawSanj I'm not sure if you've read the new blog on using Minio & OpenFaaS?

alexellis commented 6 years ago

@Toggi3 If you have a binary that reads a file from the filesystem, then copy the file with Minio's mc command into the right location, do your work, then use mc cp to copy it back to where it belongs afterwards. From your description I can't see any hard dependency on volumes.

RawSanj commented 6 years ago

@alexellis I'm sorry, which blog are you talking about? The getting-started guide for OpenFaaS on minikube? Or is there another blog about volumes in OpenFaaS?

alexellis commented 6 years ago

All - I've written a blog post on how to use object storage (S3) with OpenFaaS - just as you have to do with AWS Lambda, Azure Functions, or similar. Client libraries are available for most programming languages, including a binary client for bash. It's very easy to set up and use:

https://blog.alexellis.io/openfaas-storage-for-your-functions/

The performance is also good. This powers Colorisebot on Twitter.

Toggi3 commented 6 years ago

What if that thing is a GPG-encrypted archive that is >10GB, which has to be decrypted and then untarred, and dumps out a ton of proprietary audio files from a call-interaction system that have to be transcribed into digestible PCM WAV and CSV metadata by another process and stored back on the volume for yet another process to pick up?

I have to first wait for a copy operation from S3, do my operations, then copy it back? Too much time. I already sometimes have to pull these things from remote SFTP, S3, and Google Drive locations over the internet, and I am targeting 24-hour turnaround for jobs like these every day, end-to-end. We don't choose how our payloads are constructed, or even necessarily how they are delivered, because we aren't the producers of them. Some of these payloads are not nice to work with at all. Our customers pay us to worry about that problem for them.

alexellis commented 6 years ago

@Toggi3 you'd have the same problem(s)/issue(s) with an underlying NFS filesystem. Moving a 10GB file around on the network is a very specialist problem.

Toggi3 commented 6 years ago

Over the weekend I might try to do as you suggest and compare performance. I agree I have a very specialist problem, for which I have been seeking out specialist solutions like docker functions...

mycodecrafting commented 6 years ago

So a persistent POSIX-compatible volume introduces too much state into the system, but a persistent object store does not? That doesn't even make any sense. State is state.

But let's say that's a valid argument for argument's sake. There is also the fact that there's not an object store out there that can compete with the performance of distributed parallel filesystems designed for high performance computing.

Or that very few real-world applications can easily or efficiently interact with an object store. Not everyone is working with brand-new shiny applications. Very few people are. Most of us have to deal with legacy applications, and work to slowly change them over time while we dream of rewriting them in the always-distant "someday."

Most of us also rely in some way on 3rd party libraries and apps, and again most of those cannot easily or efficiently interact with an object store.

Copy the file down and back up? If functions are indeed supposed to be short-lived, then suggesting that they should spend two-thirds of their runtime performing network transfers is rather silly. It's also a massive waste of CPU resources and time. Now we're required to have a substantial amount of additional resources in order to perform the same number of tasks in the same time as we could if we could just grab the data off a volume.

But let's just say we're fine with copying the file down and back up. What about operations where we need large amounts of disk space? Take video transcoding, for example. We may need several hundred GB or more of scratch space to perform the operation. We probably want to be able to run more than one function at a time on each server. And we're unlikely to have servers sitting around with several TB of local disk attached to each one, especially in the cloud; it's just cost-prohibitive. But we probably are more likely or inclined to have a large, high-performing distributed filesystem mounted on each one. Here's an example where we want the mount not for state at all (remember, we are assuming here that we're fine with copying a massive file down and back up), but just for temp/scratch space in order to carry out the function.

Don't get me wrong, I'm a big fan of this project, and I can admire your dedication to its principles. But the world isn't that black and white, and there is a whole host of people on whom you're shutting the project's door because they can't do something as simple as bind a mount. The door is shut on anyone with any kind of legacy application that they want to start using something like this for. The door is shut on anyone with an application that has "a very specialist problem." The world is very specialized, and there are a whole lot of specialized applications. You're excluding a lot of people from benefiting from this project over what is such a small request.

It's your project, so do what you will, but at the end of the day nobody is asking for anything that Docker doesn't already do. All that is being asked is that people be able to utilize an existing, basic feature of Docker.

Let's go back to the beginning (state) for fun. Functions can connect to any external service they want, databases for example. Is a mount point really going to encourage stateful behavior more than a database connection does? I don't really think so. You don't prevent a function from interacting with the world outside of it -- most of which is stateful. I don't see how a volume is fundamentally any different.

nicholasjackson commented 6 years ago

Personally I think mounted volumes are an anti-pattern in the world of containers; you get a simpler system to manage and administer if you treat containers as they were intended: as immutable. Also, from a personal perspective, managing storage is a pain in the butt. Offloading it to S3, where you get encryption and replication out of the box, is always the best option, although I agree this is not applicable in all cases.

For things like video encoding, I would argue that OpenFaaS or any serverless platform is not the right choice, and companies like AWS have created specific media encoders for this purpose.

Back to volumes, however: conceptually, functions should ideally be immutable and independent units of work. Amazon laid this pattern down with Lambda, predominantly due to the multi-tenanted nature of the system, but also for operational simplicity. Introducing volumes into a function would mean that a function is no longer an independent unit of work; it would have a hard dependency on the underlying node being able to provide storage, and therefore on the configuration of the platform.

One of the beauties of OpenFaaS for me is that once I spin up my cluster and install the gateway, I can just deploy functions with no further tinkering. Creating a hard link between a function and configuration on the cluster to provide volumes would break that.

Maybe we just have to accept that FaaS has its limitations and is not applicable in all cases, but it is these limitations which make it so easy and pleasant to use for the 99% of normal cases.

alshabib commented 6 years ago

I view this feature as analogous to a pass-by-value vs. pass-by-reference capability of a programming language. In one case, you pass the actual input data directly and in the other you just pass a pointer to where the data resides (on a distributed FS in this case).

Also, I do not think that such a feature would change the principle that a lambda (or function) is immutable or an independent unit of work. All we are talking about is the ability to pick up input data from somewhere other than STDIN.
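
To make the analogy concrete, here is a sketch of the two styles as HTTP handlers. The /data mount path and the handler names are made up; the mount is exactly the capability being proposed here, not an existing OpenFaaS option:

```go
package main

import (
	"io"
	"log"
	"net/http"
	"os"
	"path/filepath"
)

// handleByValue - pass-by-value: the entire payload travels through the
// gateway and arrives on the request body (or stdin for classic
// watchdog functions).
func handleByValue(w http.ResponseWriter, r *http.Request) {
	data, err := io.ReadAll(r.Body)
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	process(data) // the whole dataset was pushed over HTTP
}

// handleByReference - pass-by-reference: only a file name arrives; the
// data itself sits on a volume mounted (hypothetically) at /data.
func handleByReference(w http.ResponseWriter, r *http.Request) {
	name, err := io.ReadAll(r.Body) // e.g. "raw/batch-42"
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	f, err := os.Open(filepath.Join("/data", string(name)))
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	defer f.Close()
	processStream(f) // stream the large input instead of buffering it
}

func process(b []byte)          {}
func processStream(r io.Reader) {}

func main() {
	http.HandleFunc("/by-value", handleByValue)
	http.HandleFunc("/by-reference", handleByReference)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```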

All that said, I agree with the purist view of lambdas/functions but unfortunately the world is not perfect and sometimes some rules need to be bent.

Toggi3 commented 6 years ago

If this is an absolute must for people, I suggest fission.io, which has this issue on its roadmap. Unfortunately, I think Kubernetes is a hard constraint there at this time.

m4r10k commented 6 years ago

@alexellis I would like to mount the Docker socket file, as it would be nice to use OpenFaaS as a function driver for various read-only Docker API functions. I know I can work around it by using the TCP interface of the Docker managers, but that adds complexity with little to no benefit. I would favor allowing bind volume mounts.

alexellis commented 6 years ago

@Toggi3 I think you might be mistaken there. Fission only mentioned this once, in 2016, where they also state they cannot support config maps, secrets, or CPU/memory constraints - all things which OpenFaaS already supports. FYI: their project will only ever target K8s.

So far I've not seen anyone who is complaining about the lack of volume support rolling their sleeves up and trying the reference architecture with Minio or S3. It's proven to work and it's performant; it has a command-line utility and SDKs for lots of languages.

https://github.com/openfaas/faas/issues/320#issuecomment-360513338

What I'd like to see is people trying the solution and providing concrete reasons why it could never work for their use-case - and this not being a 1% demographic - but a majority. There is a cost to developing, documenting and maintaining every feature on OpenFaaS. Persistent storage is already possible and well documented.

@m4r10k as for mounting a Docker socket - this is definitely an anti-pattern and it's very unlikely that we would do that. It's better for you to expose the daemon over TCP with TLS - that needs no changes to OpenFaaS to support your use-case.
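
For reference, a sketch of what talking to the daemon over TCP with TLS might look like from a function, using the Docker Go SDK; the host name and certificate paths are placeholders:

```go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/docker/docker/client"
)

func main() {
	// Talk to a manager's daemon over TCP with TLS instead of
	// mounting /var/run/docker.sock into the function container.
	cli, err := client.NewClientWithOpts(
		client.WithHost("tcp://swarm-manager:2376"), // placeholder host
		client.WithTLSClientConfig("/run/secrets/ca.pem",
			"/run/secrets/cert.pem", "/run/secrets/key.pem"),
		client.WithAPIVersionNegotiation(),
	)
	if err != nil {
		log.Fatal(err)
	}

	// A read-only call, e.g. for the kind of inspection functions
	// mentioned above.
	info, err := cli.Info(context.Background())
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(info.Name, info.ServerVersion)
}
```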

@Toggi3 If you really are encoding 10GB of video data, then I'm not sure this is the right solution for you right now. That really needs something which is not bounded by time, like a specialist batch-job processing system. I have an alternative project which runs ad-hoc containers/tasks on Docker Swarm - you are welcome to donate time or code to that project to enable bind-mounts there: https://github.com/alexellis/jaas

Toggi3 commented 6 years ago

I haven't forgotten, @alexellis. I fully intend to try Minio / S3. I did try S3 as provided by a cloud provider, but the speed was too slow for the task. I have some hope that Minio will be better than the cloud provider's S3, but I have to get a real hardware lab together to properly evaluate it. Even if it is perfectly acceptable, I have many custom-built tools that do similar things for different data sets, and they all depend on a volume.

That is not to say rearchitecting future jobs is unreasonable, but the amount of rework involved pretty much kills using the dozens of old jobs that I was hoping to just balance over a swarm. Each client has very different needs. I have more audio data than video, plus a lot of metadata, and it comes to us in many different formats or encryption methods. I was hoping to move these containers wholly into functions, and then start building new functions that chain-loaded others to do things in parallel with a shared volume.

I recently spoke with the people at Fission; they appear to want to build out some other functionality first before adding that feature, but it is still intended.

Thank you for the link to jaas; I will explore it. I hope I don't personally come off as though I am trying to rustle your jimmies or am unappreciative; on the contrary, your project is cool and I respect your direction even if I disagree with it. It feels like an arbitrary self-limitation on a technology intended to leverage something that doesn't have that limitation. Containers are not "supposed to be" anything, nor are they not supposed to be something. Things can have many uses outside of how people originally envision them, and I don't see that as a problem but as something to enable, within reason, when it is desired. I only have so many hours in the day, and I'm not going to go to my boss and tell him the philosophy of containers is the reason we should refactor all this stuff that currently works.

justinfx commented 4 years ago

I would like to follow up on this feature request. Honestly, it's something that is causing me to hit a wall with introducing OpenFaaS into our current pipeline and offering a migration path from our current VMs and shared-process-manager approaches to deploying small arbitrary services and event handlers for users. While I understand that it is considered an anti-pattern to rely on mounted volumes for state and configuration in containers, it is also very limiting for the cases where it is needed. In our case, our studio has enormous amounts of existing tooling and code whose dependencies read configs from the filesystem and have not been migrated to something like Consul. So we have two use cases where we need our existing pipeline environments to work inside a container:

  1. read-only mounts for dependencies and filesystem configs
  2. read-write mounts for functions and services that need to perform some kind of data transformation

Sure it would be great if all of our code were updated to pull configs from consul, could be 100% packaged as standalone in a container, and do any filesystem data transformations through an object-store api. But we aren't there yet and the transition would be slow. We definitely want to get to this point though.

Furthermore, OpenFaaS states that it officially supports long-running microservice workloads in addition to short-lived functions. So to say that OpenFaaS only focuses on FaaS patterns doesn't seem to align with that extended support. I feel it would be ideal to enable users to solve their problems, even if it means they have to explicitly enable the feature and there are warnings and notes around the pattern being less than ideal. In our case, it would really help with transitioning our 15+ year old pipeline.

It seems Fission supports Volumes now in their pod spec: https://docs.fission.io/docs/spec/podspec/volume/

But honestly, I want to use OpenFaaS. I've already prototyped custom templates for a facility-private template store. I have written some patches to faas-cli to support nested template-store paths within a repo. I like the faas-cli functionality and how it unifies deploying functions and services. But my one sticking point is the volume-mounting limitation. Is there really no value in providing volume mounting at this point, even when competing frameworks like Fission see value in providing it? Could it maybe be a feature that has to be enabled at the OpenFaaS deployment-configuration level to opt into the support?

As a semi-related anecdote, I maintain the build and deploy system for our code at my studio. It happens to be an extension of the Waf build system. The maintainer of Waf is extremely opinionated about what should and should not be allowed in the build process for a user, which has led to some feature requests or pull requests being denied. In those cases, they end up as extensions added to our build system instead, because we need to enable users to solve their problems. Something may not be provided as a first-class concept in our build-system layer, but we still give users enough flexibility to do what they need to do to solve their problems. They may need to opt into a feature that is documented with caveats or opinions.

feiniao0308 commented 4 years ago

Actually, I made a similar request before: https://github.com/openfaas/faas/issues/1232 .

justinfx commented 4 years ago

Yes @feiniao0308, it sounds like my situation as well: we have a studio with tons and tons of versioned library and application deployments on various NFS mounts. They are deployed frequently by many teams. It is currently not feasible for us to fully package them into a container, as we aren't 100% able to trace all the dependencies. Some libraries link against other libraries, so you have to resolve the entire dependency chain, even looking at the RPATH of linked libraries, etc. Exposing the NFS mounts via Minio to the container still puts the responsibility on the code in the container to know the entire dependency chain to pull in and put into the correct places in the container filesystem. It's not ideal, but it's what we have right now.

feiniao0308 commented 4 years ago

Exactly. I hope OpenFaaS could expose the mount option. When I searched for the "mount" keyword in the issues, I saw many similar requests.

justinfx commented 4 years ago

I've confirmed on each of their Slack channels that both Fission and Nuclio support full expression of volume mounting in their YAML specs. It would be really awesome if OpenFaaS would match that support.

feiniao0308 commented 4 years ago

Not sure if OpenFaaS will support exposing the mount option and letting the function owner make the decision. It would make OpenFaaS more flexible if this option were exposed on functions. @alexellis @justinfx

justinfx commented 4 years ago

@feiniao0308 I've already got NFS mounts working in the PodSpec, in Fission.io functions.

feiniao0308 commented 4 years ago

@justinfx do you add extra steps to update the function pod spec after it's deployed? How do you make it work?

justinfx commented 4 years ago

Not to get too off-topic about another project, but you just use their --spec flag to generate all the YAML. Then you customise the PodSpec once. And then you can deploy with fission spec apply.

feiniao0308 commented 4 years ago

@justinfx thanks for the info. I'll check that project. Thanks!

pyramation commented 4 years ago

Just +1'ing this as a feature that would be great to have. I've read everyone's arguments in this issue as well as in https://github.com/openfaas/faas/issues/1178, and I think it's fair to say this would be a great option.

justinfx commented 4 years ago

I was reading about HashiCorp Nomad and the integration with OpenFaaS via the faas-nomad provider. On the topic of volume mounts: Nomad provides support for legacy command deployments that can't easily be wrapped into a container, as well as volume mounting for all job types. This suits my work environment quite well as a way to get tons of legacy code up and running from departments that don't have the resources to focus on containerised solutions. That being said, is there some way to pass through the Nomad job/group volume options with this approach? Or is all of that control still abstracted away at the OpenFaaS faas-provider layer? I thought maybe I could still use OpenFaaS if we picked up Nomad and could mount our shared NFS volumes or host mounts. I'm looking for any solution to this proposal, because so far I have needed to commit to using Fission in my prototype work for its volume-mounting support.

alexellis commented 4 years ago

@pyramation what is your specific use-case, and have you tried object storage yet?

alexellis commented 4 years ago

@justinfx can you detail what your function does? It is likely that you can use object storage such as Minio or build data into the container image.

justinfx commented 4 years ago

Hi @alexellis. I work at a large visual effects studio, with many years of legacy code making up the pipeline. In addition to traditional core and pipeline developers, we have hundreds of artists with some level of coding skill who are capable of scripting tooling around their primary digital content creation packages (Autodesk Maya, Foundry Nuke, SideFX Houdini, ...). Our common way of interacting with project data is through a complex abstraction on top of NFS mounts and filers. Layers are built on layers, with applications and libraries that write to the file system.

So here is a hypothetical example. A technical artist from a department responsible for simulating muscles over an animated skeleton rig wants to monitor for the event of that animated rig having a newer version published. In response to this publish event, a function should fire that looks up the asset locations of the new animated rig, resolves their location on the NFS file system, opens a scene file in another location on the file system, reimports the new rig version, versions up the scene file on disk, and then submits a new muscle simulation version to our render farm (let's just pretend this whole version-up, validation, and submission takes <60s). So the interactions with the NFS file system mounts are important here: for all the existing code responsible for dealing with asset management, for the animation package where the simulation is loaded and rendered, and for the output files being stored.

A lot of code would have to be rewritten from the ground up to go through an object storage API to make use of Minio (as far as I understand), not to mention that we don't even have control over the 3rd-party applications that don't know how to use the object storage API. Another issue we have is our environment management system, which we use to combine many, many versions of software together to run on different projects, with modifications to software versions even on a per-scene or per-shot basis. All of this software is currently stored on our NFS mounts.

Now, we have the future goal of being able to 100% containerize our applications and services, but we aren't there yet. We can do it with our services, but not yet with many of our applications. So our current solution for these cases is to mount the software deploy locations so that dependencies can be picked up. We can slowly transition to having a better containerization story across the board, but we can't instantly convert to going through an object storage API in all cases. My point is that preventing NFS volume mounting on principle is just limiting our access to this framework while we try to migrate to newer solutions.

pyramation commented 4 years ago

@justinfx that's interesting! So if I understand correctly, some assets are so large that it's better to have functions dynamically attach themselves to the drives to read (and potentially write) and perform an operation, vs. having to download them over the wire each time. (P.S. I used to work for SESI w/ Houdini.)

@alexellis my number one use case right now is developer experience for creating OpenFaaS functions, particularly hot-loading. I'm using Kubernetes and OpenFaaS, and if I could hot-load my code during development, I would save quite a bit of time that I normally spend sitting there building Docker images for every code change. In the case of Node.js it would save me up to a minute for every code change. Even with Python, it can feel like a larger-than-needed compile step for any code change, whereas with a hot-loading volume the changes would take milliseconds. Simon did write something for docker-compose (https://gitlab.com/MrSimonEmms/openfaas-functions/-/blob/master/docker-compose.yaml#L12), but it would be great if there was a solution for K8s.

justinfx commented 4 years ago

@pyramation yes, I should have put a little more effort into focusing on the data question from @alexellis. We generate lots of data; it would not be uncommon for a simulation to generate 1TB of temporary or intermediate data. Our pipelines are about transforming data until ultimately we produce final pictures for a movie. So the idea of using functions in our pipeline would be to respond to async events and perform transformations on arbitrary data. Some work would be too time-consuming and need far too many resources to be done in a function invocation, in which case we would just use functions to trigger jobs in our render farm. But there is plenty of work to be done in event handlers where we need access to image data, simulation data, scene files, and applications and libraries that may have no support for an object storage API. We need the flexibility to support these workflows, even if ultimately it would be better to do what we can through an object storage API that maybe proxies to our NFS file system.

justinfx commented 4 years ago

@alexellis have you had time to consider my last replies to your question as to why a Minio solution would not be sufficient? I would like to know if your position is firm on this and we cannot expect OpenFaaS to ever allow any kind of mounts (NFS, hostPath, config map), or if maybe, with the amount of support for this feature request, your position has softened to where it could be an opt-in configuration option in the deployment of OpenFaaS. I feel that there have been enough replies to your request for justification of the feature that it warrants some kind of support, to bring this project in line with the same offering in other frameworks.

funkymonkeymonk commented 2 years ago

Hey folks. I wanted to bring this up again because, by not allowing access to volumes, functions are not able to communicate with the GPIO pins using /dev/mem. This is a problem for me, and the workaround is to run your containers in privileged mode, which feels like an even worse idea than possibly allowing state in containers. Given that the Pi is an explicit deploy target and IoT is an explicit use case, this seems like a miss. Is there a workaround here that I'm missing?

justinfx commented 2 years ago

@funkymonkeymonk it seems clear that OpenFaaS has a hard stance against allowing mounts. But a workaround for this limitation is to use mutating webhooks in Kubernetes, which would let you do something like an annotation declaring the need for a mount, with the webhook mutating the spec to add the volumes. You could either write and deploy a mutating webhook manually, or implement it in something like Open Policy Agent. A sketch of the idea follows.
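
A rough sketch of that webhook idea, assuming Kubernetes admission/v1. The annotation name, NFS details, and mount path are all made up, and a real deployment also needs a MutatingWebhookConfiguration plus CA-signed serving certs:

```go
package main

import (
	"encoding/json"
	"log"
	"net/http"

	admissionv1 "k8s.io/api/admission/v1"
	corev1 "k8s.io/api/core/v1"
)

// patchOp is a single JSONPatch operation.
type patchOp struct {
	Op    string      `json:"op"`
	Path  string      `json:"path"`
	Value interface{} `json:"value,omitempty"`
}

func mutate(w http.ResponseWriter, r *http.Request) {
	var review admissionv1.AdmissionReview
	if err := json.NewDecoder(r.Body).Decode(&review); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}

	var pod corev1.Pod
	_ = json.Unmarshal(review.Request.Object.Raw, &pod)

	// When the (made-up) annotation is present, patch in an NFS volume.
	// Assumes the pod spec already has volumes/volumeMounts arrays.
	var patches []patchOp
	if server, ok := pod.Annotations["example.com/nfs-server"]; ok {
		patches = append(patches,
			patchOp{Op: "add", Path: "/spec/volumes/-", Value: corev1.Volume{
				Name: "shared-nfs",
				VolumeSource: corev1.VolumeSource{
					NFS: &corev1.NFSVolumeSource{Server: server, Path: "/exports/data"},
				},
			}},
			patchOp{Op: "add", Path: "/spec/containers/0/volumeMounts/-", Value: corev1.VolumeMount{
				Name: "shared-nfs", MountPath: "/data",
			}},
		)
	}

	resp := &admissionv1.AdmissionResponse{UID: review.Request.UID, Allowed: true}
	if len(patches) > 0 {
		patchBytes, _ := json.Marshal(patches)
		pt := admissionv1.PatchTypeJSONPatch
		resp.Patch = patchBytes
		resp.PatchType = &pt
	}
	review.Response = resp
	review.Request = nil
	_ = json.NewEncoder(w).Encode(review)
}

func main() {
	http.HandleFunc("/mutate", mutate)
	// Admission webhooks must be served over TLS.
	log.Fatal(http.ListenAndServeTLS(":8443", "tls.crt", "tls.key", nil))
}
```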

funkymonkeymonk commented 2 years ago

Thanks for the thought. Unfortunately, I am using faasd to avoid having to run k3s, so Kubernetes-based solutions would require a full rethink, and honestly, if I'm going that route, I'll likely look at alternatives instead.