Mount S3 inside a Lambda function

awslabs / mountpoint-s3

A simple, high-throughput file client for mounting an Amazon S3 bucket as a local file system.

Apache License 2.0


Mount S3 inside a Lambda function #656

Open kevmcgrath opened 11 months ago

kevmcgrath commented 11 months ago

Tell us more about this new feature.

The National Weather Service would find great utility in being able to mount an S3 bucket inside a Lambda function. Fantastic work, team! The current capabilities are already very useful.

Please upvote this feature request using the 👍 reaction to the main post. This will help us with prioritization.

sauraank commented 11 months ago

@kevmcgrath, glad that you find Mountpoint useful in your workload. Thanks for opening this feature request. We will investigate this feature.

sladg commented 10 months ago

Interested in this as well. This binary is Graviton compatible, but it requires FUSE, which probably will not have access to the underlying functions it needs in Lambda.
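A minimal sketch of how one might probe for the FUSE requirement mentioned above from inside a function. It assumes the conventional Linux device path `/dev/fuse`; the helper name `fuse_device_usable` is hypothetical. It checks only a necessary condition: even if the device is present, mounting can still fail for lack of privileges.

```python
import os


def fuse_device_usable(device="/dev/fuse"):
    """Return True if the FUSE device node exists and is readable and writable.

    This does not prove that a FUSE mount will succeed; it only rules out
    environments where the device is missing or inaccessible.
    """
    return os.path.exists(device) and os.access(device, os.R_OK | os.W_OK)


if __name__ == "__main__":
    print("FUSE device usable:", fuse_device_usable())
```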

sfwhite commented 9 months ago

I'm a Solution Architect and I'd be interested in working this into the serverless strategy for my department as well.

In general, we try to default to a serverless approach for the cost savings and the ease of scalability and availability, so the ability to mount a bucket to the runtime and interact with it as a native FS would be a significantly better developer experience for my teams. This is especially true because many of my developers are coming from an on-prem world, so this is less of a cognitive shift for training. We also heavily use CDK for infra, so mounting the bucket to the Lambda during provisioning means the developers have less to care about in the underlying infra while working on the business logic.

snowzach commented 5 months ago

I have another possible use case for this. I would like to open a very large image file, larger than Lambda's ephemeral disk space, so I can read image metadata. The problem is that some of the data is at the beginning of the file and some of it is at the end, so I may need to read only 30 KB of a 15 GB file. Being able to seek would avoid the disk space issue as well as the cost of downloading the entire file.
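The access pattern described above can be sketched as follows, assuming the bucket were mounted and the object appeared as an ordinary file. The function name `read_head_and_tail` and the default sizes are hypothetical; the code uses only standard file seeking, so nothing here is specific to Mountpoint.

```python
import os


def read_head_and_tail(path, head_bytes=16 * 1024, tail_bytes=16 * 1024):
    """Return the first head_bytes and the last tail_bytes of a file.

    The middle of the file is never read: after reading the head, the
    file position jumps directly to the start of the tail.
    """
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        head = f.read(min(head_bytes, size))
        # Seek to the start of the tail (or to 0 if the file is small).
        f.seek(max(size - tail_bytes, 0))
        tail = f.read()
    return head, tail
```

On a file system that translates seeks into ranged reads, only the requested bytes would need to be fetched, which is what would keep both transfer cost and local disk usage low.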

rberger commented 4 months ago

I am trying to find out if it's possible to use Mountpoint S3 in a Lambda and what it would mean for our use case:

We are running ML models in Lambdas. Both the Python dependencies and the models themselves are very large. We can load the model into the Lambda using normal S3 API calls. But we would also like to load in all those Python dependencies (I'm looking at you, PyTorch). If we could have them all in S3 and mount them with Mountpoint S3 so they just get read in by the Python code, that could be awesome (assuming performance is good).
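A minimal sketch of what "just get read in by the Python code" could look like, assuming the dependencies were available under some mounted directory. The helper name `load_module` and any path passed to it are hypothetical; the mechanism is plain `sys.path` manipulation and would work the same way with EFS or any other mounted directory.

```python
import importlib
import sys


def load_module(name, deps_dir):
    """Import a module by name from a directory of installed packages.

    deps_dir is assumed to be a directory (for example, a mount point)
    that contains the package files, laid out like site-packages.
    """
    if deps_dir not in sys.path:
        # Put the directory first so it takes precedence over other copies.
        sys.path.insert(0, deps_dir)
    # Make sure the import system notices files added after startup.
    importlib.invalidate_caches()
    return importlib.import_module(name)
```

Importing a large package this way reads many small files, so cold-start time would depend heavily on the latency of the underlying file system.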

We are currently doing this with EFS, but we found that if a huge input spike triggers a lot of Lambdas to scale up all at once, we can easily saturate the EFS file system at peak load. So we are looking for an alternative to EFS. Also, EFS is expensive.

jawadqur commented 3 months ago

This would be a huge win for AWS if they were able to implement this! Would open up a plethora of use-cases. Following this thread :)