aws / studio-lab-examples

Example notebooks for working with SageMaker Studio Lab. Sign up for an account at the link below!
https://studiolab.sagemaker.aws
Apache License 2.0
624 stars 181 forks

Add S3 bucket connectivity or integration into SageMaker Studio Lab #178

Open taureandyernv opened 1 year ago

taureandyernv commented 1 year ago

Is your feature request related to a problem? Please describe.

I'd like to be able to temporarily connect different S3 buckets to SageMaker Studio Lab (SMSL) for either extra library storage or dataset storage. I'm currently having trouble installing the latest RAPIDS version on SMSL due to running out of space, and have resorted to deleting datasets that take a long time to download. I even have to delete the zip files. The downloads, conda environment recreation, and failed installs take up precious GPU time that is already hard to come by.

Describe the solution you'd like

I'd like to be able to connect an S3 bucket to SMSL with the conda envs and data ready to go.

Describe alternatives you've considered

Additional context

Some additional libraries for GPU deep learning and machine learning take up a bit of room when downloading and expanding using conda.
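To see where the space is going before deleting datasets, a small standard-library script can report the free space and the largest top-level directories. This is only a sketch: the helper names `dir_size_bytes`, `largest_subdirs`, and `free_gib` are made up for illustration, and it assumes the home directory is the volume that fills up.

```python
import os
import shutil
from pathlib import Path


def dir_size_bytes(path):
    """Sum the sizes of all regular files under path, skipping symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if os.path.islink(file_path):
                continue
            try:
                total += os.path.getsize(file_path)
            except OSError:
                # The file vanished or is unreadable; ignore it.
                pass
    return total


def largest_subdirs(base, top_n=5):
    """Return (name, size_in_bytes) for the top_n largest immediate subdirectories."""
    sizes = [
        (entry.name, dir_size_bytes(entry.path))
        for entry in os.scandir(base)
        if entry.is_dir(follow_symlinks=False)
    ]
    return sorted(sizes, key=lambda item: item[1], reverse=True)[:top_n]


def free_gib(path):
    """Free space, in GiB, on the filesystem that contains path."""
    return shutil.disk_usage(path).free / 2**30


if __name__ == "__main__":
    home = Path.home()  # assumption: the home directory is the volume that fills up
    print(f"Free space: {free_gib(home):.1f} GiB")
    for name, size in largest_subdirs(home):
        print(f"{size / 2**30:8.2f} GiB  {name}")
```

Running it before a large conda install gives a rough idea of whether the install can fit, without spending GPU time on a failed attempt.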

icoxfog417 commented 1 year ago

@taureandyernv, thank you for the feedback! I understand that moving your existing conda environments and datasets out of the way and restoring them afterward is painful.

I think a "mount S3 to Studio Lab" experience is what you need. Would goofys or s3fs meet your needs? (For now, they cannot be installed in Studio Lab because that requires apt install.)
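Until a mount is possible, one workaround that needs neither apt nor sudo is to copy objects with boto3. The following is a minimal sketch and not an official recommendation from this thread. It assumes boto3 is installed with pip or conda and that AWS credentials are configured, which requires an AWS account. The function name `download_prefix`, the bucket `my-example-bucket`, and the prefix `datasets/` are hypothetical.

```python
import os


def download_prefix(s3_client, bucket, prefix, dest_dir):
    """Copy every object under s3://bucket/prefix into dest_dir, keeping relative paths.

    s3_client is expected to behave like boto3.client("s3").
    Object keys are assumed to be trusted (no ".." path segments).
    Returns the list of local file paths written.
    """
    written = []
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if key.endswith("/"):
                # Zero-byte "folder" placeholder objects have nothing to download.
                continue
            # Strip the prefix so files land directly under dest_dir.
            relative = key[len(prefix):].lstrip("/") or os.path.basename(key)
            local_path = os.path.join(dest_dir, relative)
            os.makedirs(os.path.dirname(local_path) or ".", exist_ok=True)
            s3_client.download_file(bucket, key, local_path)
            written.append(local_path)
    return written


if __name__ == "__main__":
    import boto3  # third-party; assumed to be installed

    client = boto3.client("s3")  # assumes AWS credentials are already configured
    files = download_prefix(client, "my-example-bucket", "datasets/", "data")
    print(f"Downloaded {len(files)} files")
```

Unlike a mount, this still consumes local disk space, so it addresses the slow re-download problem more than the storage limit itself.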

MicheleMonclova commented 1 year ago

@taureandyernv, as your ML experiments get too big for Studio Lab you may want to consider launching your notebooks in SageMaker Studio. We tried to make this easy to do with a new feature called Notebook jobs (just released last December 22). With Notebook jobs, you can work on your notebook in Studio Lab but then schedule it to run in SageMaker (you will need an AWS account). The job will kick off and shut down when complete. Yes, there will be some cost, but depending on the instance type you select it may be negligible. Check out this blog post and let me know what you think: https://aws.amazon.com/blogs/machine-learning/run-notebooks-as-batch-jobs-in-amazon-sagemaker-studio-lab/

fkunn1326 commented 1 year ago

Are there any plans to allow mounting of external drives such as S3 in Studio Lab? I am a student and cannot use SageMaker.

icoxfog417 commented 1 year ago

@fkunn1326 thank you for the comment. For now, we do not have a specific plan to support mounting S3. Are you not allowed to use SageMaker in your school's AWS account?

icoxfog417 commented 11 months ago

I think installing Mountpoint for Amazon S3 on Studio Lab would work (it requires sudo, so users cannot install it for now). https://github.com/awslabs/mountpoint-s3