delta-io / delta-examples

Delta Lake examples
Apache License 2.0

Adding SageMaker Studio Notebooks #26

Closed ari-vedant-jain closed 1 year ago

ari-vedant-jain commented 1 year ago

Adds SageMaker Studio notebooks that connect to EMR to munge Delta Lake data and run ML training using SageMaker built-in algorithms with the Model Registry.

ari-vedant-jain commented 1 year ago
  1. Yes, I can remove the license.
  2. The first notebook is an adaptation of the Databricks notebook, which is mentioned in the description; I can update it so it's more visible. The second notebook is using SageMaker built-in algorithms and is a new addition to the repository.
  3. The dataset is under /databricks-datasets/samples/lending_club/parquet/. Is there a way to access this data from outside Databricks, for example using the "s3" prefix?

Thanks, Vedant

On Tue, May 16, 2023 at 1:58 PM Denny Lee @.***> wrote:

@.**** requested changes on this pull request.

This looks awesome @vedantja - a couple of small callouts:

  • Could you update the notebooks to attribute the original source notebook/blogs?
  • Could we not include the MIT license, as the repo falls under the Apache-2.0 license? Thanks!

— Reply to this email directly, view it on GitHub https://github.com/delta-io/delta-examples/pull/26#pullrequestreview-1429223159, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABQIHFIWJ6GJU3MGJECM743XGPE7FANCNFSM6AAAAAAYCXFH7I . You are receiving this because you were mentioned.Message ID: @.***>

-- Vedant Jain M: 832.348.0096 @.***

dennyglee commented 1 year ago

Thanks for 1 and 2.
For 3, I can probably drop this data into GitHub, unless it's okay to use a payer S3 account?

ari-vedant-jain commented 1 year ago

Either way is fine with me. I think databricks-datasets is a DBFS mount point to an existing S3 bucket. If we can just get the S3 URI for that bucket, we don't have to do either.
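
As a rough illustration of that idea, here is a minimal sketch of translating a path under the DBFS mount point into an S3 URI. It assumes the mount maps to the root of a single bucket, which is not confirmed in this thread; the function name `dbfs_to_s3_uri` and the bucket name `example-bucket` are hypothetical placeholders, since the real bucket is unknown.

```python
from pathlib import PurePosixPath


def dbfs_to_s3_uri(dbfs_path: str, bucket: str,
                   mount_point: str = "/databricks-datasets") -> str:
    """Rewrite a path under a DBFS mount point as an S3 URI.

    Assumption: the mount point corresponds to the root of `bucket`.
    `bucket` is a placeholder; the actual bucket name is not known here.
    """
    # relative_to raises ValueError if dbfs_path is not under mount_point
    relative = PurePosixPath(dbfs_path).relative_to(PurePosixPath(mount_point))
    return f"s3://{bucket}/{relative}"


if __name__ == "__main__":
    # "example-bucket" is hypothetical, used only to show the shape of the URI
    print(dbfs_to_s3_uri("/databricks-datasets/samples/lending_club/parquet/",
                         "example-bucket"))
```

If the bucket uses a key prefix, or the mount does not map to the bucket root, the mapping would need to be adjusted accordingly.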

dennyglee commented 1 year ago

@vedantja Can you please sign your commits and we can then merge - thanks!

ari-vedant-jain commented 1 year ago

Sure, will do that tomorrow.

ari-vedant-jain commented 1 year ago

Closing pull request. Will create a new one with signed/verified commits.